The formula for estimating the sample size based on the width of a
confidence interval for a proportion is:
n = (z^2 * p * q)/d^2, where z is the standard normal critical value for
the chosen alpha level (1.96 for a 95% interval), q = 1 - p, and d is the
one-sided difference between p and the upper (or lower) limit. For example,
suppose you expect a proportion of .35 and you want to be 95% confident
that p is no larger than .40. With z=1.96, p=.35, q=.65, and d=(.40 - .35)
or .05, your sample size is about 350.
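To check the arithmetic outside Stata, here is a minimal sketch in Python. The function name sample_size_for_proportion is made up for illustration; it simply evaluates n = (z^2 * p * q)/d^2 with the numbers from the example.

```python
import math

def sample_size_for_proportion(p, d, z=1.96):
    """Evaluate n = z^2 * p * q / d^2 (normal approximation).

    p: expected proportion, d: desired half-width of the interval,
    z: standard normal critical value (1.96 for 95%).
    """
    q = 1.0 - p
    return (z ** 2 * p * q) / d ** 2

# Numbers from the example: p = .35, d = .40 - .35 = .05
n_raw = sample_size_for_proportion(p=0.35, d=0.05)
n_needed = math.ceil(n_raw)  # round up to a whole number of subjects

print(f"raw n = {n_raw:.2f}, rounded up = {n_needed}")
```

The raw value comes out just under 350, which is where the "about 350" figure comes from.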
Interestingly, you can check this using the -cii- command:
.cii 350 .35
The difference, though, is that these confidence intervals are exact.
In your case, I wouldn't use the normal approximation to the binomial
because the condition is quite rare. You could use -cii- and try
different sample sizes, while maintaining the same proportion, e.g.,
.cii 1000 1 will give a wide confidence interval;
.cii 10000 10 will give you a narrower one.
(Note these are exact limits, not normal approximations.)
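For anyone who wants to see the same trial-and-error idea without Stata, the following is a rough Python sketch. It assumes that "exact" here means the usual Clopper-Pearson binomial limits; the helper names (binom_cdf, bisect_root, exact_ci) are invented for this illustration and are not part of any package. It compares 1 event in 1000 with 10 events in 10000.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term.

    Adequate for the small event counts used here; large counts would
    need a log-scale or library implementation to avoid overflow.
    """
    if k < 0:
        return 0.0
    total = sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
                for i in range(k + 1))
    return min(total, 1.0)

def bisect_root(f, lo=0.0, hi=1.0, iters=100):
    """Root of an increasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def exact_ci(n, x, level=0.95):
    """Clopper-Pearson limits for x events in n trials (assumed method)."""
    a2 = (1.0 - level) / 2.0
    # Lower limit: the p at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: (1.0 - binom_cdf(x - 1, n, p)) - a2)
    # Upper limit: the p at which P(X <= x) equals alpha/2.
    upper = 1.0 if x == n else bisect_root(
        lambda p: a2 - binom_cdf(x, n, p))
    return lower, upper

wide = exact_ci(1000, 1)       # same proportion, small sample
narrow = exact_ci(10000, 10)   # same proportion, ten times the sample

for label, (lo, hi) in (("1/1000", wide), ("10/10000", narrow)):
    print(f"{label}: {lo:.6f} to {hi:.6f} (width {hi - lo:.6f})")
```

Both intervals are centred on the same proportion of .001, so the only thing changing is the width, which is the quantity to watch while trying different sample sizes.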
What sampsi does is model both alpha error and beta error. As long as you
don't specify a value for an alternative hypothesis (i.e., all you are
interested in is interval estimation) you don't need to model beta error.
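To make the alpha/beta distinction concrete, here is a small Python sketch of the textbook normal-approximation formula for a two-sided one-sample test of a proportion. This is an assumption for illustration, not a claim about what sampsi computes internally, and the function name one_sample_n is hypothetical. When the beta term is zero (power of .5), the formula collapses to the interval formula n = (z^2 * p * q)/d^2.

```python
import math
from statistics import NormalDist

def one_sample_n(p0, p1, alpha=0.05, power=0.80):
    """Textbook n for testing H0: p = p0 against the alternative p = p1.

    n = ((z_alpha*sqrt(p0*q0) + z_beta*sqrt(p1*q1)) / (p1 - p0))^2
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # alpha error
    z_beta = NormalDist().inv_cdf(power)               # beta error
    num = (z_alpha * math.sqrt(p0 * (1.0 - p0))
           + z_beta * math.sqrt(p1 * (1.0 - p1)))
    return (num / (p1 - p0)) ** 2

# With 80% power the beta term adds to the required sample size.
n_test = one_sample_n(0.35, 0.40, power=0.80)

# With power = .5, z_beta is 0 and only the alpha term remains,
# which is the interval-estimation formula with d = .05.
n_interval = one_sample_n(0.35, 0.40, power=0.50)

print(math.ceil(n_test), math.ceil(n_interval))
```

The second figure matches the roughly 350 from the interval calculation, while the first is considerably larger because it also has to protect against beta error.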
Paul
-----Original Message-----
From: Don Spady [mailto:[email protected]]
Sent: Monday, April 28, 2003 1:07 PM
To: [email protected]
Subject: st: Re: RE: Sample size
Paul
Thanks for your reply. Indeed I want to estimate prevalence, with
the interval being from 0.0001 to 0.0003 or thereabouts. I was told
that the prevalence of the disease was between 1:200 and 1:2000,
possibly closer to the 1:200. By shooting at 0.0005, I would get the
worst-case scenario. The confidence interval is hard to guess (say the
real value is 1:200 and I test for 1:2000, how do I estimate a
confidence interval?). If the presence or absence follows a Poisson
distribution, then the variance is 1:2000 and the SD is 0.0224, I think.
Does this make much sense?
Don
----- Original Message -----
From: "VISINTAINER PAUL" <[email protected]>
To: <[email protected]>
Sent: Monday, April 28, 2003 10:09
Subject: st: RE: Sample size
> Don,
>
> The problem you are having with sample size is that you haven't given
> enough information. It isn't clear whether you want to simply estimate
> the prevalence/incidence of a condition in the population; whether you
> want to "test" whether the occurrence in the population is really .001;
> or whether you want to test the difference between groups, assuming the
> occurrence in general is .001. The last two options require you to
> specify an alternative hypothesis, which you haven't given.
>
> Using your sampsi input, you are specifying a comparison between a
> prevalence of 1 per 1000 vs. none (or a really very, very rare
> prevalence). In this case you're specifying that the null value is .001
> and your alternative is that it is much more rare than that. If you
> reverse your figures (e.g., sampsi 0 .001, p(.8)) you're specifying that
> the null value is near 0 and your alternative hypothesis is that it is
> much more prevalent.
>
>
> (I was actually surprised that sampsi performed the calculation with 0
> as an entry. I suppose it actually uses a very small value for 0.)
>
> For the first option, where you would rather just estimate the
> prevalence of this condition (which you think is pretty rare at .001),
> you might want to focus on the precision of the estimate by specifying
> the width of the confidence interval. I don't think we can get a sample
> size estimate based on the width of a confidence interval using sampsi.
>
> So, what do you want to do?
>
> Paul
>
>
> -----Original Message-----
> From: Don Spady [mailto:[email protected]]
> Sent: Monday, April 28, 2003 11:19 AM
> To: Statalist
> Subject: st: Sample size
>
> Dear all
> I sent this before but got no response. I have revised it.
> I want to estimate the sample size needed to detect a disease that
> occurs in 1 out of 1000 people (as an example). The alternate
> state is absence of disease, which would occur in 999 of 1000 people on
> average. The problem is that I get numbers but I don't know if they
> are the right ones. Can I use sampsi with grp1 being those with disease
> and grp2 being those without disease? Or do I use sampsi 0.001,
> onesample as in:
>
> sampsi 0.001 0, p(0.8) onesample
>
> I need help and thank in advance those that provide it.
>
> Donald Spady
> Dep't of Pediatrics, University of Alberta
> (780) 407-1244
>
> Nature has no reset button.
>
>
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/