There is a superb review paper at
Brown, L.D., Cai, T.T., DasGupta, A. 2001. Interval estimation for a
binomial proportion. Statistical Science 16: 101-133.
This should be accessible to many, if not all, Statalist members at
<http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?handle=euclid.ss/1009213286&view=body&content-type=pdf_1>
Nick
Maarten Buis
Actually, exact confidence intervals are not as exact as the name
suggests, especially in the case of small proportions. These confidence
intervals tend to be conservative; see Agresti (2002, pp. 18-19) and
the simulation below. If the exact method were truly exact in all
regards, then the proportion of 95% confidence intervals containing the
true proportion would be .95. In fact the proportion is higher, which
is what I mean by the interval being conservative.
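*--------------- begin example ----------------------------
set more off
capture program drop sim
program define sim, rclass
    * draw 1000 Bernoulli(.99) observations
    drop _all
    set obs 1000
    gen x = uniform()<.99
    * exact binomial confidence interval for the proportion
    ci x, binomial
    * 1 if the interval contains the true proportion, 0 otherwise
    return scalar correct = r(lb)<.99 & r(ub)>.99
end
* repeat 10000 times; the mean of correct estimates the coverage
simulate correct=r(correct), reps(10000): sim
sum correct
*------------------- end example --------------------------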
(For more on how to use examples I sent to the Statalist, see
http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )
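The coverage in this setting can also be computed exactly instead of
simulated. The following is a minimal sketch, assuming that the exact
interval reported by ci is the Clopper-Pearson interval and that its
limits can be reproduced from beta quantiles with invibeta(). The scalar
names lb, ub and coverage are chosen for illustration only. It sums the
binomial probabilities of all outcomes k whose interval contains the
true proportion.

```stata
* Exact coverage of the 95% Clopper-Pearson interval
* for n = 1000 and true p = .99 (the setting of the simulation)
local n = 1000
local p = 0.99
local alpha = 0.05
scalar coverage = 0
forvalues k = 0/`n' {
    * Clopper-Pearson limits from beta quantiles (assumption: this is
    * what the exact option computes); the limits are fixed at 0 when
    * k = 0 and at 1 when k = n
    scalar lb = cond(`k' == 0, 0, invibeta(`k', `n' - `k' + 1, `alpha'/2))
    scalar ub = cond(`k' == `n', 1, invibeta(`k' + 1, `n' - `k', 1 - `alpha'/2))
    * add P(K = k) if the interval for this k contains the true p
    if scalar(lb) < `p' & `p' < scalar(ub) {
        scalar coverage = scalar(coverage) + binomialp(`n', `k', `p')
    }
}
display "exact coverage = " %6.4f scalar(coverage)
```

If the interval were exact in the sense the name suggests, the
displayed value would be .95. For the Clopper-Pearson construction it
is at least .95 for every n and p, which is the conservatism described
above.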
The reference you seem to refer to is:
Agresti, A. and B.A. Coull (1998) "Approximate is better than 'exact'
for interval estimation of binomial proportions" The American
Statistician, 52(2), pp. 119--126.
Alan Agresti (2002) "Categorical Data Analysis", 2nd edition, Wiley.
Hope this helps,
Maarten
--- "Lachenbruch, Peter" wrote:
> For small proportions, the exact option is useful. It is the
> standard that the other methods hope to reach. Coverage is exact.
> Agresti and Coull have a nice paper (I don't remember the
> attribution, but I think it's American Statistician, somewhere
> around 2000).
Nick Cox
> The "correct" CI for a binomial variable is a matter of dispute.
>
> In your case you are looking for a CI around a point estimate of
> 0.029.
>
> A symmetric CI around such a point estimate is likely to include 0
> and some negative values unless the sample size is very, very large.
>
> Some people just truncate the interval at 0, but a more defensible
> procedure is to work on a transformed scale and back-transform, or do
> something approximately equivalent that yields positive endpoints
> for the CI with about the right coverage. [R] ci has several pointers
> to the literature.
>
> Alternative CIs can be got in this way:
>
> . gen rep78_1 = rep78 == 1
> . ci rep78_1 if rep78 < ., binomial jeffreys
> . ci rep78_1 if rep78 < ., binomial wilson
>
> Nick
>
> Martin Weiss
>
> try this in Stata:
>
>
> ************************
> sysuse auto, clear
> proportion rep78
> matrix define A=e(b)
> matrix define B=e(V)
> count if rep78!=.
> *Upper/Lower Bound for proportion of "1"
> di A[1,1]+invnormal(1-0.05/2)*sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> di A[1,1]-invnormal(1-0.05/2)*sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> *Standard Error for "1"
> *Mistake obviously there...
> di sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> ************************
>
>
> Then let me know: why do I not hit the correct CI for the proportion
> of "1" in the repair record? Something's wrong with the standard
> error, I do not know what, though...
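
The transformed-scale approach mentioned in the quoted message can be
sketched as follows. This is only an illustration, assuming a logit
transformation with a delta-method standard error of
1/sqrt(n*p*(1-p)); it is not necessarily what any Stata command
computes, and the local names are hypothetical. It is applied to the
same proportion of "1" in rep78 from the auto data.

```stata
sysuse auto, clear
* indicator for rep78 == 1, missing where rep78 is missing
gen byte rep78_1 = rep78 == 1 if rep78 < .
quietly summarize rep78_1
local n = r(N)
local p = r(mean)
* delta-method standard error of logit(p) (assumed for this sketch)
local se = 1/sqrt(`n'*`p'*(1 - `p'))
local z = invnormal(1 - 0.05/2)
* interval on the logit scale, back-transformed to the proportion scale
local lo = invlogit(logit(`p') - `z'*`se')
local hi = invlogit(logit(`p') + `z'*`se')
display "logit-based 95% CI: " %6.4f `lo' " to " %6.4f `hi'
```

Because invlogit() maps any real number into (0, 1), both endpoints are
positive, unlike a symmetric interval around a small proportion.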