Earlier, I wrote:
>>The exact interval used by -ci, binomial- is the Clopper-Pearson interval,
>>but you must realize that "exact" is a bit of a misnomer. It is exact in the
>>sense that it uses the binomial distribution as the basis of the calculation.
>>However, the binomial distribution is a discrete distribution and as such its
>>cumulative probabilities will have discrete jumps, and thus you'll be hard
>>pressed to get (say) exactly 95% coverage.
to which Constantine Daskalakis <[email protected]> responds:
> I do not think this is correct. For the CI, it is the parameter space, not
> the sample space, that matters (and the former is continuous). In other
> words, if we have k successes out of N trials, we are looking for limits
> {p_l, p_u}, such that
> Pr [K <= k | p_l] = a/2
> and
> Pr [K >= k | p_u] = a/2
> In general, there exist such limits that correspond to tail probabilities of
> (exactly) a/2. The fact that the sample space is highly discrete (when N is
> small) has nothing to do with it. The only exception is when the observed
> number of successes is either 0 or N; in that case, one limit is on the
> boundary of the parameter space (p_l=0 or p_u=1) and the corresponding tail
> probability on that side is exactly 0, not a/2 (as the manual correctly
> points out).
Constantine is correct that there are exact solutions to the above equations.
However, the interval depends on the data only through k, so a binomial
experiment of N trials can generate only N+1 possible CIs.
Consider the case where N=9. There are ten possible outcomes of the binomial
experiment, namely zero successes, one success, ..., nine successes. Since
there are only ten possible k's (0,1,...,9), there are only ten possible
confidence intervals, namely
. cii 9 0, exact
                                                         -- Binomial Exact --
    Variable |       Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |         9           0           0               0    .3362671*
. cii 9 1, exact
                                                         -- Binomial Exact --
    Variable |       Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |         9    .1111111    .1047566        .0028091    .4824965
...
. cii 9 9, exact
                                                         -- Binomial Exact --
    Variable |       Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |         9           1           0        .6637329           1*
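For anyone who wants to see where these numbers come from without Stata at
hand, here is a rough Python sketch (not Stata's implementation) that solves
the two tail equations by bisection. It assumes the usual Clopper-Pearson
convention: the lower limit p_l solves Pr[K >= k | p_l] = a/2 and the upper
limit p_u solves Pr[K <= k | p_u] = a/2, with the limit set to the boundary
when k is 0 or N. The function names are my own and purely illustrative.

```python
from math import comb

def binom_cdf(k, n, p):
    """Pr[K <= k] for K ~ binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def bisect_root(f, iters=100):
    """Root on (0, 1) of an increasing f with f < 0 near 0 and f > 0 near 1."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Illustrative Clopper-Pearson limits for k successes in n trials."""
    # Lower limit: Pr[K >= k | p] increases in p; set to 0 when k == 0.
    if k == 0:
        lower = 0.0
    else:
        lower = bisect_root(lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper limit: Pr[K <= k | p] decreases in p; set to 1 when k == n.
    if k == n:
        upper = 1.0
    else:
        upper = bisect_root(lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper

# The only ten intervals an experiment with N=9 can ever produce.
for k in range(10):
    lower, upper = clopper_pearson(k, 9)
    print(f"k={k}: [{lower:.7f}, {upper:.7f}]")
```

The limits for k = 0, 1, and 9 should agree with the -cii- output above to
the digits shown.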
For any given p, some of these ten intervals cover p, and some don't. The
coverage probability is then the sum (with respect to the binomial(9,p)
distribution) of the probabilities of the k's that result in intervals that
cover p. Because that sum runs over a finite set of outcomes, it moves in
jumps as p varies, and this is why it is difficult to make the coverage
probability equal exactly 95%.
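The coverage calculation just described can be sketched in a few lines of
Python. This is only an illustration under the same assumptions as the
standard Clopper-Pearson definition (lower limit from Pr[K >= k | p_l] = a/2,
upper limit from Pr[K <= k | p_u] = a/2, boundary limits at k = 0 and k = N);
the helper names are hypothetical and nothing here is taken from Stata's code.

```python
from math import comb

def pmf(k, n, p):
    """Pr[K = k] for K ~ binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def solve(f, iters=100):
    """Bisection on (0, 1) for an increasing f that changes sign."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return (lo + hi) / 2

def intervals(n, alpha=0.05):
    """List of the n+1 possible (lower, upper) pairs, indexed by k."""
    out = []
    for k in range(n + 1):
        upper_tail = lambda p, k=k: sum(pmf(j, n, p) for j in range(k, n + 1))
        lower_tail = lambda p, k=k: sum(pmf(j, n, p) for j in range(0, k + 1))
        lower = 0.0 if k == 0 else solve(lambda p: upper_tail(p) - alpha / 2)
        upper = 1.0 if k == n else solve(lambda p: alpha / 2 - lower_tail(p))
        out.append((lower, upper))
    return out

def coverage(p, n, alpha=0.05):
    """Sum of binomial(n, p) probabilities of the k's whose interval covers p."""
    return sum(pmf(k, n, p)
               for k, (lower, upper) in enumerate(intervals(n, alpha))
               if lower <= p <= upper)

# Coverage at a few values of p for N=9; it is at least 95% but not equal to it.
for p in (0.1, 0.3, 0.5):
    print(f"p={p}: coverage={coverage(p, 9):.4f}")
```

At p = 0.5, for instance, only the intervals for k = 2,...,7 cover p, so the
coverage is 1 - 20/512, or about 96.1%, rather than 95%.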
--Bobby
[email protected]