Re: st: why don't confidence intervals from -proportion- use the same formula as -ci-?
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: why don't confidence intervals from -proportion- use the same formula as -ci-?
Date: Fri, 11 Jan 2013 12:38:31 -0500
Ronan Conroy <[email protected]> :
This has been discussed many times over the years on Statalist, with
the usual advice being: don't do that. If you want CIs on
proportions, or to test differences in proportions, you probably want
to use -svy:tab- (and if you don't have a survey, start with -svyset,
srs-). See also
http://www.stata.com/statalist/archive/2010-05/msg00569.html
for the case when even -svy- commands fail to appropriately constrain
proportions to [0,1].
On Fri, Jan 11, 2013 at 6:44 AM, Ronan Conroy <[email protected]> wrote:
> I have a real problem with the confidence intervals produced by the -proportion- command.
>
> . input outcome freq
>
> outcome freq
> 1. 0 21
> 2. 1 2
> 3. end
>
>
> Here is the confidence interval which is most probably closest to the nominal coverage according to
> - Brown L, Cai T, DasGupta A. Interval estimation for a binomial proportion. Statistical Science. 2001;16(2):101–17.
>
> . ci outcome [fw=freq], bin wil
>
>                                                          ------ Wilson ------
>     Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
> -------------+---------------------------------------------------------------
>      outcome |         23    .0869565     .0587534         .02418    .2679598
>
>
>
> Now here is what -proportion- does.
>
>
> . proportion outcome [fw=freq]
>
> Proportion estimation                    Number of obs   =         23
>
> --------------------------------------------------------------
>              | Proportion   Std. Err.     [95% Conf. Interval]
> -------------+------------------------------------------------
> outcome      |
>            0 |   .9130435    .0600739     .7884579    1.037629
>            1 |   .0869565    .0600739     -.037629    .2115421
> --------------------------------------------------------------
>
> .
> end of do-file
>
> According to the manual:
>
>
> "Methods and formulas
> proportion is implemented as an ado-file.
> Proportions are means of indicator variables; see [R] mean."
>
> Is anyone prepared to defend this approach as the only formula implemented by -proportion-? Or indeed to tell me that they have managed to publish a paper that included confidence intervals such as the one above?
>
>
> I myself find this bizarre. Consider the example above. The confidence interval includes a value that is impossible - zero. With two observed successes, the success rate cannot be zero. And it includes probabilities that have no definition: negative probabilities. While I am prepared to accept that physicists have now produced temperatures that are lower than absolute zero, I cannot bring myself to persuade anyone that a confidence interval for a probability can extend beyond the interval 0-1.
>
>
> I believe it would be good if Stata's -proportion- command allowed the choice of some more believable methods.
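
For anyone who wants to check where the two sets of limits quoted above come from, here is a minimal sketch that recomputes them by hand for 2 successes in 23 trials. The scalar names are made up for illustration. The n-1 divisor and the t critical value in the first part are an assumption inferred from the quoted numbers, not something the quoted manual text states; the second part is the standard Wilson score formula.

```stata
* Hand computation for 2 successes in 23 trials (scalar names are illustrative).
scalar n = 23
scalar p = 2/n

* Mean-based interval: treat the 0/1 outcome as a mean.
* Assumption: standard error with an n-1 divisor, t critical value on n-1 df.
scalar se_t  = sqrt(p*(1 - p)/(n - 1))
scalar tcrit = invttail(n - 1, 0.025)
scalar lo_t  = p - tcrit*se_t
scalar hi_t  = p + tcrit*se_t
display "mean-based: " %9.7f lo_t "  " %9.7f hi_t

* Wilson score interval: recentred and rescaled, so it stays inside [0,1].
scalar z      = invnormal(0.975)
scalar centre = (p + z^2/(2*n)) / (1 + z^2/n)
scalar half   = z*sqrt(p*(1 - p)/n + z^2/(4*n^2)) / (1 + z^2/n)
scalar lo_w   = centre - half
scalar hi_w   = centre + half
display "Wilson:     " %9.7f lo_w "  " %9.7f hi_w
```

If the assumption holds, the first line should agree with the -proportion- limits for outcome 1 and the second with the -ci- Wilson limits. The mean-based limits are symmetric about p, so with p this close to 0 the lower limit falls below zero; the Wilson limits are not symmetric about p and cannot leave [0,1].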