st: Re: correct confidence intervals of -mean- ?


From   Dirk Enzmann <[email protected]>
To   [email protected]
Subject   st: Re: correct confidence intervals of -mean- ?
Date   Sat, 06 Mar 2010 15:01:10 +0100

Kit,

it is not the s.e.(mean) that poses a problem here but the df used by invttail(). I could understand it if -mean- calculated a z-test, where the df are irrelevant (although then the CIs would not be useful for small samples). But it calculates a t-test, and using the correct df is the essence of a t-test.
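
A minimal sketch of what the df choice does to the intervals, assuming the auto data shipped with Stata (the thread only names price and rep78, so the dataset is an assumption) and using illustrative variable names. It does not reproduce the internals of -mean-; it only contrasts the two df choices discussed here, applied to the same group s.e.:

```stata
* Assumption: auto.dta; all generated names (m, s, n, ...) are illustrative
sysuse auto, clear
drop if missing(rep78)
local N = _N                                  // full-sample size

* One row per rep78 group: mean, sd and n of price
collapse (mean) m=price (sd) s=price (count) n=price, by(rep78)
generate se = s/sqrt(n)                       // group s.e. of the mean

* Critical values under the two df choices
generate t_full = invttail(`N' - 1, 0.025)    // df of the whole sample
generate t_own  = invttail(n - 1, 0.025)      // df of the group itself

* Resulting 95% CIs
generate lb_full = m - t_full*se
generate ub_full = m + t_full*se
generate lb_own  = m - t_own*se
generate ub_own  = m + t_own*se

list rep78 n se t_full t_own lb_full ub_full lb_own ub_own, noobs
```

The gap matters most for very small groups: with only two observations the critical value for 1 df is about 12.7, whereas for a few dozen df it is roughly 2.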

Dirk

Kit Baum wrote:
> Dirk said
>
> Very carefully I want to ask: Are the confidence intervals given by
> -mean- really correct?
>
> Below I compare the results of -mean- with the results of a different
> procedure:
>
> and goes on to show that -mean- CIs can be reproduced by collapsing,
> but maintaining the DF in the confidence interval as that of the whole
> sample. These are the same standard errors of mean reported by
>
> tabstat price,by(rep78) stat(mean sd n semean)
>
> He wonders whether the DF used in calculating s.e.(mean) should be
> that of the full sample. I think that -mean- and -tabstat- are both
> using the notion that you have a model y = mu + \epsilon, where
> var(\epsilon) is a population parameter. Thus the variance of \epsilon
> is a constant for all subsamples, and when you calculate s.e. mean,
> you use the sqrt of that common variance and divide by the sqrt(sample
> size) of the subpopulation.  You can see that is being done by
> -tabstat- by comparing the sd, n and semean columns.
>
> What does surprise me is that the CIs generated by these methods
> differ so widely from those computed by
>
> reg price i.rep78
> margins rep78
>
> The differences are not just a small-sample/large-sample adjustment of
> the Root MSE. If you take apart the VCE of a regression of price on
> all five dummies, no constant term, you find a diagonal matrix
> containing the inverses of the respective sample sizes, so the
> difference has to lie in the computation of \hat{\sigma}^2 which
> multiplies inv(X'X).
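
The last point in the quoted text can be checked with a short sketch, again assuming the auto data and illustrative variable names. In a regression on the five dummies without a constant, inv(X'X) is diagonal with entries 1/n, so each coefficient's s.e. is the pooled Root MSE divided by sqrt(n), whereas the group-wise s.e. uses each group's own sd:

```stata
* Assumption: auto.dta; scalar and variable names are illustrative
sysuse auto, clear

* Regression on all five rep78 dummies, no constant term
regress price ibn.rep78, noconstant
scalar rmse = e(rmse)                 // pooled estimate of sigma
scalar dfr  = e(df_r)                 // residual df, N minus number of groups
matrix list e(V)                      // diagonal: rmse^2 / n for each group

* Group-wise versus pooled s.e. of the mean
collapse (mean) m=price (sd) s=price (count) n=price if !missing(rep78), by(rep78)
generate se_group  = s/sqrt(n)                // uses the group's own sd
generate se_pooled = scalar(rmse)/sqrt(n)     // uses the common Root MSE

* 95% CIs from the pooled s.e. with the residual df
generate lb_pooled = m - invttail(scalar(dfr), 0.025)*se_pooled
generate ub_pooled = m + invttail(scalar(dfr), 0.025)*se_pooled

list rep78 n se_group se_pooled lb_pooled ub_pooled, noobs
```

Where the group sds differ a lot from the pooled Root MSE, se_group and se_pooled will differ accordingly, independently of any df adjustment.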

*************************************************
Dr. Dirk Enzmann
Institute of Criminal Sciences
Dept. of Criminology
Schlueterstr. 28
D-20146 Hamburg
Germany

phone: +49-(0)40-42838.7498 (office)
       +49-(0)40-42838.4591 (Mrs Billon)
fax:   +49-(0)40-42838.2344
email: [email protected]
www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html
*************************************************