Ineta Sokolowski <[email protected]> wrote:
> Can anybody explain, why the one-way ANOVA (-loneway-) and -xtsum- gives
> different standard deviations for within and between effects of a
> variable?
>
> . loneway diagnosis gpnum
> . xtsum diagnosis, i(gpnum)
>
> where "diagnosis" is a dichotomous variable (0=no illness, 1=illness)
> and "gpnum" is the general practitioners (GP) number (38 different
> numbers). Each GP has different number of patients (between 44 and 111).
>
> How are the SD calculated in each procedure?
>

-xtsum- and -loneway- report different standard deviations because they
summarize the data differently. -xtsum- summarizes three variables: the
variable itself (overall), the between-transformed variable, and the
within-transformed variable. The standard deviations it reports are the
estimated standard deviations of those variables. In contrast, -loneway-
provides a one-way analysis-of-variance decomposition of the specified
variable. The formulas for its standard deviations are standard in the ANOVA
literature and are documented in [R] loneway. It is interesting to note that
the standard deviations reported by -loneway- correspond to the variance
components in a constant-only model.

Let's consider the case of -xtsum- in more detail. Since the manual does not
go into great detail, I will. Let

    ytilde_it = y_it - ybar_i + ybar

be the within-transformed variable, where

    y_it    is the observation on the specified variable in group i
            at time t,
    ybar_i  is the mean of y_it over the observations in group i, and
    ybar    is the overall mean of y.

The reported within standard deviation is the estimated standard deviation
of ytilde.

For the between model, the reported standard deviation is the estimated
standard deviation of the n group means ybar_i.
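
To make these definitions concrete, here is a rough sketch of how one could
reproduce the -xtsum- numbers by hand for the variables in the question. The
names grandmean, ybar_i, ytilde, and onepergp are made up for illustration,
and the sketch assumes that diagnosis and gpnum have no missing values. If
the description above is right, the three displayed values should match the
overall, within, and between rows of the -xtsum- output.

```stata
* Sketch only: reproduce the -xtsum- standard deviations by hand.
* Assumes diagnosis and gpnum are in memory with no missing values.
* grandmean, ybar_i, ytilde, and onepergp are illustrative names.

* Overall SD and overall mean
quietly summarize diagnosis
display "overall SD = " r(sd)
scalar grandmean = r(mean)

* Group means and the within-transformed variable.
* scalar() is spelled out so that the scalar's name can never be read
* as an abbreviation of a variable name.
egen double ybar_i = mean(diagnosis), by(gpnum)
generate double ytilde = diagnosis - ybar_i + scalar(grandmean)
quietly summarize ytilde
display "within SD  = " r(sd)

* Between SD: tag one observation per GP so each group mean counts once
egen byte onepergp = tag(gpnum)
quietly summarize ybar_i if onepergp
display "between SD = " r(sd)

* Compare with the built-in command
xtset gpnum
xtsum diagnosis
```

Note that the between SD treats every GP's mean equally, however many
patients that GP has, whereas the within SD pools all patient-level
deviations from their own GP's mean.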

Since the formulas for computing the standard deviations reported by
-loneway- are given in [R] loneway, I will not repeat them here. Still, for
those who think in -xt- terms, it is interesting to note that the between
standard deviation reported by -loneway- is an estimate of the standard
deviation of the individual-level effect in a random-effects model in which
the only regressor is a constant. Likewise, the within standard deviation
reported by -loneway- is an estimate of the standard deviation of the
idiosyncratic error in that same model.
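
For those who want to see the correspondence, the following sketch computes
the -loneway- standard deviations from the one-way ANOVA mean squares and
then fits the constant-only random-effects model. It assumes the usual
unbalanced one-way ANOVA estimator, sd_w = sqrt(MS_within) and
sd_b = sqrt((MS_between - MS_within)/g) with g = (N - sum(n_i^2)/N)/(k - 1);
[R] loneway remains the authoritative source for the formulas. The names
n_i, sumn2, msb, msw, and gfac are made up for illustration. When the groups
are unbalanced, as they are here, the -xtreg, re- estimate of sigma_u need
not agree with -loneway- to the last digit, because -xtreg- uses its own
estimator of the variance components.

```stata
* Sketch only: -loneway- standard deviations from the ANOVA mean squares.
* Assumes diagnosis and gpnum are in memory with no missing values.

* Each observation carries its group size, so the total of n_i over all
* observations equals the sum of n_i^2 over groups.
bysort gpnum: generate long n_i = _N
quietly summarize n_i
scalar sumn2 = r(sum)

* Mean squares from the one-way ANOVA table; r(df_m) is k - 1
quietly oneway diagnosis gpnum
scalar msb  = r(mss)/r(df_m)
scalar msw  = r(rss)/r(df_r)
scalar gfac = (r(N) - scalar(sumn2)/r(N))/r(df_m)

display "within SD  = " sqrt(scalar(msw))
display "between SD = " sqrt(max(0, (scalar(msb) - scalar(msw))/scalar(gfac)))

* Compare with the built-in command and its stored results
loneway diagnosis gpnum
display "r(sd_w) = " r(sd_w) "   r(sd_b) = " r(sd_b)

* Constant-only random-effects model: sigma_u is the SD of the
* individual-level effect, sigma_e the SD of the idiosyncratic error
xtset gpnum
xtreg diagnosis, re
display "sigma_u = " e(sigma_u) "   sigma_e = " e(sigma_e)
```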

I hope that this helps.

David
[email protected]