Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: RE: RE: -robvar- and number of degrees of freedom
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: RE: RE: -robvar- and number of degrees of freedom
Date: Fri, 3 Sep 2010 12:37:02 +0100
Another take on this is that -robvar- is doing what you are asking for, but what is misleading is that -tabulate- (called by -robvar-) is not showing the missing categories on the by-variable.
However, it is common practice elsewhere that missings on a by-variable are excluded by default (and, at least sometimes, that you would need to opt explicitly for missings to be included).
I think -robvar- needs a fix either way.
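
To see the missing category that the displayed table hides, one can ask -tabulate- for it directly. This is only a sketch: it assumes the auto data shipped with Stata (loaded here with -sysuse- rather than the -webuse- used further down the thread) and it calls -tabulate- on its own, not -robvar-.

```stata
* Assumes the auto data shipped with Stata.
sysuse auto, clear

* Default: observations with missing rep78 are left out of the table.
tabulate rep78, summarize(mpg)

* With -missing-, the missing category appears as its own row.
tabulate rep78, summarize(mpg) missing
```

The second table should show one more row than the first, for the observations with missing rep78.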
Nick
[email protected]
Nick Cox (2)
An even simpler alternative, while we await StataCorp's comment, is to exclude the missings explicitly:
. robvar mpg if !missing(rep78), by(rep78)
Nick Cox (1)
Well spotted.
My guess is that -robvar- is not marking out missings on the -by()- variable, so that they are wrongly counted as an extra group. A simple experiment supports this: drop the observations with missing rep78 first, and the degrees of freedom come out as expected.
. drop if missing(rep78)
(5 observations deleted)
. robvar mpg, by(rep78)
     Repair |     Summary of Mileage (mpg)
Record 1978 |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |          21   4.2426407           2
          2 |      19.125   3.7583241           8
          3 |   19.433333   4.1413252          30
          4 |   21.666667   4.9348699          18
          5 |   27.363636   8.7323849          11
------------+------------------------------------
      Total |   21.289855   5.8664085          69

W0  = 5.8525980   df(4, 64)     Pr > F = 0.00044531
W50 = 4.0610367   df(4, 64)     Pr > F = 0.00537416
W10 = 6.1590202   df(4, 64)     Pr > F = 0.00029485
I've not looked at all the code but my guess is that
marksample touse
should be followed by something like
markout `touse' `by', strok
but naturally you should not make this change on the official -robvar-. At most, clone -robvar- and check whether this works.
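
As an illustration of the -marksample- plus -markout- pattern, here is a minimal sketch of a toy program. It is not -robvar-'s code: the program name -countgroups- and the option -keepmissing- are made up for this example, and all it does is count the groups and observations that would enter a calculation.

```stata
capture program drop countgroups
program define countgroups, rclass
    version 11
    * -keepmissing- is a hypothetical option that skips the markout step.
    syntax varname(numeric) [if] [in], by(varname) [KEEPMissing]

    * Mark the sample: honours if/in and drops missings on the varlist.
    marksample touse

    * Also drop missings on the by-variable; strok allows a string by().
    if "`keepmissing'" == "" {
        markout `touse' `by', strok
    }

    * Count categories (missing counted as one if any remain) and observations.
    quietly tabulate `by' if `touse', missing
    local k = r(r)
    local N = r(N)
    display as text "groups = " as result `k' as text ", N = " as result `N'
    return scalar k = `k'
    return scalar N = `N'
end

sysuse auto, clear
countgroups mpg, by(rep78)               // missings on rep78 marked out
countgroups mpg, by(rep78) keepmissing   // missings on rep78 left in
```

With the auto data the first call should report 5 groups and 69 observations, and the second 6 groups and 74 observations, which matches the two sets of degrees of freedom seen in this thread.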
Nick
[email protected]
Garry Anderson
The -robvar- command does not seem to be reporting the correct degrees of freedom.
. webuse auto
(1978 Automobile Data)
. robvar mpg,by(rep78)
     Repair |     Summary of Mileage (mpg)
Record 1978 |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |          21   4.2426407           2
          2 |      19.125   3.7583241           8
          3 |   19.433333   4.1413252          30
          4 |   21.666667   4.9348699          18
          5 |   27.363636   8.7323849          11
------------+------------------------------------
      Total |   21.289855   5.8664085          69

W0  = 4.7219575   df(5, 68)     Pr > F = 0.00092356
W50 = 3.2906157   df(5, 68)     Pr > F = 0.01014559
W10 = 4.9744717   df(5, 68)     Pr > F = 0.00061062
I would have expected the numerator df to be 4, not 5, because there are 5 groups.
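
The arithmetic behind that expectation can be checked by hand. Levene-type statistics are referred to an F distribution with (k - 1, N - k) degrees of freedom for k groups and N observations. The sketch below uses k = 5 and N = 69 from the table above. The second scenario is an assumption, not something the output shows: it supposes a sixth group made up of the 5 observations with missing rep78.

```stata
* Values read off the table in this post: 5 groups, 69 observations.
local k = 5
local N = 69
local df1 = `k' - 1
local df2 = `N' - `k'
display "expected: df(`df1', `df2')"

* Assumed scenario: a sixth group holding 5 further observations.
local k = 6
local N = 74
local df1 = `k' - 1
local df2 = `N' - `k'
display "with an extra group: df(`df1', `df2')"
```

The first line displays df(4, 64) and the second df(5, 68), the latter being what -robvar- reported.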