Statalist The Stata Listserver



RE: st: Vector degrees of freedom in Stata


From   "Newson, Roger B" <[email protected]>
To   <[email protected]>
Subject   RE: st: Vector degrees of freedom in Stata
Date   Wed, 11 Apr 2007 16:28:39 +0100

I was assuming that the term "degrees of freedom" was defined in the
sense of Satterthwaite (1946). In that reference, "degrees of freedom"
is a shorthand for "twice the inverse squared sampling coefficient of
variation of the squared sample standard error itself". The
homoskedastic normal model has the strange feature (when you think of
it) that this quantity is not only the same for all parameters in the
model, but also an integer. In the more general case, however, we expect
the degrees of freedom to be a vector, especially when one parameter is
the mean of a smaller sample from a more variable subpopulation, and
another parameter is the mean of a larger sample from a less variable
subpopulation. And, according to the definitive studies of Moser et al.
(1989) and Moser and Stevens (1992), the Satterthwaite-corrected t-test
produces more reliable confidence intervals than anybody had any right
to expect, at smaller sample sizes than anybody had any right to expect,
at least when both samples are drawn from Normal subpopulations.
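
A minimal sketch of the definition above for the two-sample case, assuming
each sample variance behaves like a scaled chi-squared variable with n - 1
degrees of freedom. The function name and the input values are hypothetical
and chosen only for illustration.

```python
def satterthwaite_df(var1, n1, var2, n2):
    """Approximate degrees of freedom for a difference of two sample means.

    Implements "twice the inverse squared coefficient of variation of the
    squared standard error", with sample variances plugged in.
    """
    # Contribution of each sample mean to the squared standard error.
    se2_1 = var1 / n1
    se2_2 = var2 / n2
    # Assumption: each contribution is a scaled chi-squared with n - 1 df,
    # so its sampling variance is estimated as 2 * se2**2 / (n - 1).
    var_of_se2 = 2.0 * (se2_1 ** 2 / (n1 - 1) + se2_2 ** 2 / (n2 - 1))
    # Squared coefficient of variation of the squared standard error.
    cv2 = var_of_se2 / (se2_1 + se2_2) ** 2
    return 2.0 / cv2


# Hypothetical inputs: a small, variable sample against a large, stable one.
df_unequal = satterthwaite_df(var1=25.0, n1=10, var2=4.0, n2=100)

# Equal variances and equal sample sizes: the result matches the usual
# n1 + n2 - 2 (up to floating-point rounding).
df_balanced = satterthwaite_df(var1=4.0, n1=20, var2=4.0, n2=20)

print(df_unequal, df_balanced)
```

In the unequal case the result is non-integer and sits just above n1 - 1,
because the small, variable sample dominates the standard error.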

Thanks to Stas for pointing us to some possible alternative
interpretations of the term "degrees of freedom".

Best wishes

Roger


References

Moser BK, Stevens GR, Watts CL. The two-sample t-test versus
Satterthwaite's approximate F-test. Communications in Statistics -
Theory and Methods 1989; 18(11): 3963-3975.

Moser BK, Stevens GR. Homogeneity of variance in the two-sample means
test. The American Statistician 1992; 46(1): 19-21.

Satterthwaite FE. An approximate distribution of estimates of variance
components. Biometrics Bulletin 1946; 2(6): 110-114.


Roger Newson
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected] 
www.imperial.ac.uk/nhli/r.newson/

Opinions expressed are those of the author, not of the institution.

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Stas
Kolenikov
Sent: 11 April 2007 14:47
To: [email protected]
Subject: Re: st: Vector degrees of freedom in Stata

Well, a different line of thought would be towards scalar degrees of
freedom, arguing that # of clusters is too conservative, while # of
observations is too liberal, so the effective degrees of freedom are
somewhere in between. I know there's been thinking along those lines
in the mixed models literature, motivated for instance by the need to
come up with some reasonable numbers to go into the information
criteria for model selection, but I have not been following that
literature.

Yet another view on the effective degrees of freedom is suggested in
the statistical learning and data mining literature, where they show
that the effective degrees of freedom for a non/semi-parametric model
can be found as \sum_i E(\partial \hat y_i/\partial y_i) =
\sum_i Cov( \hat y_i, y_i)/\sigma^2 (for normal errors with variance
\sigma^2), which can be checked to coincide with linear regression
identities like # variables = trace[ X(X'X)^{-1}X' ]. See Hastie,
Tibshirani and Friedman's book on statistical learning (2001), or Ye
(JASA, 1998: http://www.citeulike.org/user/ctacmo/article/574999).
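
A small sketch of the trace identity, under two assumptions made here for
illustration: a regression on an intercept plus one covariate (where the
diagonal of X(X'X)^{-1}X' has a closed form), and a running-mean smoother
as a stand-in for a non-parametric fit. For any linear fit \hat y = S y,
\partial \hat y_i/\partial y_i is the i-th diagonal element of S, so the
effective degrees of freedom are trace(S). The function names and data are
hypothetical.

```python
def hat_diagonal_simple_regression(x):
    """Diagonal of X(X'X)^{-1}X' for a regression on an intercept and x."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    # Closed-form leverage for the two-column design [1, x].
    return [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]


def moving_average_diagonal(n, half_width=1):
    """Diagonal of the smoother matrix S of a running mean, yhat = S y."""
    diag = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n - 1, i + half_width)
        # yhat_i is the plain average of y[lo..hi], so the derivative of
        # yhat_i with respect to y_i is 1 / (window size).
        diag.append(1.0 / (hi - lo + 1))
    return diag


x = [1.0, 2.0, 4.0, 7.0, 11.0]  # hypothetical covariate values

# Trace of the hat matrix: equals the number of columns (intercept + slope).
edf_regression = sum(hat_diagonal_simple_regression(x))

# Trace of the smoother matrix: a non-integer effective df.
edf_smoother = sum(moving_average_diagonal(n=10))

print(edf_regression, edf_smoother)
```

The regression trace recovers the number of fitted parameters, while the
smoother's trace falls between 1 and the number of observations and need
not be an integer.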

On 4/11/07, Newson, Roger B <[email protected]> wrote:
> Fellow Statalisters (especially StataCorp):
>
> At the German Stata Users' Group Meeting at Mannheim in 2006, whose Web
> page is at
> http://ideas.repec.org/s/boc/dsug06.html
> Bobby Gutierrez gave a very interesting talk on -xtmixed-, downloadable
> at
> http://ideas.repec.org/p/boc/dsug06/05.html
> which ended with a summary of possible future developments to watch out
> for in future versions. One of these possible developments mentioned was
> "Degrees of freedom calculations". This seems to indicate that somebody
> at StataCorp is thinking of offering alternative degrees of freedom
> formulas to the standard e(N_clust)-1 or e(N_clust)-colsof(e(b)), at
> least for -xtmixed-.


-- 
Stas Kolenikov
http://stas.kolenikov.name
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
