Re: st: analysis of continuous gestational age
There is *no* assumption in linear regression that the *data*
be normally distributed -- the estimates are unaffected by
this. However, if you want to trust the confidence intervals
and/or the p-values, then the residuals must be normally
distributed. There is no necessary relationship between the
distribution of the residuals and the distribution of the
data -- how else could one use "dummy" variables?
Rich Goldstein
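
The point about dummy variables can be illustrated with a small simulation. This is only a sketch under stated assumptions: the intercept (10), the group effect (5), the error standard deviation (1), the sample size, and the seed are all made up for illustration and do not come from the thread. The outcome is strongly bimodal, yet the residuals from a regression on the dummy are normal by construction. Excess kurtosis (about 0 for a normal distribution) is used here as a rough numeric summary in place of a plot.

```python
import random
import statistics


def excess_kurtosis(values):
    """Sample excess kurtosis; close to 0 for normally distributed values."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean((v - mean) ** 2 for v in values)
    m4 = statistics.fmean((v - mean) ** 4 for v in values)
    return m4 / m2 ** 2 - 3.0


rng = random.Random(12345)  # arbitrary seed, for reproducibility only
n = 2000                    # hypothetical sample size

# Hypothetical model: y = 10 + 5 * d + e, with e ~ N(0, 1) and d a 0/1 dummy.
d = [i % 2 for i in range(n)]
y = [10.0 + 5.0 * di + rng.gauss(0.0, 1.0) for di in d]

# With an intercept and a single dummy regressor, the least-squares fitted
# values are simply the two group means, so no matrix algebra is needed.
mean0 = statistics.fmean(yi for yi, di in zip(y, d) if di == 0)
mean1 = statistics.fmean(yi for yi, di in zip(y, d) if di == 1)
resid = [yi - (mean1 if di else mean0) for yi, di in zip(y, d)]

kurt_y = excess_kurtosis(y)
kurt_resid = excess_kurtosis(resid)
print(f"excess kurtosis of the data:      {kurt_y:.2f}")
print(f"excess kurtosis of the residuals: {kurt_resid:.2f}")
```

The data come out clearly non-normal (two well-separated humps give a strongly negative excess kurtosis), while the residuals stay close to the normal value of 0.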
Svend Juul wrote:
Alo wrote:
What about the idea that we can use linear regression even if the
residuals are not normally distributed if we have a large dataset?
Is there any basis for this?
-----------------------------------------------------------------------
No, not in the school I went to. You might be thinking of the fact
that with large datasets even unimportant deviations from normality
become significant, so you should not use significance testing to
decide whether the deviation is important, but rather graphical
inspection.
Regardless of dataset size: Gestational age data are not from a normal
distribution; they deviate a lot from that assumption.
Svend
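
The remark that large datasets make even unimportant deviations from normality significant can be sketched numerically. The assumptions here are mine, not the thread's: the Jarque-Bera test stands in for "significance testing" (the post names no particular test), and the data are simulated with a deliberately small skew (about 0.12), using an arbitrary seed and arbitrary sample sizes.

```python
import math
import random
import statistics


def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value) for the Jarque-Bera test."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = statistics.fmean((v - mean) ** 2 for v in values)
    m3 = statistics.fmean((v - mean) ** 3 for v in values)
    m4 = statistics.fmean((v - mean) ** 4 for v in values)
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # Under normality JB is asymptotically chi-squared with 2 degrees of
    # freedom, whose upper-tail probability is exp(-x / 2).
    return jb, math.exp(-jb / 2.0)


rng = random.Random(2024)  # arbitrary seed, for reproducibility only


def slightly_skewed():
    """Draw a standard normal value with a small, hypothetical skew added."""
    z = rng.gauss(0.0, 1.0)
    return z + 0.02 * (z * z - 1.0)


large = [slightly_skewed() for _ in range(200_000)]
small = large[:200]  # same distribution, far fewer observations

jb_small, p_small = jarque_bera(small)
jb_large, p_large = jarque_bera(large)
print(f"n = {len(small):>6}: JB = {jb_small:8.2f}, p = {p_small:.4g}")
print(f"n = {len(large):>6}: JB = {jb_large:8.2f}, p = {p_large:.4g}")
```

Both samples come from the same mildly skewed distribution. The statistic grows roughly in proportion to n, so the large sample rejects normality decisively even though the size of the deviation has not changed, which is why a plot of the residuals is the better guide to whether the deviation matters.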
__________________________________________
Svend Juul
Institut for Folkesundhed, Afdeling for Epidemiologi
(Institute of Public Health, Department of Epidemiology)
Vennelyst Boulevard 6
DK-8000 Aarhus C, Denmark
Phone: +45 8942 6090
Home: +45 8693 7796
Email: [email protected]
__________________________________________