--- Brendan <[email protected]> wrote:
> I am working with a dataset containing 30000 observations. Some of
> the explanatory variables are continuous. If I perform usual tests
> for normality the numbers are too great for swilk or for sfrancia,
> and if I use sktest the result is "absurdly" large values and rejects
> the hypothesis of normal distribution. The frequency histogram,
> cumulative frequency plot and normal plot all look normal with no
> outliers. I presume that with such large numbers even very small
> deviations from normal will lead to a significant result. The box-
> tidwell test indicates that the model relationship is linear for all
> these continuous variables. Is it safe to ignore the sktest results?
Whether an explanatory variable has a linear or a non-linear effect on the
log odds does not depend on how that variable is distributed: either kind
of effect can occur when the variable is normally distributed, and either
can occur when it is not. Logistic regression makes no assumption about
the distribution of the explanatory variables, so the normality tests were
irrelevant for your purpose to begin with, and you can ignore the sktest
results.
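A small simulation may make this concrete. The sketch below only illustrates the point and does not use your data: the sample size matches yours, but the chi-squared distribution for x, the coefficients -2 and 0.5, and the seed are arbitrary assumptions. It generates a clearly skewed x whose effect on the log odds is linear by construction, then checks normality and linearity separately.

```stata
* Illustration only: distribution, coefficients and seed are assumed values
clear
set seed 12345
set obs 30000

* a strongly right-skewed (non-normal) explanatory variable
generate x = rchi2(3)

* binary outcome whose log odds are linear in x by construction
generate y = runiform() < invlogit(-2 + 0.5*x)

* normality test on x: expected to reject, since x is skewed
sktest x

* linearity check: add a squared term and test whether it is needed
generate x2 = x^2
logit y x x2
test x2
```

With this setup sktest should reject normality of x, while the test on x2 should usually not reject, because the true effect is linear. Swapping rchi2(3) for rnormal() and adding a term in x2 to the invlogit() argument gives the opposite case: a normal x with a non-linear effect.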
-- Maarten
-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434
+31 20 5986715
http://home.fsw.vu.nl/m.buis/
-----------------------------------------