I can't see your variable to comment, but these
results don't surprise me.
If you run
sysuse auto
foreach v of var price-gear {
    qui swilk `v' if foreign
    di "`v' {col 20}" %4.3f r(p)
}
you will get this:
price              0.004
mpg                0.495
rep78              0.293
headroom           0.940
trunk              0.809
weight             0.026
length             0.813
turn               0.996
displacement       0.083
gear_ratio         0.013
If you then follow up, as you did,
with say -qnorm-, then (even
with a sample size this low, 22,
chosen to be of the same order as
your example) you will see that
a low P-value can correspond both to
variables which look as if they should
be transformed and to variables which,
to be sure, don't look exactly normal
but would probably not be problematic
for -anova-. In short, "looks as if it
isn't normal" is not the same as "looks
as if it would be problematic".
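As a rough sketch of that follow-up, assuming the same auto data and the same foreign subsample, something like this puts two low-P-value variables next to one with a high P-value. The graph names are arbitrary labels of mine, used only so that each plot stays open:

```stata
sysuse auto, clear

* Two variables with low Shapiro-Wilk P-values in the list above
qnorm price if foreign, name(q_price, replace)
qnorm gear_ratio if foreign, name(q_gear, replace)

* One with a high P-value, for comparison
qnorm turn if foreign, name(q_turn, replace)
```

Which of these, if any, would actually cause trouble for -anova- remains a judgement made by looking at the plots; the P-values alone don't settle it.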
In any case I would put more emphasis
on choosing response scale on scientific
or substantive grounds than because of this
normality assumption (which, additionally,
is about errors, not responses). The
manual entry [R] diagnostic plots points
to Rupert Miller's book, which is excellent
reading for this area.
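Since the assumption concerns errors, one possible check is to fit the model first and then look at the residuals. This is only an illustration on the auto data; the choice of mpg and foreign, and the variable name res, are my assumptions and not taken from your analysis:

```stata
sysuse auto, clear

* Fit a one-way model, then examine the residuals rather than the response
anova mpg foreign
predict double res, residuals

qnorm res
swilk res
```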
One of many merits of -glm- is that it lets you decouple the
questions of response scale and error distribution.
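A minimal sketch of what that decoupling can look like, again on the auto data purely for illustration (the variable lprice and the choice of price and foreign are my assumptions):

```stata
sysuse auto, clear

* Transforming the response: mean and errors both live on the log scale
gen double lprice = ln(price)
regress lprice foreign

* Log link, Gaussian family: the mean is modelled on the log scale,
* but errors are taken as normal on the original scale
glm price foreign, link(log) family(gaussian)

* Same link with a different error family
glm price foreign, link(log) family(gamma)
```

The link and the family are chosen separately, so settling on a log scale for the mean does not commit you to any particular error distribution.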
Nick
[email protected]
Karamjit Shad
> Prior to carrying out an anova I tested my data for normality and
> some of the data was non-normal. Ladder suggested a log
> transformation would be suitable. I then checked the transformed
> data using swilk and the data is still non-normal. However sfrancia
> indicates that it is normal.
> . swilk igg60 if group==3
>
>                  Shapiro-Wilk W test for normal data
>     Variable |    Obs       W          V         z      Prob>z
> -------------+-------------------------------------------------
>        igg60 |     30    0.74827     8.001     4.300   0.00001
>
> . swilk ligg60 if group==3
>
>                  Shapiro-Wilk W test for normal data
>     Variable |    Obs       W          V         z      Prob>z
> -------------+-------------------------------------------------
>       ligg60 |     30    0.91745     2.624     1.995   0.02305
>
> . sfrancia ligg60 if group==3
>
>                Shapiro-Francia W' test for normal data
>     Variable |    Obs       W'         V'        z      Prob>z
> -------------+-------------------------------------------------
>       ligg60 |     30    0.93170     2.398     1.600   0.05479
>
> a qnorm plot shows the data to "gently" oscillate about the normal
> distribution but nothing that would worry me too much.
> My question is what test should I use for testing for normality in
> this situation - or should I just use a non-parametric analysis.