Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Interpreting Shapiro-Wilk-Test (swilk)

From	Nick Cox <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: Interpreting Shapiro-Wilk-Test (swilk)
Date	Tue, 10 Sep 2013 12:03:15 +0100

This is what Shapiro-Wilk, and more generally any significance test,
does. It answers the question: is there enough evidence of
non-normality to reject the null hypothesis? The answer in your
case is yes. But with a sample size that big, even unimportant
deviations from normality end up significant. The -qnorm- evidence is
likely to be more compelling. If you want a slightly more formal
procedure, check out -qenv- from SSC.
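
The sample-size point can be illustrated with simulated data. What
follows is only a sketch, not the poster's data: the variable names
-z- and -t-, the seed, and the choice of a t distribution with 10
degrees of freedom as a "mildly non-normal" example are all
assumptions made for illustration. With 2830 observations one would
expect -swilk- to reject for -t- much more readily than in a subsample
of 200, even though W stays close to 1 in both cases.

```stata
* Simulated illustration only; none of this is the poster's data
clear
set seed 12345
set obs 2830                      // same n as in the original post

generate double z = rnormal()     // normal by construction
generate double t = rt(10)        // slightly heavier tails than normal

swilk z t                         // full sample
swilk z t in 1/200                // same variables, small subsample

qnorm t                           // judge the size of the deviation by eye

* -qenv- is user-written; install it and read its help for the syntax
ssc install qenv
help qenv
```

The point of the comparison is that the p-value reflects sample size
as well as the size of the deviation, which is why the graph is
usually the better guide to whether the deviation matters.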

It's perhaps more important to be clear as far as possible that your
assumption of functional form is correct, e.g. by plotting residuals
versus each predictor.
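
A minimal sketch of such plots, assuming the -xtreg, fe- model from
the original post has just been fitted. The new variable name -e_it-,
the graph names, and the three predictors picked for the loop are
choices made here for illustration; extend the varlist to the other
predictors as needed.

```stata
* Run immediately after the -xtreg ..., fe- command from the original post
predict double e_it, e            // idiosyncratic residual e_it

* One residual-versus-predictor plot per variable, with a lowess
* smooth to make any systematic curvature easier to see
foreach v of varlist slnage scfratio sleverage {
    twoway (scatter e_it `v') (lowess e_it `v'), ///
        yline(0) name(res_`v', replace)
}
```

A smooth that wanders clearly away from the zero line would suggest
that the functional form for that predictor needs attention.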

Nick
[email protected]


On 10 September 2013 11:18, Christian Schroetel
<[email protected]> wrote:
> Dear statistics and/or Stata experts,
>
> Usually I work things out myself using FAQs, help files and the
> like. But this time I'm quite confused, so I decided to take the
> chance to get help via Statalist. It's my first time, though.
>
> For my master thesis, I'm performing several regressions on panel data
> with up to 13 independent variables (xtreg). To check for normality of
> the residuals, I did the following:
>
> - xtreg sgrowth l.sgrowth l.slnsales slnage scfratio  srdintensity
> sleverage spersonalpremium sintangibles sinternationalsales sroa
> stobinsq sclr scurrentratio, fe -
> - predict r, ue -
> - kdensity r, normal -
> - iqr r -
> - swilk r -
>
> The kdensity graph for the combined residual looks quite close to
> normal; iqr doesn't show any severe outliers and flags fewer than
> 0.5% of observations as mild outliers, which also suggests a fairly
> symmetric distribution of the residuals. Also, pnorm and qnorm give
> decent graphs indicating normality. But now I get the following
> result with swilk:
>
>
>     Variable |    Obs       W           V         z       Prob>z
> -------------+--------------------------------------------------
>            r |   2830    0.99739      4.245     3.723    0.00010
>
> As far as I know, a high value of W should indicate normality, so,
> again, that would say I've got normally distributed residuals.
> Nevertheless, the p-value indicates rejection of the null hypothesis
> of normality.
> So, now I wonder whether I'm doing something wrong or whether I should
> just not pay too much attention to the p-value. It's just that I'd
> like to say in my thesis that the Shapiro-Wilk test indicates
> normality, which I probably couldn't the way it is now. Or, put
> differently, what could be the reason for the low p-value combined
> with a high value for W?
>
> I'd appreciate any thoughts, comments or help on that issue and thank
> you in advance for your efforts.
>
> Best regards
>
> Christian
>
> Btw: I tried all the same with - predict r, e - (as I didn't exactly
> know which way to test for normality of the residuals) and - sfrancia
> r - delivering similar results.
