The Stata listserver

Re: st: RE: sktest interpretation


From   "M. Haider Hussain" <[email protected]>
To   [email protected]
Subject   Re: st: RE: sktest interpretation
Date   Thu, 8 Dec 2005 09:33:00 +0500

Thank you very much Dr. Cox for such an elaborate answer. It certainly
helps me.

Haider Hussain
Social Policy and Development Center (SPDC)
Karachi, Pakistan.


On 12/7/05, Nick Cox <[email protected]> wrote:
> This is a fairly common question on Statalist.
>
> Missings are irrelevant to -sktest-, and
> are just ignored, so that is no problem. However,
> the fact that you got missings may or may not
> indicate some much deeper problem, but that's
> for you to consider.
>
> -sktest- is here rejecting a null hypothesis
> of normality. With your sample sizes, this is
> totally unsurprising. You are being told that
> your sample is large enough to distinguish
> between "genuine" non-normality and "apparent"
> non-normality that is just the sampling
> fluctuation that would occur if the underlying distribution
> really were normal. However, with your
> sample sizes, the kind of non-normality at
> which -sktest- squawks would not necessarily
> trouble any data analyst with experience.
>
> It is salutary to cycle through the numeric
> variables in Stata's auto data and look at -sktest-
> results. Here n = 74 is much smaller than yours,
> but -sktest- often reports rejection on what
> graphical analysis will reveal as an unproblematic
> distribution. For example, -sktest- may reject if a
> variable is shorter-tailed than normal.
> It may reject if a variable is somewhat
> irregular in distribution, but otherwise
> not problematic. In a word, it is typically
> over-sensitive for the practical problem.
>
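[Illustration added in editing, not part of the original post.] A minimal sketch of the exercise Nick describes: loop over the numeric variables in Stata's auto data and run -sktest- on each one. It assumes only the shipped auto dataset and the official commands -ds-, -sktest- and -qnorm-; the graph names q_* are made up for this example.

```stata
* Sketch only: compare -sktest- verdicts with normal quantile plots
sysuse auto, clear

* collect the numeric variables (this skips the string variable make)
ds, has(type numeric)

foreach v of varlist `r(varlist)' {
    * test of normality based on skewness and kurtosis
    sktest `v'
    * normal quantile plot of the same variable, kept in memory by name
    qnorm `v', name(q_`v', replace)
}
```

Looking at each plot next to its test result shows which rejections correspond to a distribution that would actually worry you.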
> Any test in this area still leaves the question
> of measuring, or more generally assessing,
> the kind of non-normality you have and of
> deciding whether non-normality is really a
> problem for what you are doing. A direct
> calculation of moments (or alternative
> measures such as L-moments) is sometimes
> helpful here.
>
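[Illustration added in editing, not part of the original post.] A sketch of the direct calculation of moments mentioned above, assuming the residuals are stored in the variable ewhe as in the original question. -summarize, detail- is official Stata; the commented-out -lmoments- lines assume the user-written package of that name is available from SSC.

```stata
* Sketch only: moment-based measures for the residual variable ewhe
summarize ewhe, detail

* skewness and kurtosis are left behind as saved results
display "skewness = " r(skewness) "   kurtosis = " r(kurtosis)

* Assumption: a user-written -lmoments- command can be installed from SSC
* ssc install lmoments
* lmoments ewhe
```

For a normal distribution the skewness is 0 and the kurtosis is 3, so the size of the departures can be read directly, rather than just a P-value.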
> The issue of -sktest- versus a Jarque-Bera
> test is also secondary. Jarque-Bera typically
> seems to mean using asymptotic sampling distributions
> for skewness and kurtosis for a problem
> in which they are often a poor approximation.
> (Also, Jarque and Bera just reinvented a very old
> test. Why they got credit for that is mysterious,
> except on the hypothesis that people have no
> time for proper reading.) -sktest- is, more or less,
> Jarque-Bera done better with adjustments for sample size.
> My guess would be that it would make no difference
> in your case.
>
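[Note added in editing, not part of the original post.] For reference, the usual textbook form of the Jarque-Bera statistic, with n the sample size, S the sample skewness and K the sample kurtosis (the symbols are chosen here, not taken from the thread):

```latex
\[
  \mathrm{JB} \;=\; \frac{n}{6}\left( S^{2} + \frac{(K-3)^{2}}{4} \right)
\]
```

Under normality JB is referred to a chi-square distribution with 2 degrees of freedom, which is the asymptotic approximation discussed above. Because the statistic is proportional to n, even small values of S and K - 3 produce a large JB when n is in the thousands.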
> Graphical examination of your residuals
> with -qnorm- will teach you far more about
> their (non-)normality than a -sktest-. The
> only practical reason for using -sktest-
> is that you are obliged to use it
> by instruction from someone in power over you,
> namely an advisor, boss, reviewer or journal editor.
>
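[Illustration added in editing, not part of the original post.] A sketch of the graphical check recommended above. The original model is not shown in the thread, so this uses a stand-in regression on the auto data; the variable name res is made up for the example.

```stata
* Sketch only: a stand-in regression, not the poster's model
sysuse auto, clear
regress price mpg weight

* residuals for the estimation sample
predict double res if e(sample), residuals

* normal quantile plot: points close to the line suggest near-normality
qnorm res

* the test, for comparison with what the plot shows
sktest res
```

The plot shows where and how the residuals depart from normality (tails, skewness, outliers), which the test's P-value cannot.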
> Another detail is that -sktest- does not know
> that your variable is a residual and makes no
> adjustment for that fact. A wild guess is that
> this is just a purist issue in your case.
>
> Nick
> [email protected]
>
> M. Haider Hussain
>
> > Sorry for such a novice-level question.
> >
> > I ran an OLS regression with 15 regressors and 14831 observations. In
> > this process, 437 missing values were generated. Then I tested the
> > normality of the residuals using sktest, and it returned the following
> > output.
> >
> > Variable |  Pr(Skewness)   Pr(Kurtosis)  adj chi2(2)    Prob>chi2
> > -----------------------------------------------------------------
> >     ewhe |        0.000             0.000               .            .
> >
> > whereas, sktest with noadjust option returned the following output
> >
> > Variable |  Pr(Skewness)   Pr(Kurtosis)  adj chi2(2)    Prob>chi2
> > -----------------------------------------------------------------
> >     ewhe |        0.000             0.000           3693.33       0.0000
> >
> >
> > Where are the chi2 statistics in the first instance? Does it mean
> > that sktest (without the noadjust option) is sensitive to the missing
> > values? Can I use the JB test with 14000+ observations? If not, then
> > what other "quantitative" tests are available?
> > (Or am I misinterpreting something?)
>
