-sktest- was mentioned on the list.
This raised my interest once again in the
question of the utility of these tests
for normality (Gaussianity).
The main original idea seems to have
been that, when sampling from a Gaussian,

    skewness - 0
    ------------
       its se

and

    kurtosis - 3
    ------------
       its se

are themselves unit Gaussian. Hence
the sum of the squares of these statistics
is chi-square with 2 df. However, these
are large sample results and the
convergence on Gaussianity of the
individual statistics can be very slow.
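
For anyone who wants to see the large-sample calculation spelled out, here is a minimal sketch for one variable. It assumes the usual asymptotic standard errors, sqrt(6/n) for skewness and sqrt(24/n) for kurtosis, which are my reading of "its se" above and are not stated there. The scalar names are made up for the illustration.

```stata
* Large-sample calculation by hand for one variable.
* Assumption: se(skewness) = sqrt(6/n), se(kurtosis) = sqrt(24/n),
* the usual asymptotic values under Gaussian sampling.
sysuse auto, clear
quietly summarize trunk if rep78 < ., detail
scalar sk_n  = r(N)
scalar sk_zs = (r(skewness) - 0) / sqrt(6 / sk_n)
scalar sk_zk = (r(kurtosis) - 3) / sqrt(24 / sk_n)
* sum of squares, referred to chi-square with 2 df
scalar sk_x2 = sk_zs^2 + sk_zk^2
display as txt "z(skewness)   = " as res %6.3f sk_zs
display as txt "z(kurtosis)   = " as res %6.3f sk_zk
display as txt "chi-square(2) = " as res %6.3f sk_x2 ///
    as txt "   P = " as res %5.3f chi2tail(2, sk_x2)
```

The two z statistics keep their signs, which the summed squares throw away.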
-sktest- builds in various adjustments
for sample size. In contrast, the
user-written -jb6- uses the large
sample results regardless. As said earlier,
it is difficult to justify the use of -jb6-. I guess
it continues to be downloaded from SSC
because the buzzwords "Jarque-Bera test"
are familiar to some groups and it is
not fully realised that the official -sktest-
is a much better program.
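
A rough way to check how far the large-sample results can be trusted at a given sample size is to simulate from a Gaussian and count rejections. The sketch below is not -jb6- itself: it uses the textbook large-sample statistic with the asymptotic standard errors sqrt(6/n) and sqrt(24/n), and the program name -jbsim-, the sample size of 30, the seed and the number of replications are all arbitrary choices of mine.

```stata
clear all
set seed 2718

* One replication: a Gaussian sample of size 30 and the
* large-sample P-value (assumed asymptotic standard errors).
program define jbsim, rclass
    drop _all
    set obs 30
    generate x = rnormal()
    quietly summarize x, detail
    return scalar P = chi2tail(2, ///
        r(N) * (r(skewness)^2 / 6 + (r(kurtosis) - 3)^2 / 24))
end

simulate P = r(P), reps(2000) nodots: jbsim

* Share of replications rejecting at a nominal 5% level
quietly count if P < 0.05
display as txt "rejection rate: " as res %5.3f r(N)/_N
```

If the large-sample approximation were good at this sample size, the rejection rate would be close to 0.05. Changing the -set obs- line shows how that depends on n.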
However, my main concern is how
these tests behave with real data and how
they might help with their analysis.
Reaching for the auto data, we can
get a condensed display of -sktest-
results by

foreach v of var price-for {
    qui sktest `v' if rep78 < .
    di as txt "`v'{col 20}" ///
        as res %5.3f r(P_skew) " " ///
        %5.3f r(P_kurt) " " ///
        %5.3f r(P_chi2)
}

price              0.000 0.009 0.000
mpg                0.001 0.081 0.004
rep78              0.833 0.747 0.929
headroom           0.471 0.033 0.082
trunk              0.872 0.039 0.112
weight             0.664 0.013 0.050
length             0.780 0.004 0.023
turn               0.794 0.079 0.192
displacement       0.043 0.201 0.063
gear_ratio         0.309 0.021 0.051
foreign            0.005 0.000 0.000

It is instructive now to cycle through
a series of -qnorm- and/or -histogram-s
and also to calculate skewness and kurtosis
themselves.
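
One possible way to do that cycling with official commands only is sketched here. The graph names (qn_*, hi_*) are arbitrary, and -tabstat- is just one route to the moments. The loop opens two graphs per variable, so expect a lot of windows.

```stata
sysuse auto, clear

* Normal quantile plot and histogram for each variable,
* on the same observations as the -sktest- loop
foreach v of varlist price-foreign {
    qnorm `v' if rep78 < ., name(qn_`v', replace) title("`v'")
    histogram `v' if rep78 < ., normal name(hi_`v', replace)
}

* Skewness and kurtosis themselves, one row per variable
tabstat price-foreign if rep78 < ., ///
    statistics(n mean sd skewness kurtosis) ///
    columns(statistics) format(%9.3f)
```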
One interesting variable is -trunk-,
which is discussed in [R] sktest,
where it is stated that the tails
are too thick (the kurtosis is too
high). As my old economics teacher
used to say, "Even Homer sometimes nods".
Looking at graphs, and also at the moments,
using a private domain program, shows that
this interpretation is backwards:
. moments price-for
-------------------------------------------------------------
      n = 69 |       mean          SD    skewness    kurtosis
-------------+-----------------------------------------------
       price |   6146.043    2912.440       1.688       5.032
         mpg |     21.290       5.866       0.995       3.997
       rep78 |      3.406       0.990      -0.057       2.678
    headroom |      3.000       0.853       0.197       2.144
       trunk |     13.928       4.343      -0.044       2.159
      weight |   3032.029     792.852       0.118       2.073
      length |    188.290      22.747      -0.076       2.000
        turn |     39.797       4.441       0.071       2.228
displacement |    198.000      93.148       0.581       2.354
  gear_ratio |      2.999       0.463       0.279       2.109
     foreign |      0.304       0.464       0.850       1.723
-------------------------------------------------------------

-trunk- in fact has _low_ kurtosis -- it is short- or light-tailed --
and the P-value reflects the fact that the test statistic,
a sum of squares, is constructed to measure
non-normality, and in particular kurtosis differing from 3 in either
direction.
-sktest- is doing what it is designed to do, but in practice one
kind of deviation from normality (skewness and/or heavy
tails) is much more likely to be problematic than the
other (short/light tails).
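
Since the P-value alone cannot say which way kurtosis departs from 3, one hedge against misreading it is to print the direction alongside. This is only a sketch: the "heavy"/"light" labels are my own shorthand, and r(P_kurt) is the result name used in the loop earlier in this post, which may differ in other releases of Stata.

```stata
sysuse auto, clear
foreach v of varlist price-foreign {
    quietly sktest `v' if rep78 < .
    local P = r(P_kurt)
    quietly summarize `v' if rep78 < ., detail
    * kurtosis above 3 is heavy-tailed, below 3 light-tailed,
    * relative to the Gaussian
    local dir = cond(r(kurtosis) > 3, "heavy", "light")
    display as txt "`v'{col 20}" as res %5.3f `P' " " ///
        %6.3f r(kurtosis) as txt "  `dir'"
}
```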
The moral is very simple and perhaps too elementary:
these tests can easily be misinterpreted if you don't also
look at the data.
Nick