Nick is bang on about the unsuitability (non-suitability? anti-suitability?)
of testing the kernel density for the purpose of drawing conclusions about
the original variable. After all, the kernel density is a mathematical
construct based on a host of assumptions that determine its form. Clearly,
it would be a mistake to suppose that a test of normality on the values of
the kernel density has much to say about the normality of the original
variable. But suppose, as I do, that Alejandro's reason for testing the
kernel density isn't to draw conclusions about the normality of the
original variable, but to determine how closely the kernel density
approximates the normal distribution. Then wouldn't he be able to gather
some useful information from -qnorm- and -sktest-?
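
For anyone who wants to see Nick's point about kernel width for themselves,
here is a minimal sketch. It is only an illustration under my own
assumptions: the auto dataset shipped with Stata, the variable mpg, and the
two widths 1 and 10 are arbitrary choices, not anything from Alejandro's
data. The option names are those of current Stata releases; older releases
spelled the kernel and width options differently.

```stata
* Illustrative data only: the auto dataset and mpg are assumptions
sysuse auto, clear

* Narrow Gaussian kernel: the estimate follows the sample closely.
* The normal option overlays a normal density for comparison.
kdensity mpg, kernel(gaussian) bwidth(1) normal name(narrow, replace)

* Wide Gaussian kernel: the estimate is much smoother and closer to
* Gaussian in shape, although more spread out than the overlaid normal,
* because the kernel's own variance is added to that of the data.
kdensity mpg, kernel(gaussian) bwidth(10) normal name(wide, replace)

* Normal quantile plot and skewness-kurtosis test on the original data
qnorm mpg, name(qn, replace)
sktest mpg
```

Comparing the two -kdensity- graphs shows how far the apparent shape depends
on the chosen width rather than on the sample alone.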
Best,
Lee
Lee Sieswerda, Epidemiologist
Thunder Bay District Health Unit
999 Balmoral Street
Thunder Bay, Ontario
Canada P7B 6E7
Tel: +1 (807) 625-5957
Fax: +1 (807) 623-2369
[email protected]
www.tbdhu.com
> -----Original Message-----
> From: Nick Cox [SMTP:[email protected]]
> Sent: Monday, September 09, 2002 9:24 AM
> To: [email protected]
> Subject: st: RE: RE: distribution of the kernel density
>
> Alejandro Riano
>
> > > Does anyone know of a statistic which I could use to assess if the
> > > kernel density for a given variable follows a normal distribution?
>
> Lee Sieswerda
> >
> > How about:
> > kdensity var1, gen(x y)
> > qnorm y
> > sktest y
>
> The second variable which -kdensity- generates
> is a density, which is not a suitable input
> for -qnorm- or -sktest-.
>
> Setting that aside, passing
> data through a kernel density estimation
> command can add no information suitable
> for _any_ formal test of normality or
> non-normality, and at best it obscures
> the testing issue. To see this, note that
> if you choose a Gaussian kernel and
> increase its width, inevitably the
> "estimated" distribution approaches Gaussian
> form. More generally, any test would depend
> on which kernel you choose
> and on its width as much as on the
> sample data, which makes no sense.
>
> I support Lee's notion of using -qnorm-, but
> on the original data. Another plot
> which deserves a brief experiment
> is -dpplot- on SSC.
>
> For reasons often rehearsed on
> this list, -swilk-, -sfrancia-
> and -sktest- are in practice almost
> always less illuminating than graphs.
>
> Nick
> [email protected]
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/