From:    Maarten Buis <maartenlbuis@gmail.com>
To:      statalist@hsphsun2.harvard.edu
Subject: Re: st: sign test output
Date:    Thu, 17 Jan 2013 13:14:39 +0100
On Thu, Jan 17, 2013 at 11:21 AM, Nahla Betelmal wrote:
> From my readings in statistics, I know that in order to decide
> whether to use parametric or non-parametric tests, the normality
> of the data should be checked first.
>
> Shapiro-Wilk is used to test normality when the number of
> observations is less than 30. Otherwise, we should use
> Kolmogorov-Smirnov for large samples (as in my sample).

Unfortunately, that is incorrect. Normality tests need huge samples
before the p-value means what it is supposed to mean. An analogy I
have heard in a different context, but which applies to this
situation very well, is going out to sea in a rowing boat to check
whether the sea is safe for the QE II. Using a normality test with
only 346 observations is not a good idea.

Nick and I recently discussed the performance of tests for
Gaussianity on Statalist:
http://www.stata.com/statalist/archive/2012-09/msg01040.html
http://www.stata.com/statalist/archive/2012-09/msg01013.html

The bottom line was: you need somewhere between 10,000 and 100,000
observations before the tests we discussed (Jarque-Bera and
Doornik-Hansen) perform somewhat acceptably, and in such large
datasets you need to worry about whether deviations from Gaussianity
that are statistically significant are also substantively
significant.

I have adapted the simulation from that discussion for the
Kolmogorov-Smirnov test. It shows that the Kolmogorov-Smirnov test
does not perform acceptably at any of these sample sizes.

*------------------- begin simulation -------------------
clear all

program define sim, rclass
    // draw one Gaussian sample of 100,000 observations and apply
    // the KS test to the first 100, 1,000, 10,000, and 100,000
    drop _all
    set obs `=1e5'
    gen double x = rnormal()
    forvalues i = 2/5 {
        sum x in 1/`=1e`i''
        ksmirnov x = normal((x-r(mean))/r(sd)) in 1/`=1e`i''
        return scalar p`i'     = r(p)
        return scalar p_cor`i' = r(p_cor)
    }
end

simulate p2p=r(p2) p2c=r(p_cor2) ///
         p3p=r(p3) p3c=r(p_cor3) ///
         p4p=r(p4) p4c=r(p_cor4) ///
         p5p=r(p5) p5c=r(p_cor5) ///
         , reps(2e4): sim

// one row per replication and p-value type, labelled by sample size
gen id = _n
reshape long p2 p3 p4 p5, i(id) j(dist) string
label var p2 "N=100"
label var p3 "N=1,000"
label var p4 "N=10,000"
label var p5 "N=100,000"
gen byte distr = cond(dist=="p",1,2)
label define distr 1 "p-value"           ///
                   2 "corrected p-value", replace
label value distr distr

simpplot p?, by(distr) scheme(s2color) legend(cols(4))
*-------------------- end simulation --------------------

(For more examples I have sent to Statalist, see:
http://www.maartenbuis.nl/example_faq )

This simulation needs the -simpplot- package in order to run; you can
install it by typing -ssc install simpplot- in Stata.

> So, when the test accepts the null (normality),

A statistical test never accepts a null hypothesis; it can only fail
to reject it. This may sound pedantic, but the difference is
important: in the case of non-significance you have no evidence that
the null hypothesis is wrong, and an absence of evidence is not the
same thing as evidence of absence.

> Also, for the comment about robust, I meant exactly what I said
> (I used the robust term loosely)

It is probably best to avoid the term "robust", since it has a very
specific meaning in statistics. Actually, to make it more confusing,
it has multiple specific meanings.

-- Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany
http://www.maartenbuis.nl
---------------------------------
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
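
P.S. As a concrete illustration of the rowing-boat point (a minimal
sketch, not part of the original post: only the sample size of 346 is
taken from the thread, the seed and the simulated data are made up),
one can compare what a quantile plot and the Kolmogorov-Smirnov test
each say about a truly Gaussian sample of that size:

*-------------------- begin example ---------------------
* sketch: simulated Gaussian data, N = 346 as in the thread
clear
set seed 2013
set obs 346
gen double x = rnormal()
qnorm x                                    // graphical check by eye
sum x
ksmirnov x = normal((x - r(mean))/r(sd))   // the test discussed above
*--------------------- end example ----------------------

The quantile plot shows directly how far the sample quantiles stray
from the Gaussian ones, which is usually more informative at this
sample size than a single p-value.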
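
And since the subject of the thread is the sign test, here is a
minimal sketch of how a non-significant -signtest- result should be
read (the auto data and the hypothesized median of 20 are purely
illustrative, not from the thread):

*-------------------- begin example ---------------------
* a large p-value here means only that we fail to reject
* H0: median of mpg = 20; it does not show the median IS 20
sysuse auto, clear
signtest mpg = 20
*--------------------- end example ----------------------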