Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Margaret MacDougall <Margaret.MacDougall@ed.ac.uk> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Bootstrap sampling for evaluating hypothesis tests |
Date | Sat, 16 Mar 2013 08:21:29 +0000 |
Dear Maarten

Thanks for so kindly offering such a comprehensive reply. I look forward to exploring your suggestions.
Best wishes

Margaret

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr Margaret MacDougall
Medical Statistician and Researcher in Education
Centre for Population Health Sciences
University of Edinburgh Medical School
Teviot Place
Edinburgh EH8 9AG
Tel: +44 (0)131 650 3211
Fax: +44 (0)131 650 6909
E-mail: Margaret.MacDougall@ed.ac.uk
http://www.chs.med.ed.ac.uk/cphs/people/staffProfile.php?profile=mmacdoug

On 14/03/2013 10:28, Maarten Buis wrote:
On Wed, Mar 13, 2013 at 4:04 PM, Margaret MacDougall wrote:
> I would value receiving recommendations on literature explaining the
> application of bootstrap sampling to assess robustness to Type I errors
> of a proposed new hypothesis test. Better still, if the recommended
> references contain corresponding computer syntax!

In terms of literature references, I would look at bootstrap tests. A bootstrap test changes the data such that the null hypothesis is true and looks at the proportion of replications in which the test statistic is more extreme than the one observed in the original data. In bootstrap tests this proportion can be used as an estimate of the p-value(*), but you can also compare it with the asymptotic p-value returned by your test and see whether the two correspond.

It is useful to also consider the Monte Carlo confidence interval, which captures the variability you can expect in the proportion due to the fact that it is based on a random process. Say you find 1000 out of 20000 replications in which the test statistic was more extreme than the one in the original sample; then the Monte Carlo confidence interval can be computed by typing in Stata: -cii 20000 1000-

If you save the p-values from all replications, you can look at the distribution of the p-values, as I did in the examples I gave earlier.

Nice introductions to bootstrap tests can be found in Chapter 4 of (Davison & Hinkley 1997) and Chapter 16 of (Efron & Tibshirani 1993). They are both good introductory texts, and I found that they complement one another well, so it is useful to look at both of them. You can also find more Stata code examples in the manual entry for -bootstrap-: under "Remarks", go to the section titled "Achieved significance level", which gives an example of how to use -bootstrap- to do a bootstrap test.

Hope this helps,
Maarten

A.C. Davison and D.V. Hinkley (1997) Bootstrap Methods and their Application. Cambridge: Cambridge University Press.

B. Efron and R.J. Tibshirani (1993) An Introduction to the Bootstrap. Boca Raton: Chapman & Hall/CRC.

(*) Alternatively, for testing purposes it makes sense to use (the number of replications in which the test statistic is more extreme than the one observed in the original data + 1) / (the number of replications + 1); see Chapter 4 of (Davison & Hinkley 1997). For a large number of replications, though, the difference with the simple proportion is trivial.

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany
http://www.maartenbuis.nl
---------------------------------

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
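The mechanics described above translate directly into code. The sketch below is in Python rather than Stata, purely for illustration: it implements a bootstrap test of a one-sample mean, where the data are shifted so the null hypothesis holds, the proportion of resampled statistics at least as extreme as the observed one estimates the p-value, and the (r + 1)/(B + 1) adjustment from footnote (*) is computed alongside. The function names and toy data are my own inventions, and the normal-approximation interval is only a rough analogue of the exact binomial interval that Stata's -cii- reports.

```python
import math
import random

def t_stat(x, mu0=0.0):
    """One-sample t statistic for H0: mean == mu0."""
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((v - m) ** 2 for v in x) / (n - 1))
    return (m - mu0) / (s / math.sqrt(n))

def bootstrap_test(x, mu0=0.0, B=2000, seed=1):
    """Bootstrap test: resample from data shifted so the null is true,
    and count replications at least as extreme as the observed statistic."""
    rng = random.Random(seed)
    t_obs = abs(t_stat(x, mu0))
    m = sum(x) / len(x)
    centered = [v - m + mu0 for v in x]   # impose the null hypothesis
    r = sum(
        1 for _ in range(B)
        if abs(t_stat([rng.choice(centered) for _ in x])) >= t_obs
    )
    p_simple = r / B                      # proportion of more-extreme statistics
    p_adj = (r + 1) / (B + 1)             # Davison & Hinkley adjustment
    # Monte Carlo CI for the proportion (normal approximation; Stata's
    # -cii- reports an exact binomial interval, so numbers differ slightly)
    se = math.sqrt(p_simple * (1 - p_simple) / B)
    ci = (max(0.0, p_simple - 1.96 * se), min(1.0, p_simple + 1.96 * se))
    return p_simple, p_adj, ci

data = [0.3, -1.2, 0.8, 1.5, -0.4, 2.1, 0.9, -0.2, 1.1, 0.5]
p, p_adj, ci = bootstrap_test(data)
print(f"p = {p:.3f}, adjusted p = {p_adj:.3f}, "
      f"95% MC CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Comparing `p` (or `p_adj`) with the asymptotic p-value of the proposed test, and checking whether the latter falls inside the Monte Carlo interval, is one way to assess the robustness to Type I errors that the original question asks about.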
--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.