Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
AW: st: Resampling and compare full sample with subsamples
From: Johannes Thrul <[email protected]>
To: "[email protected]" <[email protected]>
Subject: AW: st: Resampling and compare full sample with subsamples
Date: Mon, 17 Mar 2014 19:47:39 +0000
Thank you Steve and sorry for the delayed response.
Could you do me a favor and briefly explain why you would prefer confidence intervals over hypothesis testing in this case?
Thanks, Johannes
---------------------------------------------------------------------------------------------
Ah, you left out most of the detail; your explanation makes sense. To
answer your original question: you want to compare a part A to a whole C.
But C = A U B, where B is the set of observations in C that are not in A.
Let pA and pB be the prevalence rates in A and B, and pC be the prevalence
in the whole. Then if nA, nB, and n are the sample sizes of A, B, and C,
(*) pC = W pA + (1 - W) pB where W = nA/n.
(**) pC - pA = (1-W)(pB - pA).
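As a quick numerical check of (*) and (**), here is a minimal Python sketch. The function name and the counts in the usage line are hypothetical and chosen only for illustration; they are not taken from the thread.

```python
from math import isclose

def prevalence_identity(xA, nA, xB, nB):
    """Return both sides of (*) and (**) for event counts xA, xB
    observed in the subsample A (size nA) and the remainder B (size nB)."""
    n = nA + nB                 # size of the whole sample C
    W = nA / n                  # share of C that falls in A
    pA = xA / nA                # prevalence in A
    pB = xB / nB                # prevalence in B
    pC = (xA + xB) / n          # prevalence in the whole sample C
    star = (pC, W * pA + (1 - W) * pB)          # both sides of (*)
    star2 = (pC - pA, (1 - W) * (pB - pA))      # both sides of (**)
    return star, star2

# Hypothetical counts: 30 of 200 in A, 90 of 800 in B.
star, star2 = prevalence_identity(30, 200, 90, 800)
print(isclose(star[0], star[1]), isclose(star2[0], star2[1]))
```

Because 1 - W is a fixed positive constant once nA and n are given, pC - pA is zero exactly when pB - pA is zero.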
A one-sample test comparing A to C is not correct: C is itself a random
sample, and pC and pA are correlated. As A is a simple random sample (SRS)
of C without replacement, B is also an SRS; pA and pB are slightly
negatively correlated because of (*).
If pA and pB are different, then pA and pC are different (and
vice versa). Looking at (**) you can see that the proper test is a
*two-sample* test that compares pA and pB. The standard error is
computed under the null hypothesis, and without a finite population
correction. (Cochran, 1977, problem 2.16, p. 48). I myself think that
confidence intervals are preferable to hypothesis tests here.
Reference: Cochran, W. G. (1977). Sampling techniques (3rd ed.). New
York: Wiley.
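The two-sample comparison described above can be sketched in Python as follows. This is an illustration under stated assumptions, not code from the thread: the function name and the example counts are hypothetical, the test uses the usual pooled standard error under the null with no finite population correction, and the confidence interval is a plain Wald interval with the unpooled standard error (my choice of interval, not something the post specifies). The interval for pC - pA is obtained by rescaling with (1 - W), following (**).

```python
from math import sqrt
from statistics import NormalDist

def compare_part_to_rest(xA, nA, xB, nB, level=0.95):
    """Two-sample z-test of pA vs pB plus Wald intervals.
    xA, xB are event counts; nA, nB are the sizes of A and B."""
    n = nA + nB
    W = nA / n
    pA, pB = xA / nA, xB / nB
    diff = pB - pA

    # Test: standard error computed under the null (pooled), no fpc.
    p_pool = (xA + xB) / n
    se_null = sqrt(p_pool * (1 - p_pool) * (1 / nA + 1 / nB))
    z = diff / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Interval: Wald CI for pB - pA (assumption: unpooled standard error).
    se_wald = sqrt(pA * (1 - pA) / nA + pB * (1 - pB) / nB)
    zcrit = NormalDist().inv_cdf(0.5 + level / 2)
    ci_diff = (diff - zcrit * se_wald, diff + zcrit * se_wald)

    # By (**), pC - pA = (1 - W)(pB - pA), so the interval rescales by 1 - W.
    ci_pc_pa = ((1 - W) * ci_diff[0], (1 - W) * ci_diff[1])

    return {"diff": diff, "z": z, "p_value": p_value,
            "ci_diff": ci_diff, "ci_pc_pa": ci_pc_pa}

# Hypothetical counts: 30 of 200 in A, 90 of 800 in B.
result = compare_part_to_rest(30, 200, 90, 800)
print(result)
```

The interval reports how far apart the two prevalences plausibly are, rather than only whether a difference is detectable.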
Steve
[email protected]