Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Resampling and compare full sample with subsamples
From
Steve Samuels <[email protected]>
To
[email protected]
Subject
Re: st: Resampling and compare full sample with subsamples
Date
Mon, 17 Mar 2014 19:35:46 -0400
Johannes:
Do I really think that there is *exactly* zero difference in the
prevalence rates from two parts of your population? No, I don't. To my
mind, the important question is "how different?". This question is the
one addressed by confidence intervals. Also it is the question you
appeared to ask when you stated that the purpose of your analysis is to
"give me an idea of what losing certain kinds of schools means for the
reliability of prevalence figures in other survey waves."
Steve
[email protected]
On Mar 17, 2014, at 3:47 PM, Johannes Thrul <[email protected]> wrote:
Thank you Steve and sorry for the delayed response.
Could you do me a favor and briefly explain why you would prefer confidence intervals over hypothesis testing in this case?
Thanks, Johannes
---------------------------------------------------------------------------------------------
Ah, you left out most of the detail; your explanation makes sense. To
answer your original question: you want to compare a part A to a whole C.
But C = A U B, where B consists of the observations in C that are not in A.
Let pA, pB, and pC be the prevalence rates in A, in B, and in the whole
sample C. Then, if nA, nB, and n are the sample sizes of A, B, and C,
(*) pC = W pA + (1 - W) pB where W = nA/n.
(**) pC - pA = (1-W)(pB - pA).
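A quick numerical check of (*) and (**) may help. The sketch below is in Python, and every count in it is hypothetical, chosen only for illustration and not taken from the survey discussed in this thread.

```python
# Hypothetical counts, chosen only for illustration (not from the thread).
nA, xA = 400, 72    # subsample A: size and number of cases
nB, xB = 600, 90    # remainder B: size and number of cases

n = nA + nB         # size of the whole sample C
pA, pB = xA / nA, xB / nB
pC = (xA + xB) / n  # prevalence in C computed directly from the counts
W = nA / n

# (*): the whole-sample prevalence is a weighted average of pA and pB.
star = W * pA + (1 - W) * pB

# (**): the C-minus-A difference is a fixed multiple of the B-minus-A difference.
double_star = (1 - W) * (pB - pA)

print("(*) holds: ", abs(pC - star) < 1e-12)
print("(**) holds:", abs((pC - pA) - double_star) < 1e-12)
```

Because W is a known constant, any statement about pC - pA is just a rescaled statement about pB - pA.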
A one-sample test comparing A to C is not correct: C is itself a random
sample, and pC and pA are correlated. Since A is a simple random sample
(SRS) of C drawn without replacement, B is also an SRS, and pA and pB are
slightly negatively correlated because of (*).
If pA and pB are different, then pA and pC are different (and
vice versa). Looking at (**), you can see that the proper test is a
*two-sample* test that compares pA and pB. The standard error is
computed under the null hypothesis and without a finite population
correction (Cochran, 1977, problem 2.16, p. 48). I myself think that
confidence intervals are preferable to hypothesis tests here.
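As a rough sketch of the two-sample comparison described above, the Python function below computes a z statistic using the pooled (null-hypothesis) standard error with no finite population correction. It also gives a confidence interval for pB - pA and rescales it by (1 - W) to cover pC - pA, following (**). The function name, the counts, the normal approximation, and the unpooled Wald interval are all assumptions made for illustration; the interval ignores the slight negative correlation between pA and pB.

```python
from math import sqrt
from statistics import NormalDist


def compare_a_with_rest(xA, nA, xB, nB, level=0.95):
    """Compare prevalence in A with prevalence in B (the part of C not in A).

    Hypothetical helper: xA and xB are case counts, nA and nB sample sizes.
    """
    pA, pB = xA / nA, xB / nB
    n = nA + nB
    W = nA / n
    diff = pB - pA

    # Test: standard error under the null hypothesis (pooled prevalence),
    # with no finite population correction.
    p_pool = (xA + xB) / n
    se_null = sqrt(p_pool * (1 - p_pool) * (1 / nA + 1 / nB))
    z = diff / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Interval: unpooled (Wald) standard error for pB - pA (an assumption).
    se_wald = sqrt(pA * (1 - pA) / nA + pB * (1 - pB) / nB)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    ci_b_minus_a = (diff - crit * se_wald, diff + crit * se_wald)

    # By (**), pC - pA = (1 - W)(pB - pA); W is a known constant, so the
    # interval for pB - pA can simply be rescaled.
    ci_c_minus_a = tuple((1 - W) * bound for bound in ci_b_minus_a)

    return {
        "diff": diff,
        "z": z,
        "p_value": p_value,
        "ci_b_minus_a": ci_b_minus_a,
        "ci_c_minus_a": ci_c_minus_a,
    }


# Hypothetical counts, for illustration only.
result = compare_a_with_rest(xA=72, nA=400, xB=90, nB=600)
for name, value in result.items():
    print(name, value)
```

The rescaled interval speaks directly to "how different?" for the part-versus-whole comparison, while the test itself is carried out on pA versus pB.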
Reference: Cochran, W. G. (1977). Sampling techniques (3rd ed.). New
York: Wiley.
Steve
[email protected]
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/