Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Resampling and compare full sample with subsamples

From	Steve Samuels <[email protected]>
To	[email protected]
Subject	Re: st: Resampling and compare full sample with subsamples
Date	Mon, 17 Mar 2014 19:35:46 -0400


Johannes:

Do I really think that there is *exactly* zero difference in the
prevalence rates from two parts of your population? No, I don't. To my
mind, the important question is "how different?". This question is the
one addressed by confidence intervals. Also it is the question you
appeared to ask when you stated that the purpose of your analysis is to
"give me an idea of what losing certain kinds of schools means for the
reliability of prevalence figures in other survey waves."


Steve
[email protected]


On Mar 17, 2014, at 3:47 PM, Johannes Thrul <[email protected]> wrote:

Thank you Steve and sorry for the delayed response. 

Could you do me a favor and briefly explain why you would prefer confidence intervals over hypothesis testing in this case?

Thanks, Johannes





---------------------------------------------------------------------------------------------

Ah, you left out most of the detail; your explanation makes sense. To
answer your original question: you want to compare a part A to a whole C.
But C = A U B, where B is the set of observations in C that are not in A.
Let pA, pB, and pC be the prevalence rates in A, B, and C. Then if nA, nB,
and n are the sample sizes of A, B, and C,

(*) pC = W pA + (1 - W) pB where W = nA/n.

(**) pC - pA = (1-W)(pB - pA).
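
Spelling out the step from (*) to (**), using only the symbols already
defined above (nothing new is assumed):

```latex
\begin{align*}
p_C - p_A &= W p_A + (1 - W) p_B - p_A \\
          &= (1 - W) p_B - (1 - W) p_A \\
          &= (1 - W)(p_B - p_A).
\end{align*}
```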

A one-sample test comparing A to C is not correct: C is itself a random
sample, and pC and pA are correlated. As A is a simple random sample (SRS)
of C drawn without replacement, B is also an SRS; pA and pB are slightly
negatively correlated because of (*).

If pA and pB are different, then pA and pC are different (and
vice-versa). Looking at (**) you can see that the proper test is a
*two-sample* test that compares pA and pB. The standard error is
computed under the null hypothesis, and without a finite population
correction. (Cochran, 1977, problem 2.16, p. 48). I myself think that
confidence intervals are preferable to hypothesis tests here.
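
To make the recipe above concrete, here is a minimal sketch in Python
(not Stata), using only the standard library. The function name, the
counts in the usage line, and the choice of a Wald interval are
illustrative assumptions, not part of the original advice. The test uses
the pooled (null-hypothesis) standard error with no finite population
correction; the interval treats pA and pB as independent and W as fixed,
then rescales by (1 - W) as in (**).

```python
import math


def compare_part_to_whole(x_a, n_a, x_b, n_b, z_crit=1.959964):
    """Compare part A with whole C = A U B via a two-sample test of pA vs pB.

    x_a, x_b: number of cases in A and B; n_a, n_b: sample sizes.
    z_crit: normal critical value (default is roughly the 97.5th percentile).
    """
    n = n_a + n_b
    p_a = x_a / n_a
    p_b = x_b / n_b
    w = n_a / n
    p_c = w * p_a + (1 - w) * p_b  # identity (*)

    # Standard error under the null pA = pB: the pooled proportion equals pC.
    # No finite population correction is applied.
    se_null = math.sqrt(p_c * (1 - p_c) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se_null
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value

    # Assumption: Wald interval for pB - pA with an unpooled standard error,
    # treating pA and pB as independent, rescaled by (1 - W) via (**).
    se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    ci_c_minus_a = (
        (1 - w) * (diff - z_crit * se_diff),
        (1 - w) * (diff + z_crit * se_diff),
    )

    return {
        "p_a": p_a, "p_b": p_b, "p_c": p_c, "w": w,
        "z": z, "p_value": p_value, "ci_c_minus_a": ci_c_minus_a,
    }


# Hypothetical counts, for illustration only.
result = compare_part_to_whole(x_a=30, n_a=200, x_b=60, n_b=300)
print(result)
```

The interval for pC - pA answers the "how different?" question directly,
whereas the z statistic only addresses whether the difference is exactly
zero.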

Reference: Cochran, W. G. (1977). Sampling techniques (3rd ed.). New
York: Wiley.


Steve
[email protected]

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

