Statalist The Stata Listserver



Re: st: statistical test to compare two survey means from two estimating equations


From   [email protected]
To   [email protected]
Subject   Re: st: statistical test to compare two survey means from two estimating equations
Date   Tue, 5 Dec 2006 15:00:07 -0500

(1) Regarding the difference in p-values, all this seems to suggest is 
that the -test- implemented by your first procedure is the same as the 
-test- implemented after -svy:reg- and -suest-, and is not the same as 
the -test- implemented after -svy:logit-.  If you reran the first -suest- 
and -test- commands using svy:reg to compare the two subgroups, rather 
than one subgroup to the total, you'd get exactly the same results as 
your first procedure.  So it seems quite reasonable that the current 
-suest- and -test- results for -svy:reg- are almost identical to your 
first test.  The question then seems to be: For the comparison at hand 
using a dichotomous outcome, is the -test- following the first two 
procedures more appropriate than the -test- following the last procedure 
(svy:logit)?  Had the outcome been continuous, the issue of gross 
differences in p-values would likely not have come up.


(2) Regarding the following question:

"I reiterate my original concern and ask if there is no "statistical 
difference between MI and non-MI," but there is a "statistical difference 
between MI and the nation" (the USA, being the one that contains 
Statacorp) or vice versa, what should we conclude?"

I doubt that your scenario would arise in practice, but wouldn't it depend 
on the proportion of the total sample represented by MI?
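
To make the dependence on the subgroup's share concrete, here is a minimal 
sketch in Stata.  The proportions and shares are made-up values for 
illustration only (they are not estimates from this thread), and the scalar 
names p_sub, p_rest, and p_all are hypothetical.  Because the total is a 
share-weighted mix of the subgroup and the rest, the subgroup-vs-total gap 
equals (1 - share) times the subgroup-vs-rest gap:

```stata
* Illustrative values only (assumptions, not estimates from the thread)
scalar p_sub  = 0.10
scalar p_rest = 0.07

* The gap to the balance of the population does not depend on the share
display "subgroup vs. rest:  " %6.4f (p_sub - p_rest)

* The gap to the total shrinks as the subgroup makes up more of the total
foreach w of numlist 0.05 0.25 0.50 {
    * Total proportion is a share-weighted mix of subgroup and rest
    scalar p_all = `w'*p_sub + (1 - `w')*p_rest
    display "share = " %4.2f `w' "   subgroup vs. total: " %6.4f (p_sub - p_all)
}
```

This shows only the point estimates.  The standard errors of the two 
differences also differ, so the two tests need not agree.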

Nonetheless, there are cases where it makes sense to compare a subgroup of 
cases to all cases rather than the remaining cases.  The pattern of 
results would likely be opposite to what you propose in your question. 
Let's say we want to look at whether nonresponse in a survey leads to 
nonresponse bias in one's estimate of a proportion in the population, say 
the proportion with affective disorders.  Nonresponse is a necessary but 
not sufficient condition for nonresponse bias, so we need to test this.  
So does nonresponse lead to nonresponse bias in our estimate based on 
those who participated? 

Let's say that 10% of the selected sample doesn't participate, but we do 
some intensive follow-up with a sample of nonrespondents to get an 
estimate of the proportion.  To see if nonresponse bias exists, we *do 
not* want to compare the following:

 p(affective disorder | respondents) vs. p(affective disorder | 
nonrespondents)

It may very well be that the proportion with affective disorders is 
significantly higher among nonrespondents than among respondents.  But 
this does not address the issue of whether nonresponse bias exists.  What 
we want to compare is:

 p(affective disorder | respondents & nonrespondents) vs. p(affective 
disorder | respondents)

This may not be significantly (statistically or practically) different, 
even though the first comparison is, thereby suggesting no significant 
nonresponse bias. 


Now let's say that 35% of the selected sample doesn't participate.  We may 
find that both comparisons are significantly different:

 p(affective disorder | respondents) vs. p(affective disorder | 
nonrespondents)

 p(affective disorder | respondents & nonrespondents) vs. p(affective 
disorder | respondents)

Nonetheless, it is only the latter comparison that directly addresses the 
issue of nonresponse bias.
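
As a rough numeric sketch of the two scenarios, the Stata lines below use 
the 10% and 35% nonresponse rates from the text, but the two proportions 
(p_resp and p_nonresp) are assumed values chosen only for illustration.  
The respondent-vs-nonrespondent gap is the same in both scenarios, while 
the respondent-vs-full-sample gap, which is the nonresponse bias in the 
point estimate, grows with the nonresponse rate:

```stata
* Illustrative proportions only (assumed, not from any survey)
scalar p_resp    = 0.08
scalar p_nonresp = 0.12

* Nonresponse rates of 10% and 35%, as in the two scenarios in the text
foreach nr of numlist 0.10 0.35 {
    * Full-sample proportion: respondents and nonrespondents combined
    scalar p_full = (1 - `nr')*p_resp + `nr'*p_nonresp
    display "nonresponse = " %4.2f `nr' ///
        "   resp vs. nonresp: " %7.4f (p_resp - p_nonresp) ///
        "   resp vs. full sample: " %7.4f (p_resp - p_full)
}
```

Whether either gap is statistically significant depends on the standard 
errors, which this sketch does not attempt to compute.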








"Austin Nichols" <[email protected]> 
Sent by: [email protected]
Date: 12/05/2006 01:21 PM
Please respond to: [email protected]
To: [email protected]
Subject: Re: st: statistical test to compare two survey means from two estimating equations

In summary:

Brent Fulton <[email protected]> asked How can one "compare the
survey-based means" for a subpop to the whole pop?
I <[email protected]> advised him to compare the subpop to the
balance of the pop.
Michael Frone <[email protected]> wrote "How about -suest-
followed by -test-"

But note that the various options outlined can lead to different
answers, as demonstrated below. I reiterate my original concern and
ask if there is no "statistical difference between MI and non-MI," but
there is a "statistical difference between MI and the nation" (the
USA, being the one that contains Statacorp) or vice versa, what should
we conclude?

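* Load the NHANES II example data and set the age cutoff for the subpopulation
webuse nhanes2
local m=35
* Procedure 1: column proportions from svy: tab, then test p21 = p22
svy, subpop(if age<=`m'): tab diab bl, col se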
gen p21=.
gen p22=.
mat li e(b)
mat b1=e(b)
local a1=b1[1,3]
local b1=b1[1,4]
test p21=p22
local p1=r(p)
* Procedure 2: svy: reg on the subpop and on the bl==1 subgroup, combined with suest
svy, subpop(if age<=`m'): reg diab
estimates store a1
svy, subpop(if age<=`m' & bl==1): reg diab
estimates store a2
suest a1 a2
mat b2=e(b)
local a2=b2[1,1]
local b2=b2[1,2]
test [a1]_cons=[a2]_cons
local p2=r(p)
* Procedure 3: the same comparison using svy: logit, combined with suest
svy, subpop(if age<=`m'): logit diab
estimates store a3
svy, subpop(if age<=`m' & bl==1): logit diab
estimates store a4
suest a3 a4
mat b3=e(b)
local a3=invlogit(b3[1,1])
local b3=invlogit(b3[1,2])
test [a3]_cons=[a4]_cons
local p3=r(p)
* Display the estimates and p-values from the three procedures side by side
foreach v in a b p {
if "`v'"=="a" di "Diabetes" in gr " svy: tab  svy: reg  svy:logit" _c
if "`v'"=="a" di in gr _n " NB/All " _c
if "`v'"=="b" di in gr _n " Black  " _c
if "`v'"=="p" di in gr _n " p-value" _c
forv i=1/3 {
di in ye _col(`=`i'*10') ``v'`i'' _c
}
}
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




