Dear Statalist,
I am using Stata 9 and have a follow-up question on the above subject, which
was posted in October 2006. I am using a survey that asked whether a child had
been diagnosed with ADHD (8% said yes, 92% said no) and, if yes, whether the
child was taking medication. I want to know the appropriate way to code the
binary variable -medication-, which is currently missing for the 92% of the
sample without a diagnosis. As one option, I created -medication_nomissing-,
in which those 92% of observations are coded as 0:
gen medication_nomissing = medication
/* this essentially changes observations with missing medication to 0 */
replace medication_nomissing = 0 if adhd==0
/* this was done per Jeff's suggestion below and then implemented in model 3 */
gen valid = (adhd==1 & inlist(medication,0,1))
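One way to confirm that the recoding behaves as intended is to cross-tabulate the variables before fitting anything. This is only a sketch: it assumes nothing beyond the variables already defined above and does not bear on the survey design.

```stata
* Sketch only: cross-check the recoding with the variables defined above
* missing medication should appear only in the adhd==0 row
tabulate adhd medication, missing
* after recoding, the adhd==0 row should be entirely 0
tabulate adhd medication_nomissing, missing
* number of observations flagged as valid
count if valid
```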
I estimated the following four models. Models 1, 2, and 3 report the same
standard errors for the parameters and the same model F statistic, p-value,
and degrees of freedom (dof). Model 4 has slightly different standard errors
and model statistics. The N, N_sub, and dof for each model are shown below.
/* model 1 */ svy: logit medication $independent
/* model 2 */ svy, subpop(if adhd==1): logit medication $independent
/* model 3 */ svy, subpop(if adhd==1): logit medication $independent if valid
/* model 4 */ svy, subpop(if adhd==1): logit medication_nomissing $independent
               N   N_sub     dof
model 1     5424     n/a    5373
model 2     5424    5424    5373
model 3     5424    5424    5373
model 4    75916    5424   75865
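It seems that only model 4 is appropriate, since it is the only one whose
estimation starts from the entire sample (N = 75916) rather than from the
5424 observations with nonmissing -medication-. I'd appreciate your suggestions.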
Thanks,
Brent Fulton
http://www.stata.com/statalist/archive/2006-10/msg00740.html
Re: st: different approaches to use only observations that have nonmissing
----------------------------------------------------------------------------