Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: bootstrapping two-part model and svy
From: Sarah Beth Link <[email protected]>
To: [email protected]
Subject: st: bootstrapping two-part model and svy
Date: Fri, 17 Sep 2010 23:20:15 -0400
Dear Stata users,
I am using a two-part model in Stata 10. The first part of the model
is a logit that predicts the probability of a positive outcome and the
second part of the model is a GLM estimated only on observations with
positive outcomes (in this case expenditures). The final outcome I am
interested in is the average treatment effect of a dummy variable. I
would like to bootstrap the entire program to get the 95% confidence
interval on the average treatment effect. I was able to successfully
bootstrap the following:
bootstrap "two_part_glm_model" r(sate), reps(1000) dots noesample
where "two_part_glm_model" is an ado file with the following code:
program define two_part_glm_model, rclass
    * Part 1: logit for the probability of a positive outcome
    svy: logit $posy $x
    replace tx=0
    predict phat0, p
    replace tx=1
    predict phat1, p
    replace tx=txtrue
    * Part 2: gamma GLM on observations with positive expenditures
    svy: glm $y $x if asumexp>0, family(gamma) link(log)
    replace tx=0
    predict exphat0, mu
    replace tx=1
    predict exphat1, mu
    replace tx=txtrue
    * Combine both parts and average the individual treatment effects
    gen pred2pt0=phat0*exphat0
    gen pred2pt1=phat1*exphat1
    gen ate=pred2pt1-pred2pt0
    sum ate
    return scalar sate=r(mean)
    drop phat* exphat* pred2pt* ate*
end
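For reference, the quantity that two_part_glm_model is meant to return can be written out as below. This is only a sketch of the usual two-part estimand, assuming that the treatment dummy tx is one of the regressors in $x; the symbols \hat{p}_i and \hat{\mu}_i are notation introduced here and do not appear in the code.

```latex
\widehat{\mathrm{ATE}}
  = \frac{1}{N}\sum_{i=1}^{N}
    \Bigl[\hat{p}_i(1)\,\hat{\mu}_i(1) - \hat{p}_i(0)\,\hat{\mu}_i(0)\Bigr]
```

Here \hat{p}_i(t) is the logit-predicted probability of a positive outcome for observation i with tx set to t, and \hat{\mu}_i(t) is the GLM-predicted mean expenditure with tx set to t. In the program, pred2pt1 and pred2pt0 hold the two products and sate is the average of their difference.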
However, I would like to use the survey mean of the average treatment
effect (ate) in the program. Which brings me to my first question --
does it make sense to use bootstrap to get the confidence interval on
the average treatment effect and use svy: mean to take into account
the survey weighting? I understand that the bootstrap command
performs random sampling that does not take into account the survey
characteristics. If not, is my only other option to ignore the
survey characteristics?
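One commonly suggested way to combine the two is to let bootstrap resample whole PSUs within strata and to use the sampling weights directly in each estimation command. The following is a minimal sketch under stated assumptions, not a definitive implementation: strat, psu and wt are hypothetical names for the stratum, PSU and weight variables, two_part_pw is a hypothetical program name, and the prefix form of bootstrap is assumed to be available. A simple within-stratum bootstrap of PSUs can also give biased variance estimates when strata contain only a few PSUs, so the resulting interval should be read with that caveat.

```stata
* Sketch only: strat, psu and wt are hypothetical names; substitute the
* variables used in svyset. $posy, $y, $x, tx and txtrue are as above.
capture program drop two_part_pw
program define two_part_pw, rclass
    * Temporary variables are removed automatically when the program exits
    tempvar phat0 phat1 exphat0 exphat1 ate

    * Part 1: weighted logit for the probability of a positive outcome
    logit $posy $x [pweight = wt]
    replace tx = 0
    predict double `phat0', pr
    replace tx = 1
    predict double `phat1', pr
    replace tx = txtrue

    * Part 2: weighted gamma GLM on positive expenditures only
    glm $y $x if asumexp > 0 [pweight = wt], family(gamma) link(log)
    replace tx = 0
    predict double `exphat0', mu
    replace tx = 1
    predict double `exphat1', mu
    replace tx = txtrue

    * Weighted mean of the individual-level treatment effects
    generate double `ate' = `phat1'*`exphat1' - `phat0'*`exphat0'
    summarize `ate' [aweight = wt], meanonly
    return scalar sate = r(mean)
end

* Resample PSUs within strata; nodrop keeps the zero-expenditure
* observations that fall outside the GLM estimation sample
bootstrap sate = r(sate), reps(1000) strata(strat) cluster(psu) nodrop: two_part_pw
```

The weighted commands give the same point estimates as their svy: counterparts, while the strata() and cluster() options make each bootstrap replicate follow the sampling design rather than treating the data as a simple random sample.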
I attempted the following code, which is very similar to two_part_glm_model above:
bootstrap "svy_two_part" r(sate), reps(2) dots noesample
where "svy_two_part" is an ado file with the following code:
program define svy_two_part, rclass
    svy: logit $posy $x
    replace tx=0
    predict phat0, p
    replace tx=1
    predict phat1, p
    replace tx=txtrue
    svy: glm $y $x if asumexp>0, family(gamma) link(log)
    predict xbeta, mu
    replace tx=0
    predict exphat0, mu
    replace tx=1
    predict exphat1, mu
    replace tx=txtrue
    gen pred2pt0=phat0*exphat0
    gen pred2pt1=phat1*exphat1
    gen ate=pred2pt1-pred2pt0
    svy: mean ate
    estat sd
    return scalar sate=r(mean)
    drop phat* exphat* pred2pt* ate*
end
I get the following error message:
"phat0 already defined
command -> svy_two_part
an error occurred when command was executed on original dataset
please rerun bootstrap and specify the trace option to see a trace of the
commands bootstrap executed
r(110);"
This brings me to my second question -- do you know why I would be
getting the above error message? I used trace and could not figure it
out. I also ran the program separately within the do file and did not
find any errors.
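The message "phat0 already defined" means that a variable named phat0 already exists in the data when predict runs. One plausible cause is an earlier call that stopped before reaching the drop line and left its variables behind. Two further details in svy_two_part may contribute: xbeta is created but never dropped, and svy: mean is an estimation command that posts its estimate in e(b), so r(mean) may not hold the scalar the program expects. The sketch below is one possible revision under those assumptions, with svy_two_part2 as a hypothetical name; it has not been checked against the actual data.

```stata
* Sketch only: svy_two_part2 is a hypothetical name; the data are assumed
* to be svyset already, with $posy, $y, $x, tx and txtrue as in the post.
capture program drop svy_two_part2
program define svy_two_part2, rclass
    * Temporary variables and names are dropped automatically when the
    * program exits, even after an error, so reruns start clean.
    tempvar phat0 phat1 exphat0 exphat1 ate
    tempname b

    svy: logit $posy $x
    replace tx = 0
    predict double `phat0', pr
    replace tx = 1
    predict double `phat1', pr
    replace tx = txtrue

    svy: glm $y $x if asumexp > 0, family(gamma) link(log)
    replace tx = 0
    predict double `exphat0', mu
    replace tx = 1
    predict double `exphat1', mu
    replace tx = txtrue

    generate double `ate' = `phat1'*`exphat1' - `phat0'*`exphat0'

    * svy: mean leaves its point estimate in e(b), not in r(mean)
    svy: mean `ate'
    matrix `b' = e(b)
    return scalar sate = `b'[1,1]
end
```

Because nothing permanent is generated, no drop line is needed, and the program can be handed to bootstrap in the same way as the earlier one. If stale variables from a previous failed run are still in memory, removing them once (for example with capture drop) before rerunning should clear the original error.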
I will greatly appreciate your comments, suggestions, and advice.
Thank you very much,
Sarah Beth