st: DHS Womens Data Survey Setup
From: melissa daniels <[email protected]>
To: [email protected]
Subject: st: DHS Womens Data Survey Setup
Date: Sat, 16 Jul 2011 00:23:09 -0500
Hello fellow Stata users,
I am working on an analysis of DHS women's data (Ghana, 2008) using
Stata 11.2. My sample includes only women with infants in the 0-23 month
age range. DHS data are collected as a two-stage stratified sample of
households. I want to identify all the survey variables I may need and
construct the dataset properly for a survey analysis. I am still
constructing the dataset, but am planning to use the following variables
(as defined in DHS recode 5) and svyset statement.
* v021: enumeration areas (primary sampling units) for the survey
gen psu = v021
* v022: pairings or groupings of primary sampling units used in
* Taylor series expansion
gen strata1 = v022
* v023: sample domain, or the basic geographic units within which the
* sample was self-weighted
gen strata2 = v023
* v005: probability weights for the sample (decimal correction as
* directed by DHS)
gen m_weight = v005/10^6
svyset psu [pweight=m_weight], strata(strata1)
I have a couple questions:
1) I understand variance estimation is based on the Taylor series
expansion method, so I assume v022 (strata1 above) is the strata variable
I am most interested in. In what cases would the sample domain variable
v023 be of use to me? Is it important for survey estimation?
2) I believe I need data on the full sample of women in order to estimate
correct variances for the subset of women I am interested in. Does that
mean I need to create my dataset with all women, or with all individuals
in the larger dataset? Or is my dataset complete, since the subsample
should be evenly dispersed throughout regions? If I need a larger
dataset, do I just use a variable to flag women with children of the
correct age for my subsample, and then restrict all estimation commands
to the subsample using an if statement?
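To make question 2 concrete, here is a rough sketch of the two approaches I
am weighing. The names child_age_months and outcome_var are placeholders I
made up for illustration, not DHS recode variables; psu, strata1 and
m_weight are the variables defined above. My understanding is that svy's
subpop() option is the one intended for subpopulation estimation, but
please treat this as a sketch under that assumption rather than a settled
setup.

```stata
* Sketch only. child_age_months and outcome_var are hypothetical names,
* not DHS recode variables; substitute the real ones.

* Flag women with a child aged 0-23 months (1 = in subsample, 0 = not).
* inrange() returns 0 when child_age_months is missing.
gen byte insample = inrange(child_age_months, 0, 23)

* Declare the design on the full women's file, not on the subset.
svyset psu [pweight=m_weight], strata(strata1)

* Option A: restrict the estimation sample with an if qualifier.
svy: mean outcome_var if insample == 1

* Option B: keep all observations and name the subpopulation.
svy, subpop(insample): mean outcome_var
```

Both options assume the dataset still contains every woman in the survey,
with insample marking the ones I care about.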
3) I am interested in looking at biomarkers on a separate subsample who
consented to a blood draw. However, there are no weights that I can
locate for this subsample. Do I use the same weights as above, or do I
need to create some sort of weight using the rate of consent?
4) I haven't been able to find any variables related to the finite
population correction (FPC), likely because the sampling fraction is
small for DHS. According to my understanding, FPC is not a concern for
this analysis - please correct me if I'm wrong.
Thank you sincerely,
Melissa Daniels