Thanks, Stas Kolenikov.
As you advised, I have added the variable labels and summary statistics
for the relevant variables.
Hi, I am using the following commands to set up DHS (Demographic and
Health Survey) data for analysis:
gen psu = v021
gen strata = v022
gen sampwt = v005/1000000 //as per DHS instruction//
svyset psu [pw = sampwt], strata(strata)
Where,
v005 sample weight
v021 primary sampling unit
v022 sample stratum number
. sum v005 v021 v022

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        v005 |     11440     1000000    479282.7      55728    2707592
        v021 |     11440    223.3237    163.2414          1        550
        v022 |     11440    89.80385    51.64129          1        177
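
As a quick sanity check on this setup, something like the following could be run after the svyset command above. It is only a sketch and assumes the variables psu, strata and sampwt were created exactly as shown; it does not replace the DHS documentation.

```stata
* Sketch: basic checks on the design variables created above
assert !missing(psu, strata, sampwt)   // no missing design information
summarize sampwt                       // mean should be 1, since v005 averages 1,000,000

* Each PSU should belong to exactly one stratum
bysort psu (strata): assert strata[1] == strata[_N]

* Strata and PSU counts under the current svyset
svydescribe
```

The svydescribe output also lists the number of PSUs in each stratum, so any stratum with a single PSU is visible there.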
I have two questions:
1. Is this the right way to set up the data?
2. In the data set I am using, v022 is missing for one year.
Which other variable(s) could I use in place of v022?
On Mon, Jul 20, 2009 at 9:52 AM, Stas Kolenikov<[email protected]> wrote:
> Nikh, this is not terribly informative -- give the labels of the
> variables. (As the FAQ of the list says, don't assume that everybody
> knows your data and your literature as well as you do.) You may not
> like the idea of having weights like 10,000 if you are used to thinking
> about the weight variable as something close to 1, or maybe something
> close to 1/n. But if you want to estimate the total number of people
> in the country that don't have access to clean water, those 10,000
> weights are the right ones to use: the weight of 1 is going to give
> you the total number of people in the sample that don't have access to
> clean water, and you cannot put that sort of stuff into your country
> report. Check DHS documentation again on the survey settings.
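
The point about the scale of the weights can be illustrated with a short sketch. Here nowater is a hypothetical 0/1 indicator (not an actual DHS variable name), and psu, strata and sampwt are the variables from the setup quoted in this thread. Whether either scaling gives population totals depends on how v005 is constructed, so the DHS documentation remains the authority.

```stata
* Sketch only: nowater is a hypothetical 0/1 indicator, not a DHS variable name.
* Means and proportions do not depend on the scale of the weights:
svyset psu [pw = sampwt], strata(strata)
svy: mean nowater
svyset psu [pw = v005], strata(strata)
svy: mean nowater              // same point estimate and standard error

* Totals do depend on the scale: with v005 the estimated total is
* 1,000,000 times the total obtained with sampwt.
svy: total nowater

* Restore the original setup
svyset psu [pw = sampwt], strata(strata)
```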
>
> To my knowledge, stratification does not change in DHS from year to
> year, so you can keep strata ID from other years if you can match the
> clusters. If you have any new PSUs, it may not be possible to
> determine where they are coming from though; you could create a
> separate stratum for all of them. Finally, you can ignore
> stratification altogether, and lose some precision/efficiency with
> that.
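
A minimal sketch of carrying the strata IDs over from another year, under loudly stated assumptions: dhs_other_year and dhs_this_year are hypothetical file names, v021 is assumed to identify the same clusters in both surveys, and 9999 is an arbitrary code for the catch-all stratum. The merge m:1 syntax assumes Stata 11 or later.

```stata
* Sketch only. dhs_other_year and dhs_this_year are hypothetical file names,
* and v021 is assumed to identify the same clusters in both surveys.
use dhs_other_year, clear
keep v021 v022
duplicates drop
isid v021                      // fails unless each PSU sits in one stratum
tempfile stratamap
save `stratamap'

use dhs_this_year, clear
capture drop v022              // drop the empty v022, if the variable exists
merge m:1 v021 using `stratamap', keep(master match)
replace v022 = 9999 if _merge == 1   // new PSUs go into one catch-all stratum
drop _merge

gen psu    = v021
gen strata = v022
gen sampwt = v005/1000000
svyset psu [pw = sampwt], strata(strata)

* Alternative: ignore stratification, at some cost in precision
* svyset psu [pw = sampwt]
```

If the cluster numbering was reassigned between surveys, the match on v021 is not meaningful and the unstratified svyset in the last line is the safer fallback.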
>
> On Mon, Jul 20, 2009 at 10:21 AM, nikh 2000<[email protected]> wrote:
>> Hi, I am using the following commands to set up DHS (Demographic and
>> Health Survey) data for analysis
>>
>> gen psu = v021
>> gen strata = v022
>> gen sampwt = v005/1000000
>>
>> svyset psu [pw = sampwt], strata(strata)
>>
>> I have two questions:
>>
>> 1. Is this the right way to set up the data?
>> 2. In the data set I am using, v022 is missing for one year.
>> Which other variable(s) could I use in place of v022?
>
>
>
> --
> Stas Kolenikov, also found at http://stas.kolenikov.name
> Small print: I use this email account for mailing lists only.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>