Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Analysis on subset of Demographic and Health Survey (DHS) data
From: french Smith <[email protected]>
To: [email protected]
Subject: Re: st: Analysis on subset of Demographic and Health Survey (DHS) data
Date: Sun, 27 Oct 2013 14:00:37 -0400
Dear Friedrich,
Apologies for not having noted my Stata version, and for other
omissions of important info.
My Stata output is below.
I use version 10.
I expected 8,081 cases because page 14 of the report says there were
8,081 children aged 0-4.
Thank you for reminding me that Stata is case sensitive - I'm happy
that there is an easy fix to that issue.
When you analyse DHS, do you analyse the same file type I was using? I
saw that in one of the posts you note you prefer using the flat file.
Best,
French
. set mem 450m
(460800k)
.
. set matsize 800
.
. use TZKR62DT/TZKR62FL.DTA
. *v022 is sample stratum number
. gen stratid = v022
.
. *v021 is PSU
. gen psu = v021
.
. *v005 is sample weight
. gen weight = v005/1000000
.
. svyset psu [pw=weight], strata(stratid)
      pweight: weight
          VCE: linearized
  Single unit: missing
     Strata 1: stratid
         SU 1: psu
        FPC 1: <zero>
. *Just to compare totals. Should get 8,081 for Tanzania DHS 2010 not 8,023!
. tab stratid
    stratid |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         30        0.37        0.37
          2 |         83        1.03        1.41
          3 |         38        0.47        1.88
          4 |         41        0.51        2.39
          5 |         63        0.79        3.18
          6 |         58        0.72        3.90
          7 |        194        2.42        6.32
          8 |         27        0.34        6.66
          9 |         45        0.56        7.22
         10 |         39        0.49        7.70
         11 |         40        0.50        8.20
         12 |         46        0.57        8.77
         13 |         57        0.71        9.49
         14 |         39        0.49        9.97
         15 |         44        0.55       10.52
         16 |         40        0.50       11.02
         17 |         36        0.45       11.47
         18 |         16        0.20       11.67
         19 |         83        1.03       12.70
         20 |         54        0.67       13.37
         21 |         27        0.34       13.71
         23 |         15        0.19       13.90
         24 |        278        3.47       17.36
         25 |         34        0.42       17.79
         26 |         84        1.05       18.83
         27 |        266        3.32       22.15
         28 |        176        2.19       24.34
         29 |        112        1.40       25.74
         30 |        181        2.26       27.99
         31 |        188        2.34       30.34
         32 |        192        2.39       32.73
         33 |         13        0.16       32.89
         34 |        200        2.49       35.39
         35 |        154        1.92       37.31
         36 |        213        2.65       39.96
         37 |        189        2.36       42.32
         38 |        233        2.90       45.22
         39 |        312        3.89       49.11
         40 |        445        5.55       54.66
         41 |        310        3.86       58.52
         42 |        292        3.64       62.16
         43 |        488        6.08       68.24
         44 |        300        3.74       71.98
         45 |        363        4.52       76.51
         46 |        398        4.96       81.47
         47 |        264        3.29       84.76
         48 |        327        4.08       88.83
         49 |        239        2.98       91.81
         50 |         56        0.70       92.51
         51 |        331        4.13       96.63
         52 |        270        3.37      100.00
------------+-----------------------------------
      Total |      8,023      100.00
.
. *keep if complete response
. keep if v015==1
(0 observations deleted)
. *keep if child lives with respondent
. keep if b9==0
(824 observations deleted)
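
A note on the two -keep if- steps above: dropping observations before running -svy- commands can give incorrect standard errors, because the variance estimator needs the full sample design. A minimal sketch of the alternative, the subpop() option, follows. It assumes the data are loaded and -svyset- as in the log, that no observations have been dropped, and that V465 is stored in lowercase as v465. The name insub is a hypothetical indicator introduced here for illustration only.

```stata
* Sketch only: run after -svyset-, with no prior -keep if- commands.

* Confirm the variables exist under these (lowercase) names.
describe v015 b9 v465

* insub is a hypothetical name: 1 = complete interview and child lives
* with the respondent, 0 = everything else (including missing b9).
generate byte insub = (v015 == 1 & b9 == 0)

* Estimate within the subpopulation while keeping all observations,
* so the standard errors reflect the full survey design.
svy, subpop(insub): tabulate v465
```

Observations with missing b9 get insub = 0 because the comparison b9 == 0 is false for them, so they stay in the data but fall outside the subpopulation.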
On Sun, Oct 27, 2013 at 9:48 AM, Friedrich Huebler <[email protected]> wrote:
> French,
>
> I have to retract my comment about the filename. The -use- command is
> valid if you have a folder TZKR62DT that contains the file
> TZKR62FL.DTA.
>
> Sorry for my mistake,
>
> Friedrich
>
> On Sun, Oct 27, 2013 at 9:22 AM, Friedrich Huebler <[email protected]> wrote:
>> French,
>>
>> Please have a look at the Statalist FAQ (link at the bottom of this
>> message). Some excerpts:
>>
>> "The current version of Stata is 13.0. Please specify if you are using
>> an earlier version"
>>
>> The fact that you use -set mem- indicates that you have Stata 11 or older.
>>
>> "Say exactly what you typed and exactly what Stata typed (or did) in response."
>>
>> I assume that you did not type "use TZKR62DT/TZKR62FL.DTA" because
>> that is not a valid filename in most operating systems. You also don't
>> explain why the command "keep if B9==0" is not recognized because you
>> omitted the output from Stata.
>>
>> Now to your questions. The data for children under 5 from the Tanzania
>> DHS 2010 has only 8023 observations. It is not clear why you expect
>> 8081 observations. The survey report, which you cite as reference, has
>> 480 pages.
>>
>> The command -keep if B9==0- will yield an error message because the
>> variable B9 doesn't exist in the data. Stata is case-sensitive and the
>> correct variable name is b9.
>>
>> Friedrich
>>
>> On Sat, Oct 26, 2013 at 12:53 PM, french Smith <[email protected]> wrote:
>>> Dear STATA crowd,
>>>
>>> I wish to analyze the Demographic and Health Surveys (DHS) data on
>>> children under five.
>>>
>>> I have set up my analysis as follows, using the Tanzania DHS 2010 as a
>>> starting point. But I have two questions:
>>>
>>> 1. Why do I have 8,023 cases not the 8,081 that the report (the report
>>> and dataset are available at
>>> http://www.measuredhs.com/what-we-do/survey/survey-display-345.cfm)
>>> says exist?
>>>
>>> 2. How do I restrict the analysis to V465, which was a question asked
>>> only of respondents with a youngest child under five living with them?
>>>
>>> Maybe questions, sorry, but my STATA mind needs jogging!
>>>
>>> Thanks!
>>>
>>> French
>>>
>>>
>>> set mem 450m
>>>
>>> set matsize 800
>>>
>>> use TZKR62DT/TZKR62FL.DTA
>>>
>>> *v022 is sample stratum number
>>> gen stratid = v022
>>>
>>> *v021 is PSU
>>> gen psu = v021
>>>
>>> *v005 is sample weight
>>> gen weight = v005/1000000
>>>
>>> svyset psu [pw=weight], strata(stratid)
>>>
>>> *Just to compare totals. Should get 8,081 for Tanzania DHS 2010 not 8,023!
>>> tab stratid
>>>
>>> *keep if complete response
>>> keep if v015==1
>>>
>>> *keep if child lives with respondent. But this command isn’t recognized!
>>> keep if B9==0
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/