Re: st: merging the newly created height and weight DHS data sets with child data set
From: Friedrich Huebler <[email protected]>
To: [email protected]
Subject: Re: st: merging the newly created height and weight DHS data sets with child data set
Date: Thu, 22 Aug 2013 11:38:32 -0400

Donela,
The variables v001, v002 and b16 do not uniquely identify observations
in the child data from the Ethiopia DHS 2005.

. duplicates report v001 v002 b16

Duplicates in terms of v001 v002 b16

--------------------------------------
   copies | observations       surplus
----------+---------------------------
        1 |         9672             0
        2 |          148            74
        3 |           21            14
        4 |           20            15
--------------------------------------

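To see which observations are behind such duplicates, something like the following sketch may help. It runs on made-up toy data rather than the DHS file, and the variable name dup is only an illustration.

```stata
* Toy data standing in for the child file; the values are invented for illustration.
clear
input v001 v002 b16
1 1 2
1 1 3
1 2 0
1 2 0
1 3 .
1 3 .
2 1 4
end

* -isid- exits with an error when the key is not unique (or contains missing values);
* -capture- stores the return code in _rc so a do-file can continue.
capture isid v001 v002 b16
display "isid return code: " _rc

* Flag every observation that shares its key with at least one other observation.
duplicates tag v001 v002 b16, generate(dup)

* Show which values of b16 account for the duplicated keys.
tabulate b16 if dup > 0, missing
```

In the toy data the duplicated keys all have b16 equal to 0 or missing; whether that is the whole story in a real file has to be checked in the same way.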
Some children are not included in the household member file and
therefore have the value 0 or a missing value in b16. You can correct
this with the commands below.
. drop if b16==0 | b16==.
(1006 observations deleted)

. duplicates report v001 v002 b16

Duplicates in terms of v001 v002 b16

--------------------------------------
   copies | observations       surplus
----------+---------------------------
        1 |         8855             0
--------------------------------------

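Once the key is unique, the 1:1 merge should go through. A minimal sketch on toy data follows; the temporary file anthro and the variable hw_z are hypothetical stand-ins for the merged household member and anthropometry file and its contents.

```stata
* Hypothetical stand-in for the member + anthropometry file; hw_z is a made-up variable.
clear
input v001 v002 b16 hw_z
1 1 2 -1.2
1 1 3  0.4
2 1 4 -0.8
end
tempfile anthro
save `anthro'

* Toy stand-in for the child file, including children without a household line number.
clear
input v001 v002 b16
1 1 2
1 1 3
1 2 0
1 2 0
1 3 .
2 1 4
end

* Drop children with no line number in the household member file.
drop if b16 == 0 | b16 == .

* Confirm the key is now unique; -isid- stops with an error if it is not.
isid v001 v002 b16

* The 1:1 merge no longer complains about the master data.
merge 1:1 v001 v002 b16 using `anthro'
tabulate _merge
```

With a real household member file as the using data, most household members are not children in the master file, so adding the option keep(master match) to -merge- may be convenient.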
Friedrich
On Thu, Aug 22, 2013 at 10:43 AM, Eric A. Booth <[email protected]> wrote:
> <>
>
> Try opening each of the files and checking whether the variables you
> are merging on uniquely identify the records (we can already guess
> that they do not, but now you want to find out why). You'll have to
> investigate your data and make the call about how you deal with your
> merge variables.
>
> If there are many records per identifier (or per combination of
> variables that together identify the records you are merging) in the
> 'using' datafile, decide: do you want to merge all of them in? Do you
> want to aggregate/collapse them first in some way? Do you want to
> reshape them from long to wide before merging them in?
>
> All of this requires fully understanding your data structure based on
> examining the data and the codebook and the extensive documentation
> that DHS provides (I'm not familiar with these, but I've seen the
> website and there is a lot there on this topic).
>
> To check if the variables you are merging on are unique identifiers,
> start with commands like -isid- and -duplicates tag- and then
> investigate the duplicates to figure out what you want to do with them
> for the merge.
>
> Also, there is a lot of information on merging DHS data in Stata on
> the internet, e.g.,
>
> http://www.cpc.unc.edu/research/tools/data_analysis/statatutorial/example9
> http://userforum.measuredhs.com/index.php?t=rview&goto=217&th=49
> http://statalist.1588530.n2.nabble.com/st-DHS-Ghana-merging-question-td3128338.html
> http://www.stata.com/statalist/archive/2009-11/msg00240.html
>
> - Eric
>
>
>
>
> On Thu, Aug 22, 2013 at 9:26 AM, Donela Besada <[email protected]> wrote:
>> Hi Nick,
>>
>> Thanks for the response. To be honest, I am not sure, but I am
>> assuming that they have to be, since they are picking the same
>> variables in each of the data sets to merge on, they are just called
>> something different in each data set.
>>
>> I don't really know what other options to try. I am trying to follow
>> the instructions, but they are not very clear:
>>
>> I have pasted here the full instructions:
>>
>> HEIGHT AND WEIGHT – WHO CHILD GROWTH STANDARD FILES
>>
>> The Height and Weight – WHO Child Growth Standard files contain the
>> standard deviations for the height for age, weight for age, weight for
>> height, and BMI according to the new WHO definition. These data are
>> available in the standard distributed DHS-V data files but not for
>> previous recode definitions. The new WHO scores for recode
>> definitions 4 and below need to be merged with the corresponding
>> standard recode files for analysis purposes.
>>
>> In the early phases of DHS, the Height and Weight data was collected
>> for children of interviewed women; but in the last two rounds the data
>> were collected for all children in the households. Variable HWLEVEL
>> in this file indicates whether the anthropometry data was collected at
>> the household or the woman’s level. Code 1 in that variable indicates
>> that height and weight was collected at the household level and code 2
>> indicates that it was collected at the woman’s level.
>>
>> The Height and Weight data collected for children of interviewed women
>> can be merged with either their mothers or the children themselves as
>> follows:
>>
>> Use HWCASEID from the Height and Weight file with CASEID from the
>> Individual Recode to merge it with the mother’s data.
>>
>> Use HWCASEID and HWLINE, from the Height and Weight file, with CASEID
>> and MIDX, from the Children's recode file to merge it with the
>> Children’s data.
>>
>> Use HWCASEID and HWLINE, from the Height and Weight file, with CASEID
>> and BIDX, from the Births Recode file to merge it with the Births’
>> data.
>>
>> The Height and Weight data collected for children at the household
>> level can be matched to the households, to the members, to the
>> mothers, or to the children.
>>
>> Use HWHHID from the Height and Weight data file with HHID from the
>> Household Recode file to merge it with the household data where the
>> child was measured.
>>
>> Use HWHHID and HWLINE from the Height and Weight file with HHID and
>> HVIDX from the Members Recode file to merge it with the household
>> member data.
>>
>> Once the Height and weight data are merged to the household members’
>> file, the resulting file could then be merged with the mother’s and
>> children’s file, as follows:
>>
>> Use HV001 (cluster number) plus HV002 (household number) and HC60
>> (mother’s line number) from the constructed file and merge it with the
>> corresponding V001, V002 and V003 from the Individual Recode file.
>>
>> Use HV001 (cluster number) plus HV002 (household number) and HWLINE
>> from the member’s constructed file (or the one resulting from the
>> previous merge), and merge it with the corresponding V001, V002, B16
>> (child’s line number in the household) in the Children Recode file or
>> in the Births Recode file.
>>
>>
>> On Thu, Aug 22, 2013 at 4:14 PM, Nick Cox <[email protected]> wrote:
>>> The devil is in the details. This is not my field at all but are _all_
>>> these -merge-s really 1:1?
>>> Nick
>>> [email protected]
>>>
>>>
>>> On 22 August 2013 15:10, Donela Besada <[email protected]> wrote:
>>>> Hello, thank you for your response.
>>>>
>>>> So this is what I did:
>>>> I first opened the height and weight data set and renamed the two
>>>> variables I need to merge on so that they correspond to the name in
>>>> the household member data set
>>>>
>>>> gen hhid=hwhhid
>>>> gen hvidx=hwline
>>>>
>>>> Then I opened the household member data set and I did merge it with
>>>> the anthropometric data set
>>>> Type of merge: one to one on key variables
>>>>
>>>> merge 1:1 hvidx hhid using "/Users/donelabesada/Desktop/IHSS/Ethiopia/ETHW51DT_2005 anthro/ETHW51FL.DTA"
>>>>
>>>>     Result                           # of obs.
>>>>     -----------------------------------------
>>>>     not matched                        62,591
>>>>         from master                    62,591  (_merge==1)
>>>>         from using                          0  (_merge==2)
>>>>
>>>>     matched                             4,949  (_merge==3)
>>>>     -----------------------------------------
>>>>
>>>>
>>>> After that I renamed the variables I am instructed to merge on to
>>>> reflect the same variables in the child data set:
>>>>
>>>> rename hv001 v001
>>>> rename hv002 v002
>>>> rename hvidx b16
>>>>
>>>> I then saved that file and opened the child data set. I then tried to
>>>> merge this child data set with my newly created merged file, again one
>>>> to one on key variables:
>>>> merge 1:1 v001 v002 b16 using "/Users/donelabesada/Desktop/ETPR51FL_householdmember_anthro_merge.dta"
>>>>
>>>> When I did this I got the below error:
>>>>
>>>> variables v001 v002 b16 do not uniquely identify observations in the master data
>>>>
>>>> I am using Ethiopia 2005 data for this merging.
>>>>
>>>> I would appreciate any help anyone can offer.
>>>>
>>>> Thank you very much.
>>>>
>>>> Warm wishes,
>>>> Donela
>>>>
>>>> On Thu, Aug 22, 2013 at 4:01 PM, Nick Cox <[email protected]> wrote:
>>>>> Please show the -merge- command you typed to increase the chance of a
>>>>> good answer.
>>>>> Nick
>>>>> [email protected]
>>>>>
>>>>>
>>>>> On 22 August 2013 14:55, Donela Besada <[email protected]> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I was wondering if anyone could help me. I am trying to follow the WHO
>>>>>> instructions on how to merge the new height and weight data sets with the
>>>>>> original child data sets. I am able to successfully merge the height
>>>>>> and weight data with the household member data. I am having some
>>>>>> trouble with the second step of merging that data set to the child
>>>>>> data set. When I try to merge on the variables: v001, v002 and b16, I
>>>>>> get the following error:
>>>>>>
>>>>>> "variables v001 v002 b16 do not uniquely identify observations in the
>>>>>> master data"
>>>>>>
>>>>>> Has anyone successfully done this and could you please help if so?
>>>>>>
>>>>>>
>>>>>> Thank you very much
>>>>>> Donela
>>>>>>
>>>>>> WHO instructions:
>>>>>>
>>>>>> The Height and Weight data collected for children at the household
>>>>>> level can be matched to the households, to the members, to the
>>>>>> mothers, or to the children.
>>>>>>
>>>>>> Use HWHHID and HWLINE from the Height and Weight file with HHID and
>>>>>> HVIDX from the Members Recode file to merge it with the household
>>>>>> member data.
>>>>>>
>>>>>> Once the Height and weight data are merged to the household members’
>>>>>> file, the resulting file could then be merged with the mother’s and
>>>>>> children’s file, as follows:
>>>>>>
>>>>>> Use HV001 (cluster number) plus HV002 (household number) and HWLINE
>>>>>> from the member’s constructed file (or the one resulting from the
>>>>>> previous merge), and merge it with the corresponding V001, V002, B16
>>>>>> (child’s line number in the household) in the Children Recode file or
>>>>>> in the Births Recode file.