Re: st: how to generate parent variables matched to their children in household level data set?
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: how to generate parent variables matched to their children in household level data set?
Date: Sat, 23 Feb 2013 01:54:16 +0000
Note that I wrote that FAQ some years ago. Now I wonder why I didn't
approach it as a -merge- problem: create one dataset with fathers'
data and one with mothers' data, then -merge- using those. There is still
some fiddling around. This all goes with the simple idea that we have
favourite tools.
Nick
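
The -merge- idea mentioned above could be sketched as follows. This is only an illustration built on the example data from this thread, not code from the FAQ or from the original posts. The tempfile names (everyone, mothers, fathers) are made up for the sketch, and for simplicity each lookup file holds every person rather than only actual mothers or fathers; that is harmless here because only IDs named in ID_M or ID_F can match.

```stata
* Sketch only: parents' properties via -merge-, using the thread's example data
clear
input str10 ID_F str10 ID_M double BMI str10 ID str8 ID_fam byte Emp
""           ""           26.501 "A901963701" "A9019637" 1
""           ""           20.483 "A901963702" "A9019637" 1
"A901963701" "A901963702" 20.924 "A901963703" "A9019637" .
""           ""           27.209 "A901963801" "A9019638" 1
""           ""           31.733 "A901963802" "A9019638" .
"A901963801" "A901963802" 18.018 "A901963803" "A9019638" .
"A901963801" "A901963802" 19.054 "A901963804" "A9019638" .
end

tempfile everyone mothers fathers
save `everyone'

* Lookup file keyed on ID_M: each person's own ID, BMI and Emp, renamed
keep ID BMI Emp
rename (ID BMI Emp) (ID_M BMI_M Emp_M)
save `mothers'

* Lookup file keyed on ID_F: each person's own ID and Emp, renamed
use `everyone', clear
keep ID Emp
rename (ID Emp) (ID_F Emp_F)
save `fathers'

* Children are the observations that name at least one parent
use `everyone', clear
keep if !missing(ID_M) | !missing(ID_F)

* Many children can share one parent, hence m:1
merge m:1 ID_M using `mothers', keep(master match) nogenerate
merge m:1 ID_F using `fathers', keep(master match) nogenerate

list ID ID_fam BMI BMI_M Emp_M Emp_F
```

The renaming is the "fiddling around": the parent's own ID has to become the key that the child's record carries (ID_M or ID_F), and the parent's BMI and Emp need new names so that they do not collide with the child's own variables.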
On Sat, Feb 23, 2013 at 1:50 AM, Nick Cox <[email protected]> wrote:
> That's an allusion to my FAQ
>
> FAQ . . Creating variables recording prop. of the other members of a group
> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
> 4/05 How do I create variables summarizing for each
> individual properties of the other members of a
> group?
>
> http://www.stata.com/support/faqs/data-management/creating-variables-recording-properties/
>
> I don't know why you report problems. The code suggested there works
> as intended. Here it is again run on your example data:
>
> . by ID_fam (ID), sort: gen pid = _n
>
> . gen byte fid = .
> (7 missing values generated)
>
> . gen byte mid = .
> (7 missing values generated)
>
> . summarize pid, meanonly
>
> . forval i = 1 / `r(max)' {
> 2. by ID_fam: replace fid = `i' if ID_F == ID[`i'] & !missing(ID_F)
> 3. by ID_fam: replace mid = `i' if ID_M == ID[`i'] & !missing(ID_M)
> 4. }
> (3 real changes made)
> (0 real changes made)
> (0 real changes made)
> (3 real changes made)
> (0 real changes made)
> (0 real changes made)
> (0 real changes made)
> (0 real changes made)
>
> . l
>
>      +----------------------------------------------------------------------------------+
>      |       ID_F         ID_M      BMI           ID     ID_fam   Emp   pid   fid   mid |
>      |----------------------------------------------------------------------------------|
>   1. |                           26.501   A901963701   A9019637     1     1     .     . |
>   2. |                           20.483   A901963702   A9019637     1     2     .     . |
>   3. | A901963701   A901963702   20.924   A901963703   A9019637     .     3     1     2 |
>   4. |                           27.209   A901963801   A9019638     1     1     .     . |
>   5. |                           31.733   A901963802   A9019638     .     2     .     . |
>      |----------------------------------------------------------------------------------|
>   6. | A901963801   A901963802   18.018   A901963803   A9019638     .     3     1     2 |
>   7. | A901963801   A901963802   19.054   A901963804   A9019638     .     4     1     2 |
>      +----------------------------------------------------------------------------------+
>
> Using the same logic, we copy parents' employment and mothers' BMI as desired:
>
> . gen BMI_M = .
> (7 missing values generated)
>
> . gen Emp_M = .
> (7 missing values generated)
>
> . gen Emp_F = .
> (7 missing values generated)
>
> . summarize pid, meanonly
>
> . forval i = 1 / `r(max)' {
> 2. by ID_fam: replace BMI_M = BMI[`i'] if ID_M == ID[`i'] & !missing(ID_M)
> 3. by ID_fam: replace Emp_M = Emp[`i'] if ID_M == ID[`i'] & !missing(ID_M)
> 4. by ID_fam: replace Emp_F = Emp[`i'] if ID_F == ID[`i'] & !missing(ID_F)
> 5. }
> (0 real changes made)
> (0 real changes made)
> (3 real changes made)
> (3 real changes made)
> (1 real change made)
> (0 real changes made)
> (0 real changes made)
> (0 real changes made)
> (0 real changes made)
> (0 real changes made)
> (0 real changes made)
> (0 real changes made)
>
>
> Here are the results:
>
> . l
>
>      +-----------------------------------------------------------------------------------------------+
>      |       ID_F         ID_M      BMI           ID     ID_fam   Emp   pid    BMI_M   Emp_M   Emp_F |
>      |-----------------------------------------------------------------------------------------------|
>   1. |                           26.501   A901963701   A9019637     1     1        .       .       . |
>   2. |                           20.483   A901963702   A9019637     1     2        .       .       . |
>   3. | A901963701   A901963702   20.924   A901963703   A9019637     .     3   20.483       1       1 |
>   4. |                           27.209   A901963801   A9019638     1     1        .       .       . |
>   5. |                           31.733   A901963802   A9019638     .     2        .       .       . |
>      |-----------------------------------------------------------------------------------------------|
>   6. | A901963801   A901963802   18.018   A901963803   A9019638     .     3   31.733       .       1 |
>   7. | A901963801   A901963802   19.054   A901963804   A9019638     .     4   31.733       .       1 |
>      +-----------------------------------------------------------------------------------------------+
>
> Nick
>
> On Fri, Feb 22, 2013 at 10:45 PM, Haena Lee <[email protected]> wrote:
>
>> I am investigating the relationship between maternal employment
>> status and the prevalence of childhood obesity using a nationally
>> representative data set (KNHANES). Suppose I have ID (all
>> observations, including both children and parents), ID_fam (household
>> indicator), ID_F (father's ID), ID_M (mother's ID), BMI (body mass
>> index) and finally Emp (employment status: 1 if employed; 0 if
>> not employed), as follows:
>>
>> ID_F         ID_M         BMI      ID           ID_fam     Emp
>>                           26.501   A901963701   A9019637   1
>>                           20.483   A901963702   A9019637   1
>> A901963701   A901963702   20.924   A901963703   A9019637   .
>>                           27.209   A901963801   A9019638   1
>>                           31.733   A901963802   A9019638   .
>> A901963801   A901963802   18.018   A901963803   A9019638   .
>> A901963801   A901963802   19.054   A901963804   A9019638   .
>>
>> Ultimately, I would like to have a data set like the following:
>>
>> ID (children)   ID_fam     BMI      Mom's BMI   Mom's Emp   Dad's Emp
>> A901963703      A9019637   20.924   20.483      1           1
>> A901963803      A9019638   18.018   31.733      .           1
>> A901963804      A9019638   19.054   31.733      .           1
>>
>> Given this, my question is 1) how to map the properties of other
>> family members to children within each household, using a loop, or 2)
>> how to generate an indicator of mother (1 if ID == ID_M; 0 otherwise)?
>> I found Nick Cox's helpful example and imitated it as follows:
>>
>> by ID_fam (ID), sort: gen pid = _n
>> gen byte fid = .
>> gen byte mid = .
>> summarize pid, meanonly
>> forval i = 1 / `r(max)' {
>> by ID_fam: replace fid = `i' if ID_F == ID[`i'] & !missing(ID_F)
>> by ID_fam: replace mid = `i' if ID_M == ID[`i'] & !missing(ID_M)
>> }
>>
>> But it produced nothing except missing values. Please advise.
>> Thank you in advance for any help.