Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: how to generate parent variables matched to their children in household level data set?
From: Nick Cox <[email protected]>
To: [email protected]
Date: Sat, 23 Feb 2013 01:50:08 +0000
That's an allusion to my FAQ
FAQ . . Creating variables recording prop. of the other members of a group
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
4/05 How do I create variables summarizing for each
individual properties of the other members of a
group?
http://www.stata.com/support/faqs/data-management/creating-variables-recording-properties/
I don't know why you report problems. The code suggested there works
as intended. Here it is again run on your example data:
. by ID_fam (ID), sort: gen pid = _n
. gen byte fid = .
(7 missing values generated)
. gen byte mid = .
(7 missing values generated)
. summarize pid, meanonly
. forval i = 1 / `r(max)' {
2. by ID_fam: replace fid = `i' if ID_F == ID[`i'] & !missing(ID_F)
3. by ID_fam: replace mid = `i' if ID_M == ID[`i'] & !missing(ID_M)
4. }
(3 real changes made)
(0 real changes made)
(0 real changes made)
(3 real changes made)
(0 real changes made)
(0 real changes made)
(0 real changes made)
(0 real changes made)
. l
     +----------------------------------------------------------------------------------+
     |       ID_F         ID_M      BMI           ID     ID_fam   Emp   pid   fid   mid |
     |----------------------------------------------------------------------------------|
  1. |                           26.501   A901963701   A9019637     1     1     .     . |
  2. |                           20.483   A901963702   A9019637     1     2     .     . |
  3. | A901963701   A901963702   20.924   A901963703   A9019637     .     3     1     2 |
  4. |                           27.209   A901963801   A9019638     1     1     .     . |
  5. |                           31.733   A901963802   A9019638     .     2     .     . |
     |----------------------------------------------------------------------------------|
  6. | A901963801   A901963802   18.018   A901963803   A9019638     .     3     1     2 |
  7. | A901963801   A901963802   19.054   A901963804   A9019638     .     4     1     2 |
     +----------------------------------------------------------------------------------+
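For anyone who wants to rerun this, here is a sketch of a self-contained do-file. The -input- block is typed in from the example data in the question; the storage types (str10, str8, double, byte) are my assumptions, since the post does not say how the variables are stored. The rest is the same code as in the log above.

```stata
* Sketch only: storage types below are assumptions, not from the original post.
clear
input str10 ID_F str10 ID_M double BMI str10 ID str8 ID_fam byte Emp
""           ""           26.501 "A901963701" "A9019637" 1
""           ""           20.483 "A901963702" "A9019637" 1
"A901963701" "A901963702" 20.924 "A901963703" "A9019637" .
""           ""           27.209 "A901963801" "A9019638" 1
""           ""           31.733 "A901963802" "A9019638" .
"A901963801" "A901963802" 18.018 "A901963803" "A9019638" .
"A901963801" "A901963802" 19.054 "A901963804" "A9019638" .
end

* Number people within each household, in order of ID.
by ID_fam (ID), sort: gen pid = _n
gen byte fid = .
gen byte mid = .

* Under by:, ID[`i'] is the ID of the i-th person in the same household.
summarize pid, meanonly
forvalues i = 1/`r(max)' {
    by ID_fam: replace fid = `i' if ID_F == ID[`i'] & !missing(ID_F)
    by ID_fam: replace mid = `i' if ID_M == ID[`i'] & !missing(ID_M)
}

list, sepby(ID_fam)
```

The !missing() guards matter here. When `i' runs past the size of a household, ID[`i'] is an empty string, and without the guard every parent (whose ID_F and ID_M are also empty) would count as a match.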
Using the same logic, we copy parents' employment and mothers' BMI as desired:
. gen BMI_M = .
(7 missing values generated)
. gen Emp_M = .
(7 missing values generated)
. gen Emp_F = .
(7 missing values generated)
. summarize pid, meanonly
. forval i = 1 / `r(max)' {
2. by ID_fam: replace BMI_M = BMI[`i'] if ID_M == ID[`i'] & !missing(ID_M)
3. by ID_fam: replace Emp_M = Emp[`i'] if ID_M == ID[`i'] & !missing(ID_M)
4. by ID_fam: replace Emp_F = Emp[`i'] if ID_F == ID[`i'] & !missing(ID_F)
5. }
(0 real changes made)
(0 real changes made)
(3 real changes made)
(3 real changes made)
(1 real change made)
(0 real changes made)
(0 real changes made)
(0 real changes made)
(0 real changes made)
(0 real changes made)
(0 real changes made)
(0 real changes made)
Here are the results:
. l
     +-----------------------------------------------------------------------------------------------+
     |       ID_F         ID_M      BMI           ID     ID_fam   Emp   pid    BMI_M   Emp_M   Emp_F |
     |-----------------------------------------------------------------------------------------------|
  1. |                           26.501   A901963701   A9019637     1     1        .       .       . |
  2. |                           20.483   A901963702   A9019637     1     2        .       .       . |
  3. | A901963701   A901963702   20.924   A901963703   A9019637     .     3   20.483       1       1 |
  4. |                           27.209   A901963801   A9019638     1     1        .       .       . |
  5. |                           31.733   A901963802   A9019638     .     2        .       .       . |
     |-----------------------------------------------------------------------------------------------|
  6. | A901963801   A901963802   18.018   A901963803   A9019638     .     3   31.733       .       1 |
  7. | A901963801   A901963802   19.054   A901963804   A9019638     .     4   31.733       .       1 |
     +-----------------------------------------------------------------------------------------------+
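To get from here to the children-only layout asked for in the question, one possible last step is sketched below. It rests on an assumption that is mine, not the poster's: a "child" is anyone with at least one parent ID recorded. -preserve- is there only so that the full data come back afterwards.

```stata
* Sketch: reduce to one row per child with the parents' values attached.
* Assumption: a child is any observation with a nonmissing ID_M or ID_F.
preserve
keep if !missing(ID_M) | !missing(ID_F)
keep ID ID_fam BMI BMI_M Emp_M Emp_F
order ID ID_fam BMI BMI_M Emp_M Emp_F
list, noobs sepby(ID_fam)
restore
```

If children with no recorded parent IDs exist in the real data, this rule would drop them, so check it against how KNHANES defines household members before relying on it.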
Nick
On Fri, Feb 22, 2013 at 10:45 PM, Haena Lee <[email protected]> wrote:
> I am investigating the relationship between maternal
> employment status and the prevalence of childhood obesity using
> nationally representative data (KNHANES). Suppose I have ID (all
> observations, including both children and parents), ID_fam (household
> indicator),
> ID_F (father's ID), ID_M (mother's ID), BMI (body mass index) and
> finally Emp (employment status: 1 if employed; 0 if not employed) as
> follows:
>
>       ID_F         ID_M      BMI           ID     ID_fam   Emp
>                           26.501   A901963701   A9019637     1
>                           20.483   A901963702   A9019637     1
> A901963701   A901963702   20.924   A901963703   A9019637     .
>                           27.209   A901963801   A9019638     1
>                           31.733   A901963802   A9019638     .
> A901963801   A901963802   18.018   A901963803   A9019638     .
> A901963801   A901963802   19.054   A901963804   A9019638     .
>
> Ultimately, I would like to have a data set like the following:
>
> ID (children)   ID_fam     BMI      Mom's BMI   Mom's Emp   Dad's Emp
> A901963703      A9019637   20.924   20.483      1           1
> A901963803      A9019638   18.018   31.733      .           1
> A901963804      A9019638   19.054   31.733      .           1
>
> Given this, my question is 1) how to map the properties of other
> family members to children within each household, using a loop, or 2)
> how to generate an indicator for mothers (1 if ID == ID_M; 0 otherwise)?
> I found Nick Cox's helpful example and imitated it as follows:
>
> by ID_fam (ID), sort: gen pid = _n
> gen byte fid = .
> gen byte mid = .
> summarize pid, meanonly
> forval i = 1 / `r(max)' {
> by ID_fam: replace fid = `i'
> if ID_F == ID[`i'] & !missing(ID_F)
> by ID_fam: replace mid = `i'
> if ID_M == ID[`i'] & !missing(ID_M)
> }
>
> But it produced nothing but missing values. Please
> advise. Thank you so much in advance for any help.
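On part 2) of the question, the mother indicator, the same loop can be turned around. What follows is only a sketch: is_mother is a name made up for illustration, and the code assumes that pid already exists and that the data are still sorted by ID_fam (ID), as after the code earlier in this message. It needs the full data, with parents still present.

```stata
* Sketch: flag anyone whose ID is recorded as ID_M by a household member.
* is_mother is a hypothetical variable name; pid is assumed to exist already.
gen byte is_mother = 0
summarize pid, meanonly
forvalues i = 1/`r(max)' {
    * ID_M[`i'] is the mother's ID recorded for the i-th person in the
    * household; it is empty for parents and beyond the household's size.
    by ID_fam: replace is_mother = 1 if ID == ID_M[`i'] & !missing(ID_M[`i'])
}
```

On the example data this should flag A901963702 and A901963802 only. Swapping ID_M for ID_F would give the corresponding indicator for fathers.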