Hi,
the simplest way (in both implementation and understanding) is to
1) create a file with two variables, pers_id and educ.
Here pers_id must be a unique identifier of a person and educ is
her education (grade_completed in your data).
Basically you almost have it, but your iid is unique only within a
household, so combine hhid and iid, say by gen pers_id=iid+hhid*100
(almost surely you don't have more than 99 family members).
2) keep only pers_id and grade_completed, save the file
3) go back to the result of #1
4) turn fatherid into the same kind of unique id:
gen father_id=fatherid+100*hhid (and similarly for mothers).
5) merge on this unique id with the file you saved in #2
hint: you will have to temporarily rename variables on your way
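For what it's worth, the steps above might be sketched roughly as follows. Treat it as a sketch under assumptions rather than a definitive solution: it uses the -merge m:1- syntax (Stata 11 or later; older versions need the old -merge varlist using- form on sorted data), the tempfile names are made up for illustration, fatheredu and motheredu are the names asked for in the question, and iid is assumed to stay below 100.

```stata
clear
input hhid iid fatherid motherid grade_completed
1001 1 . . 6
1001 2 . . 5
1001 3 1 2 3
1001 4 1 2 3
1002 1 . . 5
1002 2 . . 6
1002 3 1 2 8
1002 4 . . 8
1002 5 3 4 1
end

* Step 1: person id that is unique across households (assumes iid < 100)
gen long pers_id = iid + hhid*100
isid pers_id

* Step 4, done up front: the parents' ids on the same scale
* (missing fatherid/motherid stays missing)
gen long father_id = fatherid + hhid*100
gen long mother_id = motherid + hhid*100

* Illustrative tempfile names, not part of the original advice
tempfile main lookup fatherfile motherfile
save `main'

* Step 2: lookup file with only the id and the education variable
keep pers_id grade_completed
save `lookup'

* The "temporary renaming" from the hint: one renamed copy per parent
foreach p in father mother {
    use `lookup', clear
    rename pers_id `p'_id
    rename grade_completed `p'edu
    save ``p'file'
}

* Steps 3 and 5: back to the full data, merge each parent's education on
use `main', clear
foreach p in father mother {
    merge m:1 `p'_id using ``p'file', keep(master match) nogenerate
}
sort hhid iid
list hhid iid fatherid motherid grade_completed fatheredu motheredu, noobs clean
```

People without a recorded parent simply stay unmatched in the merge, so fatheredu and motheredu are missing for them.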
Best regards, Sergiy
On 8/16/07, Austin Nichols <[email protected]> wrote:
> Prabal Kr. De--
> There is a trick using explicit subscripting, which requires you have
> no gap in individual IDs and that they correspond to observation
> numbers within household, as they do in your example. Then you can
> just:
>
> bys hhid (iid): g fathered=grade[fatherid]
> bys hhid (iid): g mothered=grade[motherid]
>
> If you have gaps in individual IDs, you have to use some trickery
> involving -fillin- to exploit the same trick. Here is an example that
> tests for gaps and automates the trickery:
>
> clear
> input hhid iid fatherid motherid grade
> 1001 1 . . 6
> 1001 2 . . 5
> 1001 3 1 2 3
> 1001 4 1 2 3
> 1002 1 . . 5
> 1002 2 . . 6
> 1002 3 1 2 8
> 1002 4 . . 8
> 1002 5 3 4 1
> 1002 7 3 4 1
> end
> cap bys hhid (iid): assert iid==iid[_n-1]+1 if _n>1
> if _rc!=0 {
> set obs `=_N+1'
> su hhid, meanonly
> loc fake=r(max)+1
> replace hhid=`fake' in l
> su iid, meanonly
> expand `r(max)' in l
> bys hhid: replace iid=_n if hhid==`fake'
> fillin hhid iid
> drop if hhid==`fake'
> }
> bys hhid (iid): g fathered=grade[fatherid]
> bys hhid (iid): g mothered=grade[motherid]
> cap drop if _fillin==1
> cap drop _fillin
> li, noo clean
>
>
> On 8/16/07, Prabal Kr. De <[email protected]> wrote:
> > Hi!
> > I have a household survey dataset which gives the
> > household id (hhid), the individual id (iid), and the
> > corresponding ids for father and mother. The
> > data look like
> >
> > hhid iid fatherid motherid grade_completed
> > 1001 1 . . 6
> > 1001 2 . . 5
> > 1001 3 1 2 3
> > 1001 4 1 2 3
> > 1002 1 . . 5
> > 1002 2 . . 6
> > 1002 3 1 2 8
> > 1002 4 . . 8
> > 1002 5 3 4 1
> >
> > I want to create two variables: fatheredu (father's
> > education) and motheredu (mother's education)
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>