Ricardo Ovaldia wonders whether Mata will help him
solve the following problem:
> I have household data with one observation per family
> member. All households have one or both parents and
> anywhere from one to seven children. All households have
> children but no grandparents or other relatives. Here
> are a few typical observations and relevant variables:
>
> . cl familyid subjid relation
>
>        familyid   subjid   relation
>   1.       1001        1          f
>   2.       1001        2          m
>   3.       1001        3          c
>   4.       1001        4          c
>   5.       1002        1          m
>   6.       1002        2          c
>   7.       1002        3          c
>   8.       1003        1          m
>   9.       1003        2          f
>  10.       1003        3          c
>
> where for -relation-: f=father, m=mother and c=child
>
> I want to create two new variables which hold, for the
> children, their parents' -subjid- as follows:
>
>        familyid   subjid   relation   fatherid   motherid
>   1.       1001        1          f          .          .
>   2.       1001        2          m          .          .
>   3.       1001        3          c          1          2
>   4.       1001        4          c          1          2
>   5.       1002        1          m          .          .
>   6.       1002        2          c          .          2
>   7.       1002        3          c          .          2
>   8.       1003        1          m          .          .
>   9.       1003        2          f          .          .
>  10.       1003        3          c          2          1
>
> I wrote a program to do this but it is very slow because
> it loops over observations.
> I think that if I recode this using -mata- it would be
> faster, but I am not sure where to begin. Any assistance
> or suggestions will be greatly appreciated.

Ricardo is correct that rewriting his loop in Mata would make it
faster. However, that is still not the best solution.

Mata is useful for many data management tasks, such as reading
and manipulating files and performing left-hand-side indexing, that is,
when you want to achieve something like

    . generate y[somevar] = x

which isn't possible in Stata but can be done in Mata by assigning
into a matrix view onto the Stata dataset.
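
For readers who have not seen left-hand-side indexing through a view, here is a minimal sketch. The variable names x, y, and somevar follow the pseudo-command above. The three-observation dataset is made up for illustration, and somevar is assumed to hold valid observation numbers.

```stata
clear
input x somevar
10 3
20 1
30 2
end
generate y = .

mata:
    // Y is a view: assigning into it writes straight to the Stata variable y
    st_view(Y=., ., "y")
    // X and S are ordinary copies of the data; we only read from them
    X = st_data(., "x")
    S = st_data(., "somevar")
    // Left-hand-side indexing: y[somevar[i]] = x[i] for every observation i
    Y[S, 1] = X
end

list
```

Because Y is a view rather than a copy, nothing has to be stored back into the dataset afterwards.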

However, Ricardo can get his result with just a few Stata commands
and some creative sorting. Missing values sort last, so after each sort
the parent's subjid, if the family has that parent, sits in the first
observation of the family:

    generate fatherid = subjid if relation=="f"
    sort familyid fatherid
    by familyid: replace fatherid = fatherid[1] if relation=="c"
    replace fatherid = . if relation != "c"

    generate motherid = subjid if relation=="m"
    sort familyid motherid
    by familyid: replace motherid = motherid[1] if relation=="c"
    replace motherid = . if relation != "c"

    sort familyid subjid
    list

Note that the final -list- shows values for motherid in
observations 6 and 7 that differ from those in Ricardo's example.
In family 1002 the mother is subjid 1, so I believe motherid should
be 1 in those two observations, as the code above produces, rather than 2.
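
To check this on Ricardo's ten observations, the sketch below enters his example data and builds the two variables with -egen- instead of sorting. This is an alternative of my own choosing, not the method shown above, and it assumes each household has at most one father and at most one mother.

```stata
clear
input familyid subjid str1 relation
1001 1 f
1001 2 m
1001 3 c
1001 4 c
1002 1 m
1002 2 c
1002 3 c
1003 1 m
1003 2 f
1003 3 c
end

* Spread each parent's subjid across the whole family; the result is
* missing when the family has no such parent
egen fatherid = max(cond(relation == "f", subjid, .)), by(familyid)
egen motherid = max(cond(relation == "m", subjid, .)), by(familyid)

* Keep the ids for children only
replace fatherid = . if relation != "c"
replace motherid = . if relation != "c"

sort familyid subjid
list, sepby(familyid)
```

This version does not depend on the sort order, at the cost of two -egen- calls.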
Alan