st: RE: identify family members using -egen (group)

From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   st: RE: identify family members using -egen (group)
Date   Fri, 11 Nov 2011 10:15:14 +0000

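-egen, group()- cannot help you here without some prior work. The premise of -egen, group()- is that it groups observations with identical values on one or more variables. Your data do not satisfy that: members of the same family hold the same set of identifiers, but in a different order across the variables.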

You could apply -rowsort- first: 

. l

     | id   fam~1_id   fam~2_id   fam~3_id |
  1. |  1    missing    missing    missing |
  2. |  2          3    missing    missing |
  3. |  3          2    missing    missing |
  4. |  4          5          6    missing |
  5. |  5          4          6    missing |
  6. |  6          4          5    missing |

. rowsort id f*id , gen(s1-s4)

. l

     | id   fam~1_id   fam~2_id   fam~3_id   s1        s2        s3        s4 |
  1. |  1    missing    missing    missing    1   missing   missing   missing |
  2. |  2          3    missing    missing    2         3   missing   missing |
  3. |  3          2    missing    missing    2         3   missing   missing |
  4. |  4          5          6    missing    4         5         6   missing |
  5. |  5          4          6    missing    4         5         6   missing |
  6. |  6          4          5    missing    4         5         6   missing |

. egen group = group(s*)

. l

     | id   fam~1_id   fam~2_id   fam~3_id   s1        s2        s3        s4   group |
  1. |  1    missing    missing    missing    1   missing   missing   missing       1 |
  2. |  2          3    missing    missing    2         3   missing   missing       2 |
  3. |  3          2    missing    missing    2         3   missing   missing       2 |
  4. |  4          5          6    missing    4         5         6   missing       3 |
  5. |  5          4          6    missing    4         5         6   missing       3 |
  6. |  6          4          5    missing    4         5         6   missing       3 |

You must install -rowsort- first. -rowsort- is described in 

SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
        (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
        Q1/09   SJ 9(1):137--157
        shows how to exploit functions, egen functions, and Mata
        for working rowwise; rowsort and rowranks are introduced

and may be downloaded from the Stata Journal files regardless of whether you subscribe to the Stata Journal. The article is apparently a good one anyway. 
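As a rough sketch of the installation step, assuming an Internet-connected Stata: -search- will find the package and show clickable links, and the package name pr0046 from the entry above can be passed to -net sj- and -net install-. Check what -net sj- describes before you install, as I have not verified what the archive currently serves.

```stata
* Locate -rowsort- and follow the links shown to install it
search rowsort

* Or point Stata at the Stata Journal 9(1) software and install the package
* (pr0046 is the package name given in the entry above)
net sj 9-1 pr0046
net install pr0046
```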

Note that I copied your example as if you had strings, but -rowsort- works with numeric variables too. 
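If the identifiers are numeric, an alternative that needs no installation can be sketched with official -egen- functions. This is a different technique from -rowsort-, and it rests on assumptions: the identifiers are numeric, missing is stored as ".", and every family member lists all the others, as in your example. Under those assumptions the smallest identifier in each row is the same for everyone in a family. The variable names fam_min and famid are just illustrative.

```stata
clear
input id fam_member1_id fam_member2_id fam_member3_id
1 . . .
2 3 . .
3 2 . .
4 5 6 .
5 4 6 .
6 4 5 .
end

* Smallest identifier in each row; -rowmin()- ignores missing values
egen fam_min = rowmin(id fam_member*_id)

* Map the row minima to consecutive integers 1, 2, 3, ...
egen famid = group(fam_min)

list, sepby(famid)
```

Here a person with no listed relatives gets a family of one, as with the -rowsort- result above. If members do not all list each other, the row minima can differ within a family and this shortcut will not work.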

[email protected] 

Amanda Fu

I know that how to identify siblings has been discussed before on
Statalist. The solutions use the mother's and father's IDs.
But I still have not figured out how to identify family members in my
data set, since there are no parents' IDs.

The data set looks as follows:

id   fam_member1_id   fam_member2_id   fam_member3_id
1    missing          missing          missing
2    3                missing          missing
3    2                missing          missing
4    5                6                missing
5    4                6                missing
6    4                5                missing
That is, IDs 2 and 3 are in one family, and IDs 4, 5, and 6 are in another.
I tried to use
. egen famid =group(id  fam_member1_id    fam_member2_id
but the famid I got is not the same within a family.
The last column is what I want to get:

id   fam_member1_id   fam_member2_id   fam_member3_id   famid
1    missing          missing          missing          missing
2    3                missing          missing          1
3    2                missing          missing          1
4    5                6                missing          2
5    4                6                missing          2
6    4                5                missing          2

