st: RE: identify family members using -egen (group)
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: identify family members using -egen (group)
Date: Fri, 11 Nov 2011 10:15:14 +0000

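-egen, group()- cannot help you here without some prior work. The premise of -egen, group()- is that it groups observations with identical values on one or more variables. Your data do not satisfy that: observations 2 and 3, for example, hold the same two ids, but in different variables, so the rows are not identical.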
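To see the problem concretely, here is a minimal sketch that enters your example as string variables (my assumption, matching the listings in this post) and applies -egen, group()- directly. Every observation ends up in a group of its own, because no two rows agree on all four variables.

```stata
clear
* Assumption: ids are held as strings, with the literal text "missing"
input str7 id str7 fam_member1_id str7 fam_member2_id str7 fam_member3_id
1 missing missing missing
2 3 missing missing
3 2 missing missing
4 5 6 missing
5 4 6 missing
6 4 5 missing
end

* No two rows match on all four variables, so each row is its own group
egen famid = group(id fam_member1_id fam_member2_id fam_member3_id)
list id famid
```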
You could apply -rowsort- first:
. l

     +-------------------------------------+
     | id   fam~1_id   fam~2_id   fam~3_id |
     |-------------------------------------|
  1. |  1    missing    missing    missing |
  2. |  2          3    missing    missing |
  3. |  3          2    missing    missing |
  4. |  4          5          6    missing |
  5. |  5          4          6    missing |
     |-------------------------------------|
  6. |  6          4          5    missing |
     +-------------------------------------+

. rowsort id f*id , gen(s1-s4)
. l

     +------------------------------------------------------------------------+
     | id   fam~1_id   fam~2_id   fam~3_id   s1        s2        s3        s4 |
     |------------------------------------------------------------------------|
  1. |  1    missing    missing    missing    1   missing   missing   missing |
  2. |  2          3    missing    missing    2         3   missing   missing |
  3. |  3          2    missing    missing    2         3   missing   missing |
  4. |  4          5          6    missing    4         5         6   missing |
  5. |  5          4          6    missing    4         5         6   missing |
     |------------------------------------------------------------------------|
  6. |  6          4          5    missing    4         5         6   missing |
     +------------------------------------------------------------------------+

. egen group = group(s*)
. l

     +--------------------------------------------------------------------------------+
     | id   fam~1_id   fam~2_id   fam~3_id   s1        s2        s3        s4   group |
     |--------------------------------------------------------------------------------|
  1. |  1    missing    missing    missing    1   missing   missing   missing       1 |
  2. |  2          3    missing    missing    2         3   missing   missing       2 |
  3. |  3          2    missing    missing    2         3   missing   missing       2 |
  4. |  4          5          6    missing    4         5         6   missing       3 |
  5. |  5          4          6    missing    4         5         6   missing       3 |
     |--------------------------------------------------------------------------------|
  6. |  6          4          5    missing    4         5         6   missing       3 |
     +--------------------------------------------------------------------------------+

You must install -rowsort- first. -rowsort- is described in

SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
        (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
        Q1/09   SJ 9(1):137--157
        shows how to exploit functions, egen functions, and Mata
        for working rowwise; rowsort and rowranks are introduced

and may be downloaded from the Stata Journal files regardless of whether you subscribe to the Stata Journal. The article is apparently a good one anyway.
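For concreteness, either of the following should get you -rowsort-. The package name pr0046 and the issue sj9-1 are taken from the reference above; the URL follows the usual Stata Journal software pattern and is an assumption on my part, so fall back on -search- if the direct install fails.

```stata
* Locate rowsort and follow the links shown to install it
search rowsort

* Or install the package directly
* (assumed URL: the usual Stata Journal software location for issue 9-1)
net install pr0046, from(http://www.stata-journal.com/software/sj9-1)

* Check that the help file is now found
help rowsort
```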
Note that I copied your example as if you had strings, but -rowsort- works with numeric variables too.
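If your ids are in fact numeric with real missing values, here is a sketch of an alternative that needs only official Stata. It is a different technique from -rowsort-: it labels each family by the smallest id found in the row. It assumes, as in your example, that every person lists all the other members of the family, so that the same smallest id turns up in every row of a family. The variable names lowest and famid are my own.

```stata
clear
* Assumption: numeric ids, with . for missing
input id fam_member1_id fam_member2_id fam_member3_id
1 . . .
2 3 . .
3 2 . .
4 5 6 .
5 4 6 .
6 4 5 .
end

* Smallest id among the person and the listed family members;
* rowmin() ignores missing values
egen lowest = rowmin(id fam_member1_id fam_member2_id fam_member3_id)

* Map the smallest ids to consecutive integers 1, 2, 3, ...
egen famid = group(lowest)
list
```

If you prefer -rowsort- followed by -egen, group()- on numeric variables, note that -egen, group()- returns missing for any observation with a missing value in the variables supplied unless you add the -missing- option, as you did in your original command.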
Nick
[email protected]
Amanda Fu
I know there have been discussions on Statalist before about how to
identify siblings. The solutions use the mother's and father's IDs.
But I still have not figured out how to identify family members in my
data set, since there are no parents' IDs.
The data set looks as follows:
---------------------------------------
id   fam_member1_id   fam_member2_id   fam_member3_id
1    missing          missing          missing
2    3                missing          missing
3    2                missing          missing
4    5                6                missing
5    4                6                missing
6    4                5                missing
............
----------------------------------
That is, IDs 2 and 3 are in one family, and IDs 4, 5, and 6 are in another.
I tried to use
. egen famid = group(id fam_member1_id fam_member2_id fam_member3_id), missing
but the famid I got is not the same within a family.
---------------------------------- the last column is what I want to get -----
id   fam_member1_id   fam_member2_id   fam_member3_id   famid
1    missing          missing          missing          missing
2    3                missing          missing          1
3    2                missing          missing          1
4    5                6                missing          2
5    4                6                missing          2
6    4                5                missing          2
............
----------------------------------