Re: st: RE: identify family members using -egen (group)
From
Amanda Fu <[email protected]>
To
[email protected]
Subject
Re: st: RE: identify family members using -egen (group)
Date
Fri, 11 Nov 2011 22:11:28 -0500
Thank you for your kind help, Mr. Cox! The solution you gave works
perfectly. I appreciate it.
Sincerely,
Amanda
On Fri, Nov 11, 2011 at 5:15 AM, Nick Cox <[email protected]> wrote:
> -egen, group()- cannot help you here without some prior work. The premise of -egen, group()- is that it groups observations with identical values on one or more variables. Your data do not satisfy that.
>
> You could apply -rowsort- first:
>
> . l
>
> +-------------------------------------+
> | id fam~1_id fam~2_id fam~3_id |
> |-------------------------------------|
> 1. | 1 missing missing missing |
> 2. | 2 3 missing missing |
> 3. | 3 2 missing missing |
> 4. | 4 5 6 missing |
> 5. | 5 4 6 missing |
> |-------------------------------------|
> 6. | 6 4 5 missing |
> +-------------------------------------+
>
> . rowsort id f*id , gen(s1-s4)
>
> . l
>
> +------------------------------------------------------------------------+
> | id fam~1_id fam~2_id fam~3_id s1 s2 s3 s4 |
> |------------------------------------------------------------------------|
> 1. | 1 missing missing missing 1 missing missing missing |
> 2. | 2 3 missing missing 2 3 missing missing |
> 3. | 3 2 missing missing 2 3 missing missing |
> 4. | 4 5 6 missing 4 5 6 missing |
> 5. | 5 4 6 missing 4 5 6 missing |
> |------------------------------------------------------------------------|
> 6. | 6 4 5 missing 4 5 6 missing |
> +------------------------------------------------------------------------+
>
> . egen group = group(s*)
>
> . l
>
> +--------------------------------------------------------------------------------+
> | id fam~1_id fam~2_id fam~3_id s1 s2 s3 s4 group |
> |--------------------------------------------------------------------------------|
> 1. | 1 missing missing missing 1 missing missing missing 1 |
> 2. | 2 3 missing missing 2 3 missing missing 2 |
> 3. | 3 2 missing missing 2 3 missing missing 2 |
> 4. | 4 5 6 missing 4 5 6 missing 3 |
> 5. | 5 4 6 missing 4 5 6 missing 3 |
> |--------------------------------------------------------------------------------|
> 6. | 6 4 5 missing 4 5 6 missing 3 |
> +--------------------------------------------------------------------------------+
>
>
> You must install -rowsort- first. -rowsort- is described in
>
> SJ-9-1 pr0046 . . . . . . . . . . . . . . . . . . . Speaking Stata: Rowwise
> (help rowsort, rowranks if installed) . . . . . . . . . . . N. J. Cox
> Q1/09 SJ 9(1):137--157
> shows how to exploit functions, egen functions, and Mata
> for working rowwise; rowsort and rowranks are introduced
>
> and may be downloaded from the Stata Journal files regardless of whether you subscribe to the Stata Journal. The article is apparently a good one anyway.
>
> Note that I copied your example as if you had strings, but -rowsort- works with numeric variables too.
>
> Nick
> [email protected]
>
> Amanda Fu
>
> I know that identifying siblings has been discussed on Statalist
> before. Those solutions rely on shared mother's and father's IDs.
> But I still have not figured out how to identify family members in my
> data set, since it has no parents' IDs.
>
> The data set looks as follows:
> ---------------------------------------
> id fam_member1_id fam_member2_id fam_member3_id
>
> 1 missing missing missing
> 2 3 missing missing
> 3 2 missing missing
> 4 5 6 missing
> 5 4 6 missing
> 6 4 5 missing
> ............
> ----------------------------------
> That is, IDs 2 and 3 are in one family, and IDs 4, 5, and 6 are in another.
> I tried to use
> . egen famid = group(id fam_member1_id fam_member2_id fam_member3_id), missing
> but the famid I got is not the same within a family.
> ----------------------------------the last column is what I want to get-----
> id  fam_member1_id  fam_member2_id  fam_member3_id  famid
>
> 1   missing         missing         missing         missing
> 2   3               missing         missing         1
> 3   2               missing         missing         1
> 4   5               6               missing         2
> 5   4               6               missing         2
> 6   4               5               missing         2
> ............
> ----------------------------------
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
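Where -rowsort- cannot be installed, the same grouping can be sketched with official -egen- functions alone. The example below is only an illustration, not part of the reply quoted above. It assumes that the ID variables are numeric (the example data in the thread were entered as strings) and that every person lists all other members of the family, as in the sample data. The variable name minid is made up for this sketch. Under those assumptions the smallest ID in each row is identical for all members of a family, so it can serve as the grouping key.

```stata
* Sketch only: assumes numeric IDs and that each person lists
* every other member of the family, as in the sample data.
clear
input id fam_member1_id fam_member2_id fam_member3_id
1 . . .
2 3 . .
3 2 . .
4 5 6 .
5 4 6 .
6 4 5 .
end

* rowmin() ignores missing values, so each row gets the
* smallest ID found among the person and the listed relatives
egen minid = rowmin(id fam_member1_id fam_member2_id fam_member3_id)

* map the family key to consecutive integers
egen famid = group(minid)

list id fam_member1_id fam_member2_id fam_member3_id famid, sepby(famid)
```

Like the -rowsort- approach, this gives a person with no listed relatives a group of their own rather than a missing famid. If the lists are incomplete (say, 4 lists 5 but 5 does not list 4), the row minimum no longer identifies the family and a different method is needed.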