Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: matching cases by a transitive relation
From: Robert Picard <[email protected]>
To: [email protected]
Subject: Re: st: matching cases by a transitive relation
Date: Sun, 13 Jan 2013 18:15:59 -0500
If I understand the problem correctly, I think that this can be solved
easily using -group_id- (available from SSC). Here's an example of how
I would proceed:
*------------------------------ sample code -------------------
clear
input sibling1 sibling2
1 2
2 1
2 3
4 5
5 4
4 8
7 9
9 7
10 3
end
gen pairid = _n
* convert the identifiers from wide to long (one row per pair member)
expand 2
sort pairid
by pairid: gen id = sibling1 if _n == 1
by pairid: replace id = sibling2 if _n == 2
* merge the initial pair groups whenever they share an id
gen sibling_group = pairid
group_id sibling_group, matchby(id)
* pick one record per id within a sibling_group
sort sibling_group id pairid
by sibling_group id: gen pick = _n == 1
list sibling_group id if pick, noobs sepby(sibling_group)
*------------------------------ end sample code ---------------
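For anyone who wants to cross-check the grouping without installing -group_id-, here is a minimal sketch of the same idea using only built-in commands. It is not how -group_id- works internally (I make no claim about that); it simply spreads the smallest id across each pair and across every row of the same id until nothing changes. The names grp, old, m1, m2 and the local macro changed are my own, chosen for illustration.

```stata
clear
input sibling1 sibling2
1 2
2 1
2 3
4 5
5 4
4 8
7 9
9 7
10 3
end

* one row per pair member
gen pairid = _n
expand 2
bysort pairid: gen id = cond(_n == 1, sibling1, sibling2)

* assumption: every id starts out as its own group label
gen grp = id

* loop until no label changes
local changed = 1
while `changed' {
    gen old = grp
    * smallest label within each pair
    bysort pairid: egen m1 = min(grp)
    * smallest of those across all rows of the same id
    bysort id: egen m2 = min(m1)
    replace grp = m2
    count if grp != old
    local changed = r(N)
    drop old m1 m2
}

* one record per id within each group
bysort grp id: gen pick = _n == 1
list grp id if pick, noobs sepby(grp)
```

With the sample data, each group ends up labelled by its smallest id. The loop makes one pass per step of the longest chain, so on a large file -group_id- is likely the more practical choice.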
On Fri, Jan 11, 2013 at 7:03 AM, Robert De Vries
<[email protected]> wrote:
> Dear Statalisters,
>
> I have a problem with attempting to match cases by a transitive relation (A is related to B, B is related to C, so C must be related to A).
>
> Specifically, I am working with the longitudinal British Household Panel Study (BHPS), and I am attempting to match siblings across time. I can straightforwardly create a dataset which includes the ID number of all sibling pairs in the dataset in the following format:
>
> ID | SIBLING ID
> A | B
> B | A
> B | C
>
> However, this dataset does not reflect the additional relationship A-C. This occurs when A and C are siblings but have never actually lived together. For example, in Wave 1, A and B are siblings living together. By Wave 2, A has moved out, and B has gained a new sibling, C (this might be a step-sibling, for example, or a new birth). My dataset reflects the fact that A and B are siblings, and that B and C are siblings, but because A and C have never been coded as siblings, my dataset does not reflect that they are.
>
> By their transitive relation through B, we know that A and C are siblings. My question is: what code could I write to get the dataset to reflect this? I need to somehow tell Stata that if A is related to B AND B is related to C, you need to create a new case which reflects that A is related to C.
>
> Hope you can help!
>
> Robert de Vries
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/