Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Nick Cox <n.j.cox@durham.ac.uk>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: st: RE: generating variables based on the co-occurrence of ids in groups over time
Date: Wed, 7 Mar 2012 12:17:15 +0000

Here are some doodlings:

* one indicator variable per individual: ind_id1, ind_id2, ...
tab ind_id, gen(ind_id)
drop ind_id
foreach v of var ind_id* {
    local call `call' (sum) `v'
}
* one row per project, holding the membership indicators
collapse `call', by(yearmonth project_id)
list
* number of individuals in each project
egen count_id = rowtotal(ind_id*)
unab ind_id : ind_id*
local ind_id : subinstr local ind_id "ind_id" "", all
* for each member of a project, the number of other members
foreach id of local ind_id {
    gen collab`id' = count_id - ind_id`id' if ind_id`id' == 1
}
edit

Not a complete solution, but may help.

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Erik Aadland
Sent: 07 March 2012 11:15
To: statalist@hsphsun2.harvard.edu
Subject: st: generating variables based on the co-occurrence of ids in groups over time

Dear Statalist.

I am struggling to generate two variables based on the co-occurrence of ind_ids in project_ids over time (yearmonth).

The structure of my data is as follows:

yearmonth  project_id  ind_id
5          1           1
5          1           2
5          1           3
5          2           1
5          2           4
5          2           5
6          3           1
6          3           2
6          3           5
6          4           4
6          4           5
6          4           6
7          5           1
7          5           4
7          5           5
7          5           2

The two variables I need to generate are:

X (no. of prior collaborators in project for each ind_id): how many of the other individuals in project_id each ind_id has previously collaborated with (i.e. how many of the other ind_ids in the current project each focal ind_id has co-occurred with in other projects in previous yearmonths)

Z (total prior collaborations in project for each ind_id): the total number of times each ind_id has previously collaborated with the given other individuals in project_id (i.e. the total number of times each focal ind_id has co-occurred with other ind_ids in the current project in previous yearmonths)

I have added variable X and Z scores to the data structure example below:

yearmonth  project_id  ind_id  X  Z
5          1           1
5          1           2
5          1           3
5          2           1
5          2           4
5          2           5
6          3           1       2  2
6          3           2       1  1
6          3           5       1  1
6          4           4       1  1
6          4           5       1  1
6          4           6       0  0
7          5           1       3  5
7          5           4       2  3
7          5           5       3  5
7          5           2       2  3

Any and all input to these problems would be greatly appreciated.
I use Stata 10 and the panel data is unbalanced.
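
One possible way to get at X and Z directly is to pair every individual with every co-member of the same project and then count, for each pair, how many earlier projects they shared. What follows is a rough sketch and not part of the original reply. It assumes one observation per (yearmonth, project_id, ind_id) and that projects in the same yearmonth do not count as "previous". The names other_id, pair_prior and met are introduced here purely for illustration. It gives 0 rather than blank for the first yearmonth, and the paired dataset can get large if projects have many members.

```stata
* Sketch only: assumes one observation per (yearmonth, project_id, ind_id).
* other_id, pair_prior and met are hypothetical names introduced here.

* keep a copy of the data in which the individual is called other_id
preserve
keep yearmonth project_id ind_id
rename ind_id other_id
tempfile others
save `others'
restore

* pair each individual with every member of the same project (self included)
joinby yearmonth project_id using `others'

* for each ordered pair, count the joint projects seen so far
bysort ind_id other_id (yearmonth project_id): gen pair_prior = _n - 1

* joint projects in the same yearmonth are not "previous": use the first count
bysort ind_id other_id yearmonth (project_id): replace pair_prior = pair_prior[1]

* an individual is not his or her own collaborator
replace pair_prior = 0 if ind_id == other_id

* met flags co-members with at least one earlier joint project
gen byte met = pair_prior > 0

* X = number of co-members met before; Z = total earlier joint projects
collapse (sum) X = met Z = pair_prior, by(yearmonth project_id ind_id)
list, sepby(project_id)
```

Worked through by hand on the example data, this logic gives X = 2 and Z = 2 for ind_id 1 in project 3, and X = 3 and Z = 5 for ind_id 1 in project 5, which agrees with the table in the question. Self-pairs are kept until the collapse so that an individual who is alone in a project still gets a row with X = 0 and Z = 0.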