


Re: st: Matching new and old id's in multiple years to create a panel

From   "Lacy,Michael" <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: Matching new and old id's in multiple years to create a panel
Date   Tue, 14 Aug 2012 19:17:18 +0000

>From 	  Kirk Geale <[email protected]>
>To 	  [email protected]
>Subject 	  st: Matching new and old id's in multiple years to create a panel
>Date 	  Sat, 11 Aug 2012 23:57:03 -0400
>I have separate files containing a year of data where each respondent
>has a unique id.  In subsequent years, the id number for the same
>person is different.  However, in a given year there is an id variable
>that matches the previous year's id, called id2.  As an example, the
>same respondent could have an id code 14503 in the year 2000, but
>94837 in the year 2001.  In 2001, there is a second variable id2 that
>records 14503, which is the respondent's id in 2000.  If the
>respondent was not surveyed in the previous year, id2 is coded as 0.
>This occurs in all years.  I want to match these individuals from year
>to year by generating a new variable that identifies them as the same
>person (where appropriate), for the end goal of survival analysis.  I
>initially thought this would be very easy to do, but I can't seem to
>get it.  Thanks for any ideas!
>Kirk Geale

Is something like this what you had in mind?

//Example Data: Some individuals enter the panel every year, but each 
// one has a record for every year after the entry year.  I'm presuming
// that years are consecutive integers.
// Note that for regularity, I made "previous id" = 0 even for year 1.
input year prev_id curr_id  // prev_id is more convenient than id2
1 0 132 
2 132 976
3 976 804
4 804 271
1 0 765 
2 765 317
3 317 887
4 887 302
2 0 387 
3 387 701
4 701 523
2 0  654 
3 654 124
4 124 972
3 0 298
4 298 321
4 0 820
end
// Put data into separate files to fit what Kirk already has.  
// I'll also do something he probably doesn't have yet, 
// which is to name each file with year as a suffix, and 
// to rename prev_id and curr_id according to the actual year. 
levelsof year, local(ylist)
qui summ year
local lastyear = r(max)  // I'll need this value
local firstyear = r(min)
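quietly {
	foreach y of local ylist {
	  tempfile file`y'
	  preserve                 // keep the full data for the next pass
	  keep if (year == `y')
	  local ym1 = `y' - 1
	  rename curr_id id`y'
	  rename prev_id id`ym1'
	  save `file`y''
	  restore
	}
}
use `file`firstyear'', clear   // start the merges from the first year's file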
// Now, we have the slightly massaged version of Kirk's data.
// The rest is a series of merges, linking ids year by year.
foreach y of local ylist {
   if (`y' < `lastyear') {
      local next = `y' + 1
      // 1:m, not 1:1, matters: in the using file the previous-year id
      // is 0 for every new entrant, so it is not unique there.
      qui merge 1:m id`y' using "`file`next''"
      drop _merge
   }
}

// Clean up and put the data into person-year format
drop year id0  
qui recode id* (0=.)
gen entry_id = _n
reshape long id, i(entry_id) j(year)
drop if missing(id)
bysort entry_id (year) : replace entry_id = id[1]
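
A quick sanity check on the result may be worthwhile before moving on to survival analysis. The following is only a sketch: it assumes person-year data with the variables entry_id, year, and id as created above, and it types in a small subset of the example so that it runs on its own. The gap and nyears variables are names introduced here for illustration. On the real data you would skip the input step and run the checks directly.

```stata
// Sketch only: a small person-year subset typed in so the checks run alone.
clear
input entry_id year id
132 1 132
132 2 976
132 3 804
387 2 387
387 3 701
end

isid entry_id year        // exactly one record per person-year
xtset entry_id year       // declare the panel; errors on repeated years

// Flag any break in a person's sequence of years (none expected here,
// since everyone is observed in every year after entry).
bysort entry_id (year): gen byte gap = (year - year[_n-1] > 1) if _n > 1
assert gap == 0 if !missing(gap)

// Number of years each person is observed
bysort entry_id: gen byte nyears = _N
```

If the assert fails on real data, some respondents have a gap in their records, and the year-by-year merge above would need to be rethought for them.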


Mike Lacy
Dept. of Sociology
Colorado State University
Fort Collins CO 80523-1784
