Re: st: Matching new and old id's in multiple years to create a panel
From: "Lacy,Michael" <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: Matching new and old id's in multiple years to create a panel
Date: Tue, 14 Aug 2012 19:17:18 +0000
>From Kirk Geale <[email protected]>
>To [email protected]
>Subject st: Matching new and old id's in multiple years to create a panel
>Date Sat, 11 Aug 2012 23:57:03 -0400
>
>Hello,
>
>I have separate files containing a year of data where each respondent
>has a unique id. In subsequent years, the id number for the same
>person is different. However, in a given year there is an id variable
>that matches the previous year's id, called id2. As an example, the
>same respondent could have an id code 14503 in the year 2000, but
>94837 in the year 2001. In 2001, there is a second variable id2 that
>records 14503, which is the respondent's id in 2000. If the
>respondent was not surveyed in the previous year, id2 is coded as 0.
>This occurs in all years. I want to match these individuals from year
>to year by generating a new variable that identifies them as the same
>person (where appropriate), for the end goal of survival analysis. I
>initially thought this would be very easy to do, but I can't seem to
>get it. Thanks for any ideas!
>
>Kirk Geale
>
Is this like what you had in mind?
//Example Data: Some individuals enter the panel every year, but each
// one has a record for every year after the entry year. I'm presuming
// that years are consecutive integers.
// Note that for regularity, I made "previous id" = 0 even for year 1.
//
clear
input year prev_id curr_id // prev_id is more convenient than id2
1 0 132
2 132 976
3 976 804
4 804 271
1 0 765
2 765 317
3 317 887
4 887 302
2 0 387
3 387 701
4 701 523
2 0 654
3 654 124
4 124 972
3 0 298
4 298 321
4 0 820
end
compress
// Put data into separate files to fit what Kirk already has.
// I'll also do something he probably doesn't have yet,
// which is to name each file with year as a suffix, and
// to rename prev_id and curr_id according to the actual year.
levelsof year, local(ylist)
qui summ year
local lastyear = r(max) // I'll need this value
local firstyear = r(min)
//
quiet {
    foreach y of local ylist {
        preserve
        tempfile file`y'
        keep if (year == `y')
        local ym1 = `y' - 1
        rename curr_id id`y'
        rename prev_id id`ym1'
        save "`file`y''"
        di "`file`y''"
        restore
    }
}
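// Now, we have the slightly massaged version of Kirk's data.
// The rest is a series of merges, linking ids year by year,
// starting from the first year's file (the loop above restored
// the full data set, so it has to be loaded explicitly).
//
use "`file`firstyear''", clear
foreach y of local ylist {
    if (`y' < `lastyear') {
        local next = `y' + 1
        // 1:m, not 1:1, matters: all new entrants in the using file
        // share a previous id of 0, so the key is not unique there.
        qui merge 1:m id`y' using "`file`next''"
        drop _merge
    }
}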
// Clean up and put the data into person-year format
drop year id0
qui recode id* (0=.)
gen entry_id = _n
reshape long id, i(entry_id) j(year)
drop if missing(id)
bysort entry_id (year) : replace entry_id = id[1]
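A quick sanity check on the result might look like the following. This is only a minimal sketch: it assumes the person-year data built above (variables entry_id, year, and id) are still in memory, and first_obs is a made-up helper name, not part of the code above.

```stata
// Sketch only: assumes entry_id, year, and id exist as created above.
isid entry_id year        // stops with an error if any person-year is duplicated
// first_obs is a hypothetical helper flagging each person's first record
bysort entry_id (year) : gen byte first_obs = (_n == 1)
count if first_obs        // number of distinct persons
xtset entry_id year       // declares the panel and notes whether it has gaps
drop first_obs
```

Each row of the example input with prev_id equal to 0 starts a new person, so the count should match the number of such rows.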
Regards,
Mike Lacy
Dept. of Sociology
Colorado State University
Fort Collins CO 80523-1784