st: conditional merging
From: "Ben Hoen" <[email protected]>
To: <[email protected]>
Subject: st: conditional merging
Date: Tue, 6 Nov 2012 14:33:35 -0500
I have two files, sales.dta and condition.dta. sales.dta has two variables
(home_id sale_year), and condition.dta has three variables (home_id
inspection_year condition). The variable inspection_year can take the values
2000-2011 for any home, but for many homes only some years are present (in
many years the home was not inspected). A sample of condition.dta might
therefore look like:
home_id inspection_year condition
50121 2002 4
50121 2006 4
50121 2011 3
50681 2004 2
50681 2010 3
51040 2006 2
51040 2010 2
51040 2011 3
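I would like to populate the sales.dta file with the condition of the parcel
in the inspection_year that is closest to, but not later than, the
sale_year. An inspection in the sale year itself counts, and condition
should be missing when no inspection took place in or before the sale year.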
So, for example, the following dataset would result:
home_id sale_year condition
50121 2007 4
50121 2011 3
50681 2008 2
51040 2003 .
51040 2010 2
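
One way the rule above could be sketched in Stata is to pair every sale with every inspection of the same home and then keep the latest inspection that does not follow the sale. This is only a sketch under stated assumptions, not a tested solution: it assumes the sale-year variable is named sale_year, that each home has at most one sale per year, and that both files sit in the current directory. The variable gap is a hypothetical helper name. Applied to the sample above, it picks the 2010 inspection (condition 2) for home 51040 sold in 2010, because same-year inspections are allowed.

```stata
* Sketch only: assumes sales.dta holds home_id and sale_year, and
* condition.dta holds home_id, inspection_year, and condition.
use sales, clear

* Pair every sale with every inspection of the same home.
* unmatched(master) keeps sales of homes that were never inspected.
joinby home_id using condition, unmatched(master)

* Years between inspection and sale (gap is a hypothetical helper name).
* It stays missing when the inspection follows the sale, or when the
* home has no inspection at all.
generate gap = sale_year - inspection_year if inspection_year <= sale_year

* Within each sale, the smallest non-missing gap sorts first
* (missing values sort last), so the first row is the best match.
bysort home_id sale_year (gap): keep if _n == 1

* If even the best row has a missing gap, no inspection took place
* in or before the sale year.
replace condition = . if missing(gap)

keep home_id sale_year condition
```

Because joinby forms all within-home pairs, the intermediate dataset can grow large when homes have many sales and many inspections. With at most twelve inspection years per home, that should stay manageable.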
I know I am not the first person to have this problem, but could not find
threads on this. Maybe I am using the wrong search terms. Any help would
be greatly appreciated.
(As I wrote this, I realized that one less elegant workaround would be to
fill in each missing year in the condition.dta file, potentially using the
user-written "carryforward" or even imputing the data using, e.g.,
mi impute, and then to match home_id sale_year to home_id
inspection_year.)
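
A rough sketch of that fill-in workaround, using only built-in commands rather than "carryforward", might look like the following. It assumes one inspection per home per year in condition.dta (tsset requires this), a sale-year variable named sale_year, and sale years that fall inside the range of inspection years in the data. The tempfile name filled is hypothetical.

```stata
* Sketch only: fill every year for every home, carry the last
* observed condition forward, then do an exact-year merge.
use condition, clear

* Rename so the key matches the assumed variable name in sales.dta.
rename inspection_year sale_year

* Declare the panel and add a row for every home-year between the
* earliest and latest year present anywhere in the data.
tsset home_id sale_year
tsfill, full

* Carry the most recent inspection forward within each home.
* Years before a home's first inspection stay missing.
bysort home_id (sale_year): replace condition = condition[_n-1] if missing(condition)

tempfile filled
save `filled'

* Exact match on home and year; sales with no match keep a missing condition.
use sales, clear
merge m:1 home_id sale_year using `filled', keep(master match) nogenerate
```

A sale after the last year present in condition.dta would go unmatched and end up with a missing condition, so this version is less general than picking the latest inspection that does not follow the sale.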
Thanks, in advance!
Ben Hoen
Principal Research Associate
Lawrence Berkeley National Laboratory
Office: 845-758-1896
Cell: 718-812-7589
[email protected]
http://emp.lbl.gov/staff/ben-hoen
Visit our publications at:
http://emp.lbl.gov/publications