On Nov 22, 2008, at 4:12 PM, Kir1 wrote:
Perhaps the subject line is not the best one to address my problem,
but here it is:
My original data set is of the 'long' format like this
gvkey   year   revenue
1000    1999   111
1000    2000   222
.
.
1001    1999   888
From this I want to retrieve only one value of revenue per 'gvkey',
namely the one corresponding to a year value that I look up in a
different database. So if in the other database my gvkey is 1000 and
lookupyear is 2000, then only 222 should be in the revenue cell of
the final database.
The simplest way to handle this is to merge the two datasets by gvkey
and year, and then just keep those observations for which _merge==3.
For example, assuming that the variables in your other dataset are
named gvkey and lookupyear, you would run:
gen lookupyear = year
merge gvkey lookupyear using <other dataset name>, sort
keep if _merge==3
This keeps only the observations whose year equals the lookup year
for that gvkey, which is what you want.
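For anyone reading this on Stata 11 or later, where merge takes an
explicit match type, the same idea can be sketched roughly as below.
This is only a minimal sketch under some assumptions: the lookup
dataset is built inline and held in a tempfile called lookup (a
made-up name), the lookup year for gvkey 1001 is invented for
illustration, and both datasets are assumed to be unique on the merge
keys.

```stata
* Hypothetical lookup dataset: one lookupyear per gvkey
* (the value for gvkey 1001 is made up for this example)
clear
input gvkey lookupyear
1000 2000
1001 1999
end
tempfile lookup
save `lookup'

* Long-format revenue data, as in the question
clear
input gvkey year revenue
1000 1999 111
1000 2000 222
1001 1999 888
end

* Give year the same name as the key in the lookup dataset
gen lookupyear = year

* keep(match) retains only matched observations (the old _merge==3);
* nogenerate suppresses creation of the _merge variable
merge 1:1 gvkey lookupyear using `lookup', keep(match) nogenerate

* One observation per gvkey should remain
assert _N == 2
assert revenue == 222 if gvkey == 1000
list gvkey year revenue
```

The 1:1 match type makes Stata stop with an error if either dataset
has duplicate gvkey/lookupyear pairs, which is a useful check before
trusting the result.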
-- Phil