Hi all,
I have the following data (a subset of the original dataset, which has
about 6000 ids with an average of 6 years each):
id year
4 1987
4 1988
4 1989
4 1990
4 1992
4 1993
4 1994
9 1987
9 1988
9 1989
9 1990
9 1992
9 1993
9 1994
Within each company (id), I need to keep a year only if it is more than 2
years after the previously kept year. In an earlier post
http://www.stata.com/statalist/archive/2006-02/msg00952.html
David Harrison suggested the following, which works fine with one id:
local refdate = year[1]
gen byte dropflag = 0
forvalues i = 2/`=_N' {
    if year[`i']-`refdate'<=2 {
        replace dropflag = 1 in `i'
    }
    else {
        local refdate = year[`i']
    }
}
drop if dropflag
drop dropflag
However, I am not able to do it by id. What I want is this:
id year
4 1987
4 1990
4 1993
9 1987
9 1990
9 1993
I tried using the above code with levelsof and foreach, as shown below, but
had no luck.
*******************
clear
input id year
4 1987
4 1988
4 1989
4 1990
4 1992
4 1993
4 1994
9 1986
9 1987
9 1988
9 1989
9 1990
9 1992
9 1993
9 1994
9 1995
9 1996
9 1997
end
levelsof id, local(levels)
foreach l of local levels {
    qui sum year
    local refdate = year[1]
    gen byte dropflag = 0
    forvalues i = 2/`=_N' {
        if year[`i']-`refdate'<=2 {
            replace dropflag = 1 in `i'
        }
        else {
            local refdate = year[`i']
        }
    }
    drop if dropflag
    drop dropflag
}
list, table clean noobs
*******************
which gives me
id year
4 1987
4 1990
4 1993
9 1996
I then tried a variation of the above:
**Snip**
levelsof id, local(levels)
foreach l of local levels {
    qui sum year if id==`l'
    local refdate = r(min)
    local m = r(N)
    gen byte dropflag = 0
    forvalues i = 2/`m' {
        if year[`i']-`refdate'<=2 {
            replace dropflag = 1 in `i'
        }
        else {
            local refdate = year[`i']
        }
    }
    drop if dropflag & id==`l'
    drop dropflag
}
**Snip**
That gives me
id year
4 1987
4 1990
4 1993
9 1995
9 1996
9 1997
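One direction that might avoid the explicit loop over observations is sketched below. It is only a sketch, not a verified solution: it assumes id and year are numeric as in the input block above, that sorting by id and year is acceptable, and it relies on replace working through observations in order, so that refyear[_n-1] already holds the updated value. The variable name refyear is made up for illustration.

```stata
* Sketch only: assumes numeric id and year, as in the input block above
sort id year

* refyear (hypothetical name) holds the most recent kept year within each id
by id: gen int refyear = year if _n == 1

* replace proceeds observation by observation, so refyear[_n-1] is already
* updated when the current observation is evaluated
by id: replace refyear = cond(year - refyear[_n-1] > 2, year, refyear[_n-1]) if _n > 1

* keep the first observation of each id and every point where refyear changes
by id: keep if _n == 1 | refyear != refyear[_n-1]
drop refyear

list, table clean noobs
```

If this behaves as intended, the input above should reduce to 1987, 1990, 1993 for id 4 and 1986, 1989, 1992, 1995 for id 9.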
Any pointers would be much appreciated.
Thanks,
rajesh