Dear Statalist,
I have some repeated observations that I need to drop. I have defined
a unique ID variable. There are two other variables "value" and
"month" for these IDs.
Example data:

ID   Value   month
1    25      12
1    22      8
2    30      9
3    28      8
3    24      6
Only a few IDs are repeated in the actual data.
I need to retain only one observation per ID. If I sort on ID and then
run "drop if ID==ID[_n-1]", the order within each ID is arbitrary, so
any of the repeated observations could be the one that survives.
Instead, I want to keep, for each ID, the observation in which month
takes its highest value. Something like
"drop min(month, month[_n-1]) if ID==ID[_n-1]" is what I am looking
for. However, I can't combine subscripts such as [_n-1] with the min()
function in a drop command this way. Any thoughts?
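
To make the goal concrete, here is a minimal sketch of the result I am after, using the example data above. It assumes that sorting by month within ID and keeping the last observation of each group is acceptable; the variable names are those of the example, and nothing here is tested on the real data. If two observations of an ID share the highest month, one of them is kept arbitrarily, and because Stata sorts missing values last, an observation with a missing month would be the one kept.

```stata
* Sketch only: recreate the example data from this post
clear
input ID Value month
1 25 12
1 22 8
2 30 9
3 28 8
3 24 6
end

* Sort by ID, and by month within ID, so that the observation
* with the highest month is last in each ID group; keep only that one
bysort ID (month): keep if _n == _N

list, noobs
```

On the example data this should leave one observation per ID: (1, 25, 12), (2, 30, 9) and (3, 28, 8).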
Regards,
Rijo.