Another way of doing this, without any new variables:
bysort ID (Cost) : drop if missing(Cost[_N])
Nick
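As a minimal sketch of why the one-liner above works, the following rebuilds the example data from the question and applies it. It assumes Cost is numeric and that the erroneous "x" has already been read in as a numeric missing value (.). The trick relies on numeric missing values sorting last within each ID.

```stata
* Rebuild the example data; the erroneous "x" is entered as missing (.)
clear
input ID Year Cost
1 1 100
1 2 200
1 3 500
1 4 150
1 5 .
2 1 100
2 2 200
2 3 500
2 4 600
2 5 100
end

* Within each ID, sort on Cost. Numeric missings sort last, so
* Cost[_N] is missing exactly when the ID has at least one missing Cost.
bysort ID (Cost) : drop if missing(Cost[_N])

* Restore panel order and inspect what is left
sort ID Year
list, sepby(ID)
```

Because the command re-sorts the data by Cost within ID, the final sort on ID and Year is there only to put the panel back in its usual order.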
Antoine Terracol
I would try something like:
generate tag = (Cost==.)
egen toberemoved = total(tag), by(ID)
drop if toberemoved>0
drop tag toberemoved
You will need to replace the "Cost==." in the first line with a more
general condition that tags your erroneous values (such as
"Cost==. | Cost>9999"). Note that Stata variable names are
case-sensitive, so the name must match the dataset (Cost, not cost).
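The same idea can be sketched with a single helper variable: flag every row of an ID if any of its rows meets the "erroneous" condition, then drop the flagged rows. The toy data, the variable name bad, and the 9999 cutoff (borrowed from the illustration above) are assumptions for the example only, not values from the original dataset.

```stata
* Hypothetical data: ID 1 has a missing Cost, ID 3 an implausibly large one
clear
input ID Year Cost
1 1 100
1 2 .
2 1 100
2 2 200
3 1 100
3 2 50000
end

* The condition evaluates to 0 or 1 on each row; taking the maximum
* within ID gives 1 on every row of an ID that has at least one bad row.
bysort ID : egen bad = max(missing(Cost) | Cost > 9999)

drop if bad
drop bad
list, sepby(ID)
```

Only the condition inside max() needs to change to cover other kinds of erroneous values.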
Murray Lowe
I am working with a large dataset and have discovered that some of the
data have missing or erroneous values. The data are panel data, with
observations per individual over a 5-year period. For example:
ID Year Cost
1 1 100
1 2 200
1 3 500
1 4 150
1 5 x
2 1 100
2 2 200
2 3 500
2 4 600
2 5 100
The problem is this: if an individual has a missing or erroneous value
for a particular year, I want to exclude ALL of their observations from
the dataset. In the example, patient 1 would be removed from the dataset
entirely. How can this be done through an automated process?
Essentially I need code or a method that looks for the anomalous data,
identifies the patient, and then removes all of their observations from
the dataset.
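If entries such as the "x" above caused Cost to be read in as a string variable, one possible first step is to convert it to numeric so that non-numeric entries become missing. The sketch below assumes that situation and uses a shortened, hypothetical version of the example data. Note that destring with the force option turns every non-numeric entry into missing, so it is worth checking what is being converted first.

```stata
* Hypothetical excerpt of the example, with Cost stored as a string
clear
input ID Year str8 Cost
1 4 "150"
1 5 "x"
2 4 "600"
2 5 "100"
end

* Convert to numeric; entries that are not numbers (here "x") become missing
destring Cost, replace force

* Drop every observation of any ID that has at least one missing Cost
bysort ID (Cost) : drop if missing(Cost[_N])
sort ID Year
list, sepby(ID)
```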