Another way of doing this, without any new
variables:
bysort ID (Cost) : drop if missing(Cost[_N])
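
To see why this works, here is a minimal sketch on the example data
from the original question. It assumes Cost is a numeric variable and
that the "x" in year 5 is stored as a numeric missing value (.). If
Cost were a string variable, it would have to be converted first. The
variable name bad and the 9999 cutoff are only illustrative (the
cutoff is borrowed from the example quoted below).

```stata
* Toy data from the original question; the "x" is entered as missing (.)
clear
input ID Year Cost
1 1 100
1 2 200
1 3 500
1 4 150
1 5 .
2 1 100
2 2 200
2 3 500
2 4 600
2 5 100
end

* Numeric missing sorts after every nonmissing number, so within each ID
* the last observation after sorting on Cost is missing if any Cost is.
bysort ID (Cost) : drop if missing(Cost[_N])

* Restore panel order and inspect: only ID 2 remains.
sort ID Year
list, sepby(ID)

* A more general rule needs a temporary indicator (hypothetical name: bad).
* max() of a 0/1 expression is 1 if any observation in the ID is flagged.
* On the toy data nothing further is dropped at this point.
bysort ID : egen byte bad = max(missing(Cost) | Cost > 9999)
drop if bad
drop bad
```

The one-liner only catches missing values. The second form trades the
"no new variables" property for the freedom to flag any condition.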
Nick
Antoine Terracol
> I would try something like :
>
> generate tag = missing(cost)
> egen toberemoved = total(tag), by(ID)
> drop if toberemoved > 0
> drop tag toberemoved
>
>
> You will need to replace the "cost==." in the first line with a more
> general way to tag your erroneous values (such as "cost==. |
> cost>9999").
Murray Lowe
> > I am working with a large dataset and have discovered that some of
> > the data are missing values or have erroneous values. The data is
> > panel data with observations per individual over a 5 year period.
> > For example:
> >
> > ID Year Cost
> >
> > 1 1 100
> > 1 2 200
> > 1 3 500
> > 1 4 150
> > 1 5 x
> > 2 1 100
> > 2 2 200
> > 2 3 500
> > 2 4 600
> > 2 5 100
> >
> > The problem is this: If an individual has a missing / erroneous
> > value for a particular year, I want to exclude ALL of their
> > observations from the dataset. In the example patient 1 would be
> > removed from the dataset entirely. How can this be done through an
> > automated-type process? Essentially I need a code / method that
> > looks for the anomalous data; identifies the patient and then
> > removes all of their observations from the dataset.