-duplicates- is in part designed for this
very purpose!
Nick
[email protected]
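
A minimal sketch of how -duplicates- might be applied to the example data from the original question. The variable name ndup is made up for illustration, and the dates are stored as strings only to keep the example short.

```stata
* Rebuild the made-up example data from the original question
clear
input patid str9 lufudate fvc fev1
1 "10jan2002" 5.1 4.8
1 "10jan2002" 5.1 4.3
2 "09jan2002" 4.3 3.7
2 "09jan2002" 4.3 3.7
3 "08jan2002" 2.9 2.6
3 "08jan2002" 2.9 6.2
end

* ndup (hypothetical name) = number of OTHER observations that are
* identical on every listed variable; 0 means no exact twin exists
duplicates tag patid lufudate fvc fev1, generate(ndup)

* Records without a twin are the ones whose two entries disagree
list patid lufudate fvc fev1 if ndup == 0, sepby(patid)
```

With these six rows, the two entries for patients 1 and 3 disagree on fev1, so those four records are listed. With about 20 parameters the variable list would simply be longer.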
Philipp Rehm
> You could do this:
>
> collapse (sd) fvc fev1, by(patid)
>
> /* this calculates the standard deviation of each variable of
> interest for each patient. Clearly, you want the standard
> deviation to be zero */
>
> drop if fvc==0 & fev1==0
>
> /* this drops all observations with identical entries and leaves only
> patients in the data set with mistyped data (assuming, of course,
> that the data have not been mistyped twice) */
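
A hedged variant of the -collapse- idea, run on the example data from the original question. It assumes each measurement is identified by patid and lufudate together (so a patient with several measurement dates is not pooled), and the names sd_fvc and sd_fev1 are made up for illustration. Note that -collapse- replaces the data in memory.

```stata
* Rebuild the made-up example data from the original question
clear
input patid str9 lufudate fvc fev1
1 "10jan2002" 5.1 4.8
1 "10jan2002" 5.1 4.3
2 "09jan2002" 4.3 3.7
2 "09jan2002" 4.3 3.7
3 "08jan2002" 2.9 2.6
3 "08jan2002" 2.9 6.2
end

* One row per measurement, holding the SD across the two entries
collapse (sd) sd_fvc=fvc sd_fev1=fev1, by(patid lufudate)

* Keep measurements where ANY variable differs between entries.
* A measurement entered only once has a missing SD, and missing
* counts as greater than zero in Stata, so it is kept as well.
keep if sd_fvc > 0 | sd_fev1 > 0

list, clean
```

Using | rather than & in the condition matters here: a measurement is suspect as soon as one variable disagrees, not only when all of them do.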
Philip Ryan
> > Separate your data set into two and use Tom Steichen's -cf3-
> > program, available using -ssc-
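
A rough sketch of the split-and-compare idea. It uses official Stata's -cf- rather than -cf3-, whose syntax is not shown in this thread, so treat it as a stand-in and not as the -cf3- call itself. The variable name entry and the tempfile name second are made up for illustration, and the sketch assumes every measurement was entered exactly twice.

```stata
* Rebuild the made-up example data from the original question
clear
input patid str9 lufudate fvc fev1
1 "10jan2002" 5.1 4.8
1 "10jan2002" 5.1 4.3
2 "09jan2002" 4.3 3.7
2 "09jan2002" 4.3 3.7
3 "08jan2002" 2.9 2.6
3 "08jan2002" 2.9 6.2
end

* Number the entries within each measurement (which copy counts
* as "first" is arbitrary)
bysort patid lufudate: generate entry = _n

* Save the second entries to a temporary file
tempfile second
preserve
keep if entry == 2
drop entry
sort patid lufudate
save `second'
restore

* Keep the first entries in memory, in the same order
keep if entry == 1
drop entry
sort patid lufudate

* Compare observation by observation; -cf- exits with an error
* when it finds differences, so -capture noisily- lets a do-file
* continue while still showing the report
capture noisily cf _all using `second', verbose
```

-cf- matches observations by position, which is why both halves are sorted the same way and must contain the same number of observations.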
Seb Buechte
> >>We have been entering lung function data over the course of the last
> >>few months. To be able to identify mistyped data we have entered all
> >>data twice. So for a patient who received a lung function measurement
> >>on 12 Jan 2002 I have two observations holding all the data gained
> >>from that measurement.
> >>
> >>I am now wondering whether there is an easy way to compare those two
> >>observations? The data could look like this
> >>
> >>patid lufudate fvc fev1
> >>1 10jan2002 5.1 4.8
> >>1 10jan2002 5.1 4.3
> >>2 09jan2002 4.3 3.7
> >>2 09jan2002 4.3 3.7
> >>3 08jan2002 2.9 2.6
> >>3 08jan2002 2.9 6.2
> >>
> >>(This is artificial data that I just made up. The "real" data contain
> >>approx. 20 parameters)
> >>
> >>So is there any program out there that would help me with this task?
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/