Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: Dropping Duplicates that Aren't Exactly Duplicates
From: Lisa Chavez <[email protected]>
To: [email protected]
Subject: Re: st: RE: Dropping Duplicates that Aren't Exactly Duplicates
Date: Wed, 02 Nov 2011 11:50:23 -0700

Thank you for your reply. I tried that before, but I ended up dropping
rows that I didn't want to drop. For example, say a person has three
arrest events with four violations each. The first two arrest events
have exactly the same violations, and the third arrest has two
violations, but ONE of the violations in the third arrest is the same
as a violation in one of the first two arrest events. The result is
that I dropped a single violation from the third arrest event (and I
wanted the third arrest untouched).
--Lisa
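
To make the distinction concrete, here is a minimal sketch of comparing whole arrest events rather than single rows. It is written in Python, not Stata, purely to illustrate the logic; the function name drop_repeated_arrests, the id "X1", and the short violation labels are all made up for the example. It assumes an event is identified by (id, arrdate) and that a later event counts as a duplicate only when its full set of violations equals that of an earlier event for the same person.

```python
from collections import defaultdict
from datetime import date


def drop_repeated_arrests(rows):
    """Drop every row of an arrest event whose complete set of violations
    matches an earlier event for the same id. rows are (id, arrdate,
    violation) tuples. Hypothetical helper, for illustration only."""
    # Collect the set of violations recorded for each (id, arrdate) event.
    events = defaultdict(set)
    for pid, arrdate, violation in rows:
        events[(pid, arrdate)].add(violation)

    # Visit events in (id, date) order, so the earliest copy is kept.
    seen = set()
    dropped = set()
    for pid, arrdate in sorted(events):
        signature = (pid, frozenset(events[(pid, arrdate)]))
        if signature in seen:
            dropped.add((pid, arrdate))
        else:
            seen.add(signature)

    # Keep only rows that belong to events not marked as repeats.
    return [r for r in rows if (r[0], r[1]) not in dropped]


# Made-up data: the second event repeats the first exactly; the third
# event shares only ONE violation with the earlier events.
rows = [
    ("X1", date(2004, 3, 11), "cocaine"),
    ("X1", date(2004, 3, 11), "traffic"),
    ("X1", date(2004, 3, 11), "dui"),
    ("X1", date(2005, 1, 13), "cocaine"),
    ("X1", date(2005, 1, 13), "traffic"),
    ("X1", date(2005, 1, 13), "dui"),
    ("X1", date(2009, 2, 27), "cocaine"),
    ("X1", date(2009, 2, 27), "hallucinogen"),
]

kept = drop_repeated_arrests(rows)
for row in kept:
    print(row)
```

Under these assumptions the 2005 event is removed in full and the 2009 event keeps both of its rows, including the one violation it shares with the earlier arrests. A row-level comparison on id and violation alone would instead remove that shared row from the 2009 event, which is the problem described above.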
On 11/2/2011 11:32 AM, Nick Cox wrote:
In general, you are in charge. You get to define what counts as a duplicate you want to drop.
Also, you can drop duplicates using any syntax you want that does the job.
The -duplicates- command is the way of dealing with duplicates with which I am most familiar. I think you want to
duplicates drop id violation, force
Nick
[email protected]
Lisa Chavez
I have data in long file format that has three variables: id, arrdate
and violation.
Below is an example of a person who has three arrest events (I have
separated them with lines).
Looking at the first two arrest dates (11mar2004 and 13jan2005) you see
that each arrest has three violations and they are exactly the same.
I have lots of examples like this one; in all instances I want to drop
the last arrest event where this duplication occurs.
In the case below, I would want to drop all rows associated with the
13jan2005 arrest event.
I'd appreciate any help you can offer.
Thanks!
Lisa
+--------------------------------------------------------------------------------+
id        arrdate    violation
----------------------------------------------------------------------------------
A0000518  11mar2004  Cocaine-Possess Possess Cocaine
A0000518  11mar2004  Nonmoving Traffic Viol Drive While Lic Susp Habitual Offender
A0000518  11mar2004  Traffic Offense Dui Alcohol Or Drugs 1St Off
----------------------------------------------------------------------------------
A0000518  13jan2005  Cocaine-Possess Possess Cocaine
A0000518  13jan2005  Nonmoving Traffic Viol Drive While Lic Susp Habitual Offender
A0000518  13jan2005  Traffic Offense Dui Alcohol Or Drugs 1St Off
----------------------------------------------------------------------------------
A0000518  27feb2009  Hallucinogen-Sell Schedule Ii
+--------------------------------------------------------------------------------+
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/