Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: Deleting Duplicates based on criteria

From	Katie Farrin <[email protected]>
To	[email protected]
Subject	Re: st: Deleting Duplicates based on criteria
Date	Thu, 11 Jul 2013 10:56:45 -0400

Could you change your dummies into a single variable, with the most severe
charge taking a value of 1 and less severe charges taking higher values?
Then you could sort by that variable within each case and run:

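* within each case, put the most severe charge (lowest value) first
sort caseid charge

* dup is 0 for a case with one observation, otherwise 1, 2, ... within the case
quietly by caseid:  gen dup = cond(_N==1,0,_n)

* keep only the first (most severe) observation per case
drop if dup>1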

I'm sure there are more efficient ways to do this, but this is just a suggestion.
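
For illustration, the recoding step could be sketched as below. This is only a sketch under assumptions: it uses the three dummies shown in the example data (homicidedummy, robberydummy, burglarydummy), assumes the court case number is stored in a string variable called caseid, and takes the ranks 1, 3 and 6 from the ordering given in the thread. The remaining ten dummies would need analogous lines, and their variable names are not given in the thread.

```stata
* Rebuild the example data from the thread (caseid is an assumed variable name)
clear
input str13 caseid id robberydummy burglarydummy homicidedummy
"000000038CFMA" 6 1 0 0
"000000038CFMA" 6 1 0 0
"000000038CFMA" 6 0 1 0
"000000045CFMA" 8 1 0 0
"000000045CFMA" 8 0 0 1
end

* Collapse the dummies into one rank variable: 1 = most severe.
* Ranks follow the ordering in the thread (1 homicide, 3 robbery, 6 burglary).
gen byte charge = .
replace charge = 1 if homicidedummy == 1
replace charge = 3 if robberydummy  == 1 & missing(charge)
replace charge = 6 if burglarydummy == 1 & missing(charge)

* Sort by charge within caseid and keep the first (most severe) observation.
* Missing values sort last, so an observation with no charge flagged is only
* kept when its case has no flagged observation at all.
bysort caseid (charge): keep if _n == 1

list caseid id charge
```

On the example data this should leave one robbery observation for id 6 and the homicide observation for id 8. The bysort line does the sort and the within-case selection in one step, so no separate dup variable is needed.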

Katie

On Thu, Jul 11, 2013 at 10:43 AM, Dirlam, Jonathan C.
<[email protected]> wrote:
> Highest charge determined by this order: 1. Homicide, 2. Sex offense, 3. Robbery, 4. Agg Assault, 5. Drug Trafficking, 6. Burglary, 7. Larceny Theft, 8. Motor Vehicle Theft, 9. Drug Sales, 10. Weapon, 11. DUI, 12. Drug Possession, 13. Other
>
> Example of data with 3 of 13 dummies:
> Court case number  id  robberydummy  burglarydummy  homicidedummy
> 000000038CFMA       6             1              0              0
> 000000038CFMA       6             1              0              0
> 000000038CFMA       6             0              1              0
> 000000045CFMA       8             1              0              0
> 000000045CFMA       8             0              0              1
>
> In this example, I want one of the robbery observations for id=6 and the homicide observation for id=8.
> Thanks.
>
> ________________________________________
> From: [email protected] [[email protected]] on behalf of Nick Cox [[email protected]]
> Sent: Thursday, July 11, 2013 10:23 AM
> To: [email protected]
> Subject: Re: st: Deleting Duplicates based on criteria
>
> Yes,  but tell us the rules for determining the highest charge and
> give us a realistic example of a block of observations for some court
> case. (Need not be real, just realistic.)
> Nick
> [email protected]
>
>
> On 11 July 2013 15:18, Dirlam, Jonathan C. <[email protected]> wrote:
>> Dear Statalist,
>> I have duplicate observations where the duplicates are the same court case number. I want to eliminate all the observations for a court case except for the observation that has the highest charge (homicide, robbery, etc.) I have 12 dummy variables that capture charges and used the duplicates command to get unique ids for each court case number. Is there a way to write a program that eliminates or keeps duplicates based on criteria you give it (Example, homicidedummy==1) and stops once all but one observation are eliminated?
>> Thanks.
>>
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/

Follow-Ups:
- RE: st: Deleting Duplicates based on criteria
  - From: "Dirlam, Jonathan C." <[email protected]>

References:
- st: Deleting Duplicates based on criteria
  - From: "Dirlam, Jonathan C." <[email protected]>
- Re: st: Deleting Duplicates based on criteria
  - From: Nick Cox <[email protected]>
- RE: st: Deleting Duplicates based on criteria
  - From: "Dirlam, Jonathan C." <[email protected]>
