Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Deleting Duplicates based on criteria
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: Deleting Duplicates based on criteria
Date: Thu, 11 Jul 2013 15:56:45 +0100

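* Score each observation by its charge: 13 = homicide (highest), ..., 1 = other.
* The names below are shorthand; substitute your own dummies in the same order.
gen last = .
tokenize homicide sex robbery assault trafficking burglary larceny motor sales weapon DUI possession other
qui forval j = 1/13 {
    replace last = 14 - `j' if ``j'' == 1
}
* Within each id, sort by score and keep the highest-scoring observation.
bysort id (last) : keep if _n == _N
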
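As a check, here is a minimal sketch of the same idea applied to the example data quoted below. It assumes only the three dummies shown there (robberydummy, burglarydummy, homicidedummy); the variable name casenum and the locals charges and n are illustrative and not part of the original data.

```stata
* Sketch only: three of the 13 dummies, named as in the quoted example.
clear
input str13 casenum id robberydummy burglarydummy homicidedummy
"000000038CFMA" 6 1 0 0
"000000038CFMA" 6 1 0 0
"000000038CFMA" 6 0 1 0
"000000045CFMA" 8 1 0 0
"000000045CFMA" 8 0 0 1
end

* List the dummies from highest to lowest charge (assumed names).
local charges homicidedummy robberydummy burglarydummy
local n : word count `charges'

* Higher score = more serious charge.
gen last = .
tokenize `charges'
qui forval j = 1/`n' {
    replace last = `n' + 1 - `j' if ``j'' == 1
}

* Keep the highest-scoring observation within each id.
bysort id (last) : keep if _n == _N
list casenum id robberydummy burglarydummy homicidedummy last, noobs
```

This should leave one robbery observation for id 6 and the homicide observation for id 8. One point worth checking in real data: if an observation has none of the dummies equal to 1, last stays missing, and because Stata sorts missing values after all nonmissing numbers, that observation would be the one kept. Starting from gen last = 0 instead of gen last = . is one way to avoid that.
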
Nick
[email protected]
On 11 July 2013 15:43, Dirlam, Jonathan C. <[email protected]> wrote:
> Highest charge determined by this order: 1. Homicide, 2. Sex offense, 3. Robbery, 4. Agg Assault, 5. Drug Trafficking, 6. Burglary, 7. Larceny Theft, 8. Motor Vehicle Theft, 9. Drug Sales, 10. Weapon, 11. DUI, 12. Drug Possession, 13. Other
>
> Example of data with 3 of 13 dummies:
> Court case number   id   robberydummy   burglarydummy   homicidedummy
> 000000038CFMA       6    1              0               0
> 000000038CFMA       6    1              0               0
> 000000038CFMA       6    0              1               0
> 000000045CFMA       8    1              0               0
> 000000045CFMA       8    0              0               1
>
> In this example, I want one of the robbery observations for id=6 and the homicide observation for id=8.
> Thanks.
>
> ________________________________________
> From: [email protected] [[email protected]] on behalf of Nick Cox [[email protected]]
> Sent: Thursday, July 11, 2013 10:23 AM
> To: [email protected]
> Subject: Re: st: Deleting Duplicates based on criteria
>
> Yes, but tell us the rules for determining the highest charge and
> give us a realistic example of a block of observations for some court
> case. (Need not be real, just realistic.)
> Nick
> [email protected]
>
>
> On 11 July 2013 15:18, Dirlam, Jonathan C. <[email protected]> wrote:
>> Dear Statalist,
>> I have duplicate observations where the duplicates are the same court case number. I want to eliminate all the observations for a court case except for the observation that has the highest charge (homicide, robbery, etc.) I have 12 dummy variables that capture charges and used the duplicates command to get unique ids for each court case number. Is there a way to write a program that eliminates or keeps duplicates based on criteria you give it (Example, homicidedummy==1) and stops once all but one observation are eliminated?
>> Thanks.
>>
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/