Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: Drop Duplicates while simultaneously eliminating opposite positive and negative values
From
Michael Barker <[email protected]>
To
statalist <[email protected]>
Subject
Re: st: RE: Drop Duplicates while simultaneously eliminating opposite positive and negative values
Date
Mon, 12 Nov 2012 10:07:43 -0500
You could try something like this:
* Number positive and negative duplicate values of price independently:
bys make mpg price: gen dupid = _n
* Mark pairs by absolute value of price
gen absprice = abs(price)
duplicates tag make mpg absprice dupid , gen(dup_pair)
* Look for unpaired duplicates
duplicates tag make mpg price if dup_pair==0 , gen(dup_nonpair)
I'm not sure which of these you want to keep/drop, but I think this would
identify your three different groups:
1. unique: dupid==1 --or-- dup_pair==0 & dup_nonpair==0
2. pos/neg paired: dup_pair==1
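3. additional pos or neg unpaired duplicates: dup_nonpair>=1 & !missing(dup_nonpair)
   (dup_nonpair counts the other unpaired copies, so it exceeds 1 when more
   than two are left over, and it is missing for paired observations)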
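To make the three groups concrete, here is a minimal sketch on a toy dataset. The makes and prices below are invented for illustration and are not taken from the original data; only the four tagging commands come from the suggestion above.

```stata
* Toy data, invented for illustration only
clear
input str8 make mpg price
"AMC"   22  4099
"AMC"   22  4099
"AMC"   22  4099
"AMC"   22 -4099
"Buick" 20  5189
"Buick" 20 -5189
"Dodge" 18  3984
end

* Number positive and negative duplicates of price independently
bysort make mpg price: gen dupid = _n

* The k-th positive and k-th negative copy of the same |price| form a pair
gen absprice = abs(price)
duplicates tag make mpg absprice dupid, gen(dup_pair)

* Among the unpaired observations, count the remaining exact duplicates
duplicates tag make mpg price if dup_pair==0, gen(dup_nonpair)

sort make mpg price dupid
list make mpg price dupid dup_pair dup_nonpair, sepby(make)
```

On this toy data, the listing should show the -4099 row and the first 4099 row with dup_pair==1, the two leftover 4099 rows with dup_pair==0 and dup_nonpair==1, both Buick rows with dup_pair==1, and the Dodge row with both tags equal to 0. Because of the if qualifier, dup_nonpair is missing for the paired rows, which matters in Stata whenever a condition such as dup_nonpair>0 is used, since missing values compare as larger than any number.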
Mike
On Sun, Nov 11, 2012 at 6:42 AM, daniel klein <[email protected]> wrote:
>
> First off, I am sorry for reposting, but the last message got corrupted
> in the archive (it broke into two pieces, omitting the middle part).
> Here is the second (and final) try:
>
> Beatrice,
>
> this is kind of confusing. You say you want to
>
> "[...] keep[ing] one observation if there are duplicates for all make,
> price and mpg."
>
> You then go on, specifying rules for cases in which
>
> "there are 2 duplicated positive price values when there is one
> opposite negative price value for the common make and mpg"
>
> But this is impossible. Given the first step, which eliminates all
> but one positive (or negative) price value in the subgroup defined by
> make and mpg, there can no longer be any cases that have 2 (or more)
> duplicated positive (or negative) price values in terms of make and
> mpg.
>
> From your description it further seems to be arbitrary which
> observations with positive or negative price values to flag. But in
> this case, why worry about positive and negative price values at all,
> when the only difference in these observations seems to be the
> multiplier (-1)?
>
> It is not that I mind playing around with Stata -- on the contrary.
> But it might help us help you if you could comment on these
> statements, elaborate a little bit on the sequence of steps you want to
> take here, and maybe be more specific about your ultimate goal. An
> example dataset containing all the possibilities you have in mind
> would also be nice (only if your first example lacks any possible
> situation you want to tackle).
>
> Best
> Daniel
>
> --
> Dear All,
>
> [...]
> I would like to be able to make a flag variable for both the opposite
> positive and negative price values for the common make and mpg, while
> only keeping one observation if there are duplicates for all make,
> price and mpg.
>
> At the same time, if there are 2 duplicated positive price values when
> there is one opposite negative price value for the common make and
> mpg, I would like to flag one positive price value observation and the
> opposite negative price value counterpart. Vice versa would apply if
> there are 2 duplicated negative price values and one opposite positive
> price value, I would want to flag one negative price value observation
> and the opposite positive price value observation.
>
> Expanding on this in the general case,
> [...]
>
> Thanks a lot!
>
> Beatrice
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
--
Michael Barker
Department of Economics
Georgetown University
Washington, DC 20057