Re: st: RE: Drop Duplicates while simultaneously eliminating opposite positive and negative values
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: RE: Drop Duplicates while simultaneously eliminating opposite positive and negative values
Date: Sun, 11 Nov 2012 10:24:43 +0000

I didn't try to understand all of what you want, but this might help.
Positive and negative prices will sort to opposite ends of blocks of
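observations, so you can flag and count them:

    * flag observations equal to the largest price in each block, if positive
    bysort make mpg (price) : gen pos = price == price[_N] & price > 0
    egen npos = total(pos), by(make mpg)
    * flag observations equal to the smallest price in each block, if negative
    bysort make mpg (price) : gen neg = price == price[1] & price < 0
    egen nneg = total(neg), by(make mpg)
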
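
To see what these counts look like, here is a minimal sketch that enters the example data from the question and runs the same commands. The last three commands are not part of the suggestion above: they are an assumed variation that counts signs within blocks sharing the same absolute price, and the names absprice, npos2 and nneg2 are made up for illustration.

```stata
clear
input str12 make price mpg
"VW Diesel"   5397 41
"BMW 320i"    9735 25
"Datsun 510"  5079 24
"Audi 5000"   9690 17
"BMW 320i"   -9735 25
"BMW 320i"    9375 25
"BMW 320i"    9375 25
"BMW 320i"    9735 25
"BMW 320i"    9735 25
"VW Diesel"  -5397 41
"BMW 320i"    9735 25
end

* commands from the reply: flag and count the extreme prices in each block
bysort make mpg (price) : gen pos = price == price[_N] & price > 0
egen npos = total(pos), by(make mpg)
bysort make mpg (price) : gen neg = price == price[1] & price < 0
egen nneg = total(neg), by(make mpg)

* BMW 320i: npos = 4 (four copies of 9735), nneg = 1 (one copy of -9735)
list make mpg price pos neg npos nneg, sepby(make mpg)

* assumed variation (hypothetical names): count positive and negative
* prices within blocks that also share the same absolute price
gen absprice = abs(price)
egen npos2 = total(price > 0), by(make mpg absprice)
egen nneg2 = total(price < 0), by(make mpg absprice)
```

The first set of counts only compares the largest and smallest price in each block of make and mpg, so it does not check that they are equal in magnitude; the variation is one possible way to check that. Note also that Stata treats missing as larger than any number, so an observation with missing price would get pos equal to 1; the example data have no missing prices.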
Nick
On Sun, Nov 11, 2012 at 9:38 AM, Beatrice Benavidez
<[email protected]> wrote:
> I have this interesting problem where I would have the following dataset -
>
> make        price  mpg
> VW Diesel    5397   41
> BMW 320i     9735   25
> Datsun 510   5079   24
> Audi 5000    9690   17
> BMW 320i    -9735   25
> BMW 320i     9375   25
> BMW 320i     9375   25
> BMW 320i     9735   25
> BMW 320i     9735   25
> VW Diesel   -5397   41
> BMW 320i     9735   25
>
> The dataset has opposite positive and negative price values for the
> common make and mpg (such as VW Diesel Price=5397 mpg=41 & VW Diesel
> Price=-5397 mpg=41) while at the same time there are duplicates for
> all make, price and mpg (BMW 320i Price=9375 mpg=25 appearing twice).
>
> The opposite positive and negative price values for the common make
> and mpg can also happen within duplicates based on all make, price and
> mpg (BMW 320i Price=9735 mpg=25 appearing 4 times & BMW 320i
> Price=-9735 mpg=25 appearing once).
>
> I know how to proceed with the identification and flagging of
> duplicate observations based on
> http://www.stata.com/support/faqs/data-management/duplicate-observations/
>
> I would like to be able to make a flag variable for both the opposite
> positive and negative price values for the common make and mpg, while
> only keeping one observation if there are duplicates for all make,
> price and mpg.
>
> At the same time, if there are 2 duplicated positive price values and
> one opposite negative price value for the common make and mpg, I would
> like to flag one positive price value observation and its opposite
> negative price value counterpart. The reverse applies if there are 2
> duplicated negative price values and one opposite positive price
> value: I would want to flag one negative price value observation and
> the opposite positive price value observation.
>
> Expanding on this in the general case, if there are more duplicated
> positive price values than there are opposite negative price values
> for the common make and mpg (duplicated or not), I would like to flag
> all but one of the positive price value observations and all opposite
> negative price value observation(s) for the common make and mpg. The
> reverse would apply if there are more duplicated negative price values
> than there are opposite positive price values for the common make and
> mpg.
>
> I would like to flag all but ONE of either positive or negative price
> value observations if the bigger number of duplicated sign groups are
> the positive or negative price values respectively.
>
> How should I proceed if I want to execute a flagging procedure for all
> these three different situations simultaneously without missing
> anything out?
>
> Any help will be appreciated!
>
> Thanks a lot!
>
>
> Beatrice
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/