st: RE: Drop Duplicates while simultaneously eliminating opposite positive and negative values
From: Beatrice Benavidez <[email protected]>
To: [email protected]
Date: Sun, 11 Nov 2012 13:38:04 +0400

Dear All,
I have an interesting problem with the following dataset:

make          price   mpg
VW Diesel      5397    41
BMW 320i       9735    25
Datsun 510     5079    24
Audi 5000      9690    17
BMW 320i      -9735    25
BMW 320i       9375    25
BMW 320i       9375    25
BMW 320i       9735    25
BMW 320i       9735    25
VW Diesel     -5397    41
BMW 320i       9735    25

The dataset contains opposite positive and negative price values for
the same make and mpg (such as VW Diesel price=5397 mpg=41 and VW
Diesel price=-5397 mpg=41), and it also contains exact duplicates on
make, price and mpg (BMW 320i price=9375 mpg=25 appears twice).
The two can also occur together: opposite-signed prices for the same
make and mpg can appear among duplicates (BMW 320i price=9735 mpg=25
appears 4 times and BMW 320i price=-9735 mpg=25 appears once).
I know how to identify and flag duplicate observations, based on the
FAQ at
http://www.stata.com/support/faqs/data-management/duplicate-observations/
I would like to create a flag variable that marks both observations
of an opposite positive/negative price pair for the same make and
mpg, while keeping only one observation when there are duplicates on
make, price and mpg.
At the same time, if there are 2 duplicated positive price values and
one opposite negative price value for the same make and mpg, I would
like to flag one of the positive observations and its negative
counterpart. The reverse applies as well: if there are 2 duplicated
negative price values and one opposite positive price value, I would
like to flag one of the negative observations and the positive
observation.
In the general case, if there are more duplicated positive price
values than opposite negative price values for the same make and mpg
(whether the negatives are duplicated or not), I would like to flag
all but one of the positive observations and all of the opposite
negative observations. The reverse applies if there are more
duplicated negative price values than opposite positive price values
for the same make and mpg.
In other words, whichever sign has the larger number of observations,
I would like to flag all but ONE of the observations with that sign.
How should I proceed if I want a single flagging procedure that
handles all three of these situations at once without missing
anything?
Any help will be appreciated!
Thanks a lot!
Beatrice