[email protected]
> I have a data set like the following.
>
> zipcode region quantity revenue flag month year
> "37075","11",1,0,"EXP","03","2002
> "37075","04",53,0,"EXP","03","2002"
> "38242","04",14,0,"EXP","03","2002"
> "38242","11",1,0,"EXP","03","2002"
> "70503","06",25,0,"EXP","03","2002"
> "70503","11",1,0,"EXP","03","2002"
>
> I would like to replace the value of region by its value corresponding to
> higher quantity by zipcode. For example, for zipcode "37075", I
> would like to
> replace the region's value of "11" by "04" since region "04" has higher
> quantity. The new data set will look like the following:
>
> "37075","04",1,0,"EXP","03","2002
> "37075","04",53,0,"EXP","03","2002"
> "38242","04",14,0,"EXP","03","2002"
> "38242","04",1,0,"EXP","03","2002"
If you -sort zipcode quantity- then the last value within each
-zipcode- will have the highest quantity, so you can assign
the corresponding -region- to all observations for that
-zipcode-. This is all achievable within one line:
bysort zipcode (quantity) : replace region = region[_N]
However, if any -quantity- is missing, this will mess
up things.
So we need to protect against that:
gen present = !missing(quantity)
bysort zipcode (present quantity): replace region = region[_N]
Taking that more slowly,
missing(quantity) is 1 if quantity is missing and 0 otherwise
!missing(quantity) flips that round: it is 1 is quantity
is non-missing and 0 otherwise.
The -sort- is hierarchical.
* First, we -sort- on -zipcode-.
* Within -zipcode- we sort on -present-. Thus all the missing
values of -quantity- go first.
* Within -present- we sort on -quantity-. Thus the last
value of -quantity- will be the highest non-missing value
of -quantity- for that -zipcode-.
We look across the observation, pick up the value of -region-
and smear it across all the observations for that -zipcode-.
Nick
[email protected]
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/