Hi there,
I didn't describe very well last time what I wanted to do. Let me try
again.
I have two datasets of the following form that I'm trying to merge.
dataset1:
code1  output
1111   100
5555   340
dataset2:
code2  pchange  code1
3431    .5      1111
3431    .5      1111
3450   -.5      1111
3451    .7      1111
9903    .4      5555
9945    .1      5555
9903    .4      5555
9905   -.6      5555
9945    .1      5555
I'm trying to use dataset1 as the original (master) and merge dataset2
into it. Problem: each code1 maps to many code2s. So here's what I would
like to do: for each code1, find the code2 that occurs with it most
frequently. So for code1 = 1111, I want 3431, which occurs twice (3450
and 3451 occur once each). For 5555, both 9903 and 9945 occur twice. In
this case, I'll just take whichever shows up first in the sorted list,
i.e. 9903.
The final output I'm looking for would be:
code1  code2  output  pchange
1111   3431   100     .5
5555   9903   340     .4
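To pin down the rule, here is a minimal sketch of the logic in plain Python rather than Stata (Stata code is what I'm actually after). The names `modal_code2` and `merge_modal` are made up for illustration. The sketch assumes that pchange is the same on every row of a given code1/code2 pair, as it is in the sample data, and that ties go to the smallest code2.

```python
from collections import Counter

# Sample data copied from the tables above.
dataset1 = [(1111, 100), (5555, 340)]  # rows of (code1, output)
dataset2 = [                           # rows of (code2, pchange, code1)
    (3431, 0.5, 1111),
    (3431, 0.5, 1111),
    (3450, -0.5, 1111),
    (3451, 0.7, 1111),
    (9903, 0.4, 5555),
    (9945, 0.1, 5555),
    (9903, 0.4, 5555),
    (9905, -0.6, 5555),
    (9945, 0.1, 5555),
]


def modal_code2(rows):
    """Map each code1 to the (code2, pchange) of its most frequent code2."""
    # How often each (code1, code2) pair occurs.
    counts = Counter((code1, code2) for code2, _, code1 in rows)
    # Assumption: pchange is constant within a (code1, code2) pair.
    pchange_of = {(code1, code2): pchange for code2, pchange, code1 in rows}

    best = {}
    # Order by code1, then highest frequency first, then smallest code2,
    # so the first pair seen for each code1 is the one to keep.
    ordered = sorted(counts, key=lambda pair: (pair[0], -counts[pair], pair[1]))
    for code1, code2 in ordered:
        best.setdefault(code1, (code2, pchange_of[(code1, code2)]))
    return best


def merge_modal(master, rows):
    """Attach the modal code2 and its pchange to every master row."""
    best = modal_code2(rows)
    merged = []
    for code1, output in master:
        # Master rows without a match are kept, with missing values.
        code2, pchange = best.get(code1, (None, None))
        merged.append((code1, code2, output, pchange))
    return merged


if __name__ == "__main__":
    for row in merge_modal(dataset1, dataset2):
        print(row)
```

On the sample data this gives one row per code1, in the column order code1, code2, output, pchange, matching the table above.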
Could someone show me how to write code for this procedure? Thank you
very much.
Jason