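Your most common values can be obtained by

bysort code1 code2 : gen count = - _N
bysort code1 (count code2) : gen mode = code2[1]

NB the minus sign in the first line: it makes the most frequent code2
sort first within each code1, with ties going to the lower code2.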
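
For the whole procedure, including the merge, something like the following do-file sketch may help. It enters the example data from the quoted message with -input-, keeps one observation per code1 rather than generating mode, and uses the -merge 1:1- syntax, which assumes Stata 11 or later (older releases would need the earlier -merge- syntax on data sorted by code1). The tempfile names d1 and modes are my own, and the sketch assumes pchange is constant within each code1-code2 pair, as it is in the example. It should reproduce the two-row table asked for.

```stata
* dataset1 from the quoted message, held in a temporary file
clear
input code1 output
1111 100
5555 340
end
tempfile d1
save `d1'

* dataset2 from the quoted message
clear
input code2 pchange code1
3431  .5 1111
3431  .5 1111
3450 -.5 1111
3451  .7 1111
9903  .4 5555
9945  .1 5555
9903  .4 5555
9905 -.6 5555
9945  .1 5555
end

* count is minus the frequency of each code1-code2 pair
bysort code1 code2 : gen count = - _N

* within code1 the most frequent code2 now sorts first;
* ties are broken by the lower code2
bysort code1 (count code2) : keep if _n == 1
drop count
tempfile modes
save `modes'

* dataset1 is the master; merge syntax assumes Stata 11 or later
use `d1', clear
merge 1:1 code1 using `modes'
list code1 code2 output pchange
```

Because the tempfiles are local macros, this needs to be run as a do-file in one go rather than typed line by line.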
Nick
Jason Hwang wrote:
> I didn't describe very well last time what I wanted to do. Let me try
> again.
>
> I have two datasets I'm trying to merge of the following form.
>
> dataset1:
>
> code1 output
> 1111  100
> 5555  340
>
> dataset2:
>
> code2 pchange code1
> 3431  .5      1111
> 3431  .5      1111
> 3450  -.5     1111
> 3451  .7      1111
> 9903  .4      5555
> 9945  .1      5555
> 9903  .4      5555
> 9905  -.6     5555
> 9945  .1      5555
>
> I'm trying to use dataset1 as the original (master) and merge into it
> dataset2. Problem: each code1 maps to many code2s. So here's what I
> would like to do: for each code1, find a code2 which corresponds to
> it with the greatest frequency. So for code1, 1111, I want 3431. For
> 5555, both 9903 and 9945 occur twice. In this case, I'll just take
> whichever shows up first in the sorted list; i.e. 9903.
>
> The final output I'm looking for would be:
>
> code1 code2   output  pchange
> 1111  3431    100     .5
> 5555  9903    340     .4
>
> Could someone show me how to write code for this procedure? Thank you
> very much.