

st: Random merging

From   <[email protected]>
To   <[email protected]>
Subject   st: Random merging
Date   Tue, 28 Jul 2009 21:50:16 +0100

Hi all,
I'm a relatively new Stata user, and I'm trying to merge a couple of large datasets where neither the master nor the using dataset has a unique key.
The data comes in this format:
Dataset 1:  (note that LINKIDX is not unique)
     EVNTIDX        LINKIDX        EVENTYR  EVENTMM  EVENTDD  ...
1.   300020190021   300020190083   2006     8        6
2.   300020190021   300020190052   2006     8        6
3.   300110100795   300110101161   2006     4        10
4.   300110100822   300110101161   2006     7        19
5.   300110100808   300110101161   2006     5        8

Dataset 2:  (note that LINKIDX is not unique) 
     LINKIDX        DUPERSID   RXRECIDX  ...
1.   300020190083   30002019   300020190083001
2.   300020190083   30002019   300020198849002
3.   300110101161   30011010   300110101161001
4.   300110101161   30011010   300110101161003

I have already performed a merge in which I limited dataset 1 to the observations whose LINKIDX is unique and linked them to the multiple matching observations in dataset 2 (using a one-to-many merge). For the datasets above, that means linking observation 1 in dataset 1 to observations 1 and 2 in dataset 2.
However, I would like to perform a random link for the remaining observations. That is, for observations 3-5 in dataset 1, which share their LINKIDX with observations 3 and 4 in dataset 2, I would like Stata to randomly pick one of those dataset 1 observations to merge with each matching observation in dataset 2.
I am not sure whether I should simply use the merge command, because it may systematically select the same observation in dataset 1 (for example, always the first one in sort order).
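To make the goal concrete, here is a minimal sketch of one way the random selection might be done. It assumes a Stata version with runiform() and the merge m:1 syntax (Stata 11 or later), that it is run as a do-file, and that a single randomly chosen dataset 1 observation should be shared by every dataset 2 observation with the same LINKIDX. The seed value and the names u and events are hypothetical, chosen only for illustration, and the data are the example rows shown above.

```stata
* Sketch only: keep one randomly chosen dataset 1 observation per LINKIDX,
* then attach it to every matching observation in dataset 2.
clear
input double EVNTIDX double LINKIDX int EVENTYR byte EVENTMM byte EVENTDD
300020190021 300020190083 2006 8 6
300020190021 300020190052 2006 8 6
300110100795 300110101161 2006 4 10
300110100822 300110101161 2006 7 19
300110100808 300110101161 2006 5 8
end

set seed 20090728                     // arbitrary seed so the draw can be reproduced
generate double u = runiform()        // one uniform random number per observation
bysort LINKIDX (u): keep if _n == 1   // within each LINKIDX, keep the smallest u
drop u
isid LINKIDX                          // stops with an error if LINKIDX is still not unique
tempfile events                       // hypothetical name for the reduced dataset 1
save `events'

clear
input double LINKIDX long DUPERSID double RXRECIDX
300020190083 30002019 300020190083001
300020190083 30002019 300020198849002
300110101161 30011010 300110101161001
300110101161 30011010 300110101161003
end

* Many dataset 2 observations per LINKIDX, one dataset 1 observation per LINKIDX.
merge m:1 LINKIDX using `events', keep(master match)
```

The identifiers are read as double because 12- and 15-digit values do not fit exactly in a float. If each dataset 2 observation should instead get its own independent random draw, forming all pairs with joinby LINKIDX and then keeping one random pair per RXRECIDX in the same way might be an alternative, but that is a different design and is not shown here.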
Any ideas as to how I might be able to accomplish this task?
Thank you in advance!
Anna Dijkstra
