Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: How to merge datasets when there are missing values in the matching variables
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: How to merge datasets when there are missing values in the matching variables
Date: Sat, 21 Jan 2012 20:03:30 +0000
It sounds as if you need to clean up afterwards. I don't see that you
can expect -merge- to do the right thing in this circumstance.
-duplicates- offers handles for dealing with duplicate observations.
Nick
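
A minimal sketch of the kind of clean-up suggested above, not a definitive
recipe. The toy files master_toy and subset_toy, the variables x and y, and
all values are hypothetical stand-ins for the poster's data. It assumes the
IDs are numeric (so missing values sort last) and that each ID2 group holds
at most one distinct non-missing value of each variable. It uses the old
-merge- syntax available in Stata 10.

```stata
* Hypothetical toy data standing in for "master" and "subset".
clear
input ID1 ID2 x
1 101 10
. 102 20
3 103 30
end
sort ID1 ID2
save master_toy, replace

clear
input ID1 ID2 y
1 101 5
2 102 6
3 .   7
end
sort ID1 ID2
save subset_toy, replace

* Merge as in the original post (Stata 10 syntax).
use master_toy, clear
merge ID1 ID2 using subset_toy

* Inspect records that share a non-missing ID2 after the merge.
duplicates report ID2 if !missing(ID2)
duplicates tag ID2 if !missing(ID2), generate(dupID2)
list if dupID2 > 0 & !missing(dupID2)

* Within each ID2 group, copy the non-missing value into the record
* where it is missing (numeric missing sorts last, so [1] is non-missing
* whenever the group has any non-missing value).
bysort ID2 (ID1): replace ID1 = ID1[1] if missing(ID1) & !missing(ID2)
bysort ID2 (x):   replace x   = x[1]   if missing(x)   & !missing(ID2)
bysort ID2 (y):   replace y   = y[1]   if missing(y)   & !missing(ID2)

* The paired records now agree on these variables; keep one of each.
duplicates drop ID1 ID2 x y, force
```

Records that share ID1 but lack ID2 in one file (ID1 = 3 in the toy data)
are left alone by this sketch; the same steps with the roles of ID1 and ID2
swapped would handle them. Check the -duplicates report- output before
dropping anything, since -force- discards observations without further
warning.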
On Sat, Jan 21, 2012 at 7:55 PM, shihying yao <[email protected]> wrote:
> Hi there,
> I am trying to merge two data files using two unique ID variables, ID1
> and ID2. Note that not all of the subjects have both ID1 and ID2
> information in both files. Suppose the names of the data files are
> "master" and "subset." The code below resembles what I used:
>
> use subset, clear
> sort ID1 ID2
> save subset,replace
>
> use master, clear
> sort ID1 ID2
> merge ID1 ID2 using subset
>
> The problem occurs for subjects whose ID1 information is missing in
> one of the data files (either one). Although these subjects can be
> uniquely identified using ID2 in both files, their records are not
> merged and there are duplicate records (i.e., one record has both ID1
> and ID2 information, while the other record has ID2 information and
> ID1 missing) in the merged file. Sorting on ID1 or ID2 first makes no
> difference, since some subjects have ID2 information in only one file.
>
> The version I am using is Stata 10. Any help is appreciated.
>
> Best,
> Shihying
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/