Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: How to merge datasets when there are missing values in the matching variables
From: shihying yao <[email protected]>
To: [email protected]
Subject: st: How to merge datasets when there are missing values in the matching variables
Date: Sat, 21 Jan 2012 14:55:54 -0500
Hi there,
I am trying to merge two data files using two unique ID variables, ID1
and ID2. Not every subject has both ID1 and ID2 recorded in both
files. Suppose the data files are named "master" and "subset". The
code I used resembles the following:
use subset, clear
sort ID1 ID2
save subset, replace
use master, clear
sort ID1 ID2
merge ID1 ID2 using subset
The problem occurs for subjects whose ID1 is missing in one of the
data files (either one). Although these subjects can be uniquely
identified by ID2 in both files, their records are not merged.
Instead, the merged file contains duplicate records for them: one
record with both ID1 and ID2, and another with ID2 but with ID1
missing. Changing whether I sort on ID1 or ID2 first does not help,
since some subjects have ID2 in only one file.
I am using Stata 10. Any help is appreciated.
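To make the problem concrete, here is a minimal sketch with made-up data that should reproduce the behavior described. The file name toy_subset, the variables x and y, and all values are hypothetical and are not taken from my actual files. It uses the same old-style merge syntax as the code above.

```stata
* Toy "subset" file: subject 3 has ID2 missing
clear
input ID1 ID2 y
101 1 1
102 2 2
103 . 3
end
sort ID1 ID2
save toy_subset, replace

* Toy "master" data: subject 2 has ID1 missing
clear
input ID1 ID2 x
101 1 10
.   2 20
103 3 30
end
sort ID1 ID2

* Match-merge on both IDs, as in my real code
merge ID1 ID2 using toy_subset

* Inspect the result; _merge shows which file each row came from
sort ID2 ID1
list ID1 ID2 x y _merge
```

As far as I can tell, only subject 1 should match on both IDs here (_merge equal to 3). Subject 2 would be expected to appear twice, once with ID1 equal to 102 and once with ID1 missing, which is the kind of duplication I see in my real data.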
Best,
Shihying