Julian Fennema <[email protected]> is having difficulty merging datasets.
To summarize (and put words in his mouth), he has a dataset,
p1992.dta, containing identifying variables r1 and r2 and other
variables x1, x2, ..., and he has a second dataset,
link.dta, containing variables r1, r2, and b1.
He claims that in link.dta, r1, r2, and b1 are never missing.
He merges the two datasets,

    . use p1992, clear
    . merge r1 r2 using link

and he discovers that

    _merge==1 observations:
        look fine; have b1==.

    _merge==2 observations:
        look fine; have b1<. (i.e., not missing, has correct values
        obtained from link.dta)

    _merge==3 observations:
        look fine in one sense, but have b1==., rather than the correct
        values from link.dta

I have a suspicion as to the problem:
dataset p1992.dta already has a variable named b1 in it, and that
variable has b1==.
If I am right, then dropping the b1 variable before the merge will solve
the problem.
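
A minimal sketch of that check and fix follows. It assumes link.dta is already sorted by r1 and r2, which the old-style merge syntax expects; the confirm and count steps are my addition for diagnosis, not something Julian reported running.

```stata
* Sketch only: see whether the master already carries b1, then drop it.
use p1992, clear

* _rc is 0 only if a variable named b1 exists in p1992.dta
capture confirm variable b1
if _rc == 0 {
    * report how many nonmissing b1 values the drop would discard
    count if b1 < .
    drop b1
}

* old-style merge expects the master sorted by the match variables
sort r1 r2
merge r1 r2 using link
tabulate _merge
```

If the count is not zero, the b1 in p1992.dta holds real values, and it would be worth renaming it rather than dropping it.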
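When Stata joins two observations, it does not, by default, replace values
in the master dataset, the dataset in memory. Rather, it brings in only the
variables that do not already exist in the master. A variable that appears
in both datasets keeps the master's values for the matched observations,
even when those values are missing.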
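
The behavior can be reproduced with a toy example. The dataset name link_demo and all values below are made up for illustration; they are not Julian's data.

```stata
* Toy reproduction: the master already contains b1, all missing.
clear
input r1 r2 b1
1 1 10
1 2 20
end
sort r1 r2
save link_demo, replace

* Master data: has its own b1, which is missing everywhere
clear
input r1 r2 x1 b1
1 1 5 .
1 3 7 .
end
sort r1 r2

* Same old-style syntax as in the question
merge r1 r2 using link_demo
list r1 r2 x1 b1 _merge
```

The matched observation (r1==1, r2==1) should come back with _merge==3 and b1==., because the master's b1 wins, while the using-only observation (r1==1, r2==2) gets b1==20. If the intent is to fill in missing master values from link.dta rather than drop b1, merge's update option is meant for that case; see help merge for the syntax in your release.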
-- Bill
[email protected]