Sorry, but it is clear that my attempt to be as precise as possible
failed. Mea culpa.
In response to Bill Gould:
Neither the production nor the investment dataset contains a variable
named b1 before the merge begins, so neither has any values for b1. The
variable is created only later, through the link on r1/r2.
In response to Michael Blasnik:
You are quite right that the update option can work, and I have used it
as a workaround. It does work if I set aside, for the moment, the strange
fact that the link works for one dataset but not for the other, even
though the r1/r2 variables are exactly the same. Essentially, I merge
only one of the datasets (investment) with the link to get b1, and then
merge on r1/r2 with the other (production), so that I "borrow" the link
from the former. The problem is that the former contains fewer
observations than the latter, and I would rather not lose any of them.
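A minimal sketch of one reading of this workaround follows. The file names, variables inv and prod, and all values are invented for illustration, the keys r1/r2 are assumed to identify observations uniquely, and the merge calls use the old-style syntax quoted in this thread (newer Stata releases prefer the form merge 1:1 r1 r2 using ...).

```stata
* Toy data; every file name and value here is hypothetical.
clear
input r1 r2 inv
1 1 5
1 2 6
end
sort r1 r2
save invest_demo, replace

clear
input r1 r2 b1
1 1 101
1 2 102
2 1 103
end
sort r1 r2
save link_demo, replace

clear
input r1 r2 prod
1 1 50
1 2 60
2 1 70
end
sort r1 r2
save prod_demo, replace

* Step 1: attach b1 to the investment data and keep only the r1/r2-to-b1 map.
use invest_demo, clear
merge r1 r2 using link_demo
keep if _merge == 3
keep r1 r2 b1
sort r1 r2
save borrowed_link_demo, replace

* Step 2: attach the borrowed b1 to the production data.
use prod_demo, clear
merge r1 r2 using borrowed_link_demo
* Production observations with no counterpart in investment keep b1 missing.
list r1 r2 prod b1 _merge
```

In this toy example the production observation with r1==2 and r2==1 has no match in the investment data, so it ends up without a b1, which is the loss described above.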
Ultimately the question remains as to how the links can act so strangely
as depicted below:
        Investment
         /      \
     Link  =/=  Production
Thanks to both for your time,
Julian
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Michael Blasnik
Sent: 29 April 2003 15:45
To: [email protected]
Subject: Re: st: Merge problem
If I'm reading the original post correctly, dropping b1 may not be a
solution because there are two merges happening (production and
investment data), each adding a b1 as needed. If this guess is correct,
then I think that the solution would be to use the update option to
merge, which will replace missing b1 with b1 from the using data. An
alternative would be to name b1 something else in the second file being
merged (b2?) and then compare b1 and b2 for any observations where both
are present, just to be sure that they agree. Then one could replace
b1=b2 if b1==. and then drop b2.
Michael Blasnik
[email protected]
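The two approaches described above can be sketched roughly as follows. This is an illustration only: the file names and values are invented, r1/r2 are assumed to identify observations uniquely, and the old-style merge syntax from this thread is used (newer Stata releases prefer merge 1:1 r1 r2 using ...).

```stata
* Toy data; every file name and value here is hypothetical.
clear
input r1 r2 b1
1 1 101
1 2 .
2 1 .
end
sort r1 r2
save master_demo, replace

clear
input r1 r2 b1
1 1 101
1 2 102
2 2 104
end
sort r1 r2
save link2_demo, replace

* Approach 1: update fills missing master values of b1 from the using data.
use master_demo, clear
merge r1 r2 using link2_demo, update
* With update, _merge also takes the values 4 (missing updated)
* and 5 (nonmissing values in conflict).
tabulate _merge
list r1 r2 b1 _merge

* Approach 2: bring the second b1 in under another name and compare.
use link2_demo, clear
rename b1 b2
sort r1 r2
save link2_b2_demo, replace

use master_demo, clear
merge r1 r2 using link2_b2_demo
* Stop with an error if the two sources disagree where both are present.
assert b1 == b2 if b1 < . & b2 < .
replace b1 = b2 if b1 >= .
drop b2
```

The second approach takes a few more steps, but the assert makes any disagreement between the two sources visible instead of silently keeping the master value.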
----- Original Message -----
From: "William Gould, Stata" <[email protected]>
To: <[email protected]>
Sent: Tuesday, April 29, 2003 10:07 AM
Subject: Re: st: Merge problem
> Julian Fennema <[email protected]> is having difficulty merging datasets.
> To summarize (and put words in his mouth), he has a dataset
>
> p1992.dta containing identifying vars r1 r2 and other vars x1, x2, ..
>
> and he has
>
> link.dta containing variables r1, r2, and b1
>
> He claims that in link.dta, r1, r2, and b1 are never missing.
>
> He merges the two datasets,
>
> . use p1992, clear
> . merge r1 r2 using link
>
> and he discovers that
>
> _merge==1 observations:
> look fine; have b1==.
>
> _merge==2 observations:
> look fine, have b1<. (i.e., not missing, has correct values
> obtained from link.dta)
>
> _merge==3 observations:
> look fine in one sense, but have b1==., rather than the correct
> values from link.dta
>
> I have a suspicion as to the problem:
>
> dataset p1992.dta already has a variable named b1 in it, and that
> variable has b1==.
>
> If I am right, then dropping the b1 variable before the merge will solve
> the problem.
>
> When Stata joins two observations, it never replaces values in the master
> dataset, the dataset in memory. Rather, it uses only the new values
> associated with the new variables.
>
> -- Bill
> [email protected]
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
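Bill Gould's point above, that a plain merge never overwrites values already in the master dataset, can be reproduced with a small sketch. The file names and values are invented for illustration, r1/r2 are assumed to identify observations uniquely, and the old-style merge syntax from this thread is used (newer Stata releases prefer merge 1:1 r1 r2 using ...).

```stata
* Toy data; every file name and value here is hypothetical.
clear
input r1 r2 b1
1 1 101
1 2 102
2 1 103
end
sort r1 r2
save link_demo, replace

clear
input r1 r2 x1
1 1 10
1 2 20
3 1 30
end
* A stale, all-missing b1 that already exists in the master data.
generate b1 = .
sort r1 r2
save p_demo, replace

* Merge as-is: matched observations keep the master's missing b1.
use p_demo, clear
merge r1 r2 using link_demo
* Should report 2: both matched observations still have b1 missing.
count if _merge == 3 & b1 >= .

* Drop the stale b1 first, so b1 arrives as a new variable from link_demo.
use p_demo, clear
drop b1
merge r1 r2 using link_demo
* Should report 0: matched observations now carry b1 from link_demo.
count if _merge == 3 & b1 >= .
```

Running describe on the master file before merging is a quick way to see whether a variable named b1 is already present.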