The manual entry for -merge- (Stata 10) describes
the contents of _merge when more than one -using-
dataset is used:
"_merge is the standard result variable that
we have discussed before: 1 means that the observation
came from the master, 2 means that it came from the
using, and 3 means that it came from both."
The behaviour described here is what I would expect
- "3 means it came from both [master and using]" -
but it isn't in fact what _merge contains. The example indicates
otherwise, as does the online documentation:
_merge==3 obs. from at least two datasets, master or using
That is, _merge==3 could mean the observation came from two using
datasets and not from the master at all.
My complaint is two-fold. First, the online and printed documentation
should agree (I hope this is uncontroversial!). Second, I think it
would be more logical for _merge to have consistent behaviour
whether there are 1 or 2 using files - it should be 3 only if
there is a master record and at least one using record. The terminology
employed by -merge- ("master" and "using") gives decided precedence
to the "master" dataset, so why not _merge? Then _merge1, _merge2,
etc. can clarify which using dataset the observation came from.
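
To make the difference concrete, here is a rough sketch of the two
rules in Python (not Stata, and not how -merge- is implemented). The
function names and the in_master / in_using flags are made up for
illustration. The coding of values 1 and 2 under the online rule is my
assumption, carried over from the manual wording quoted above; only the
_merge==3 line comes from the online help.

```python
def merge_online_help(in_master, in_using):
    """_merge as the online help describes it: 3 means the observation
    was found in at least two datasets, master or using.
    Values 1 and 2 are assumed here (master only / one using only)."""
    n_sources = int(in_master) + sum(bool(u) for u in in_using)
    if n_sources == 0:
        raise ValueError("observation must come from at least one dataset")
    if n_sources >= 2:
        return 3
    return 1 if in_master else 2


def merge_proposed(in_master, in_using):
    """_merge as proposed in this post: 3 only when the observation is
    in the master AND in at least one using dataset."""
    found_in_using = any(in_using)
    if not in_master and not found_in_using:
        raise ValueError("observation must come from at least one dataset")
    if in_master and found_in_using:
        return 3
    return 1 if in_master else 2


# Observation present in both using datasets but absent from the master:
# this is the case where the two rules disagree.
case = (False, [True, True])
print(merge_online_help(*case), merge_proposed(*case))
```

The two rules agree whenever the master contributes the observation,
and they agree when there is only one using dataset; they differ only
for observations found in two or more using datasets and not in the
master.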
Jeph