Statalist



Re: st: _merge complaint


From   Svend Juul <[email protected]>
To   [email protected]
Subject   Re: st: _merge complaint
Date   Fri, 21 Mar 2008 19:48:59 +0100

Jeph Herrin wrote:
 
The manual entry for -merge- (Stata 10) describes
the contents of _merge when more than one -using-
dataset is used:
 
 "_merge is the standard result variable that
 we have discussed before: 1 means that the observation
 came from the master, 2 means that it came from the
 using, and 3 means that it came from both."
 
While I think the behaviour described here for _merge is what
I would expect  - "3 means it came from both [master and using]"
- this isn't in fact what _merge contains.  The example indicates
otherwise, as does the online documentation:
 
 _merge==3    obs. from at least two datasets, master or using
 
That is, _merge==3 could mean it came from the two usings, but
not the master at all.
 
My complaint is two-fold. First, the online and printed documentation
should agree (I hope this is uncontroversial!). Second, I think it
would be more logical for _merge to have consistent behaviour
whether there are 1 or 2 using files - it should be 3 only if
there is a master and (at least one) using record. The terminology
employed by -merge- ("master" and "using") gives decided precedence
to the "master" dataset, so why not _merge? Then _merge1, _merge2
etc can clarify which using set the obs came from.
 
========================================================
 
I did not believe Jeph was right, but he is, both on the behavior of 
-merge- and on the documentation. When merging multiple using datasets 
in one command, there is no diagnostic for the case of a missing 
master observation.
 
This leads to the advice: Don't merge multiple using datasets
in one -merge- command! Instead, make one -merge- command
for each using dataset and check after each merge.
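
A minimal sketch of that advice, in Stata 10 syntax. The file names
(master, using1, using2) and the key variable id are hypothetical, and
the -assert- lines are an optional extra that only makes sense if every
observation is supposed to have a master record:

```stata
* Hypothetical names: master.dta, using1.dta, using2.dta, key variable id
use master, clear

* First using dataset: merge, inspect _merge, then drop it
merge id using using1, unique sort
tabulate _merge
* With a single using dataset, _merge==2 means "using only",
* i.e. no master observation; stop here if that should not happen
assert _merge != 2
drop _merge

* Second using dataset: same checks, in a separate -merge- command
merge id using using2, unique sort
tabulate _merge
assert _merge != 2
drop _merge
```

With one using dataset per command, _merge keeps its simple meaning
(1 master only, 2 using only, 3 both), so a missing master observation
shows up as _merge==2 at the step where it occurs. The -drop _merge-
is needed because -merge- will not overwrite an existing _merge variable.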
 
Svend
 
________________________________________________________ 
 
Svend Juul
Institut for Folkesundhed, Afdeling for Epidemiologi
(Institute of Public Health, Department of Epidemiology)
Vennelyst Boulevard 6 
DK-8000 Aarhus C,  Denmark 
Phone, work:   +45 8942 6090 
Phone, home:   +45 8693 7796 
Fax:           +45 8613 1580 
E-mail:        [email protected] 
_________________________________________________________ 




