Philipp Rehm <[email protected]>:
This seems to be the desirable outcome. If you merge on a variable
that has missing values, you should expect observations with the same
missing value to be matched, because -merge- treats . like any other
value of the match variable. Specifying uniqusing in your example
should not change the behavior, since there is only one missing value
in the using file. If you do not want missing values to be merged, and
you have only one type of missing value in both files, you can recode
it in one file (or to a different code in each file) so that the
values no longer match, e.g.
replace id=.a if id==.
or drop the observations with missing ids, but this is a choice you
should make, not -merge-.
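A minimal sketch of the recoding approach applied to your own example,
assuming the -merge- syntax used in this thread and assuming that .a is
not already in use as a missing code for foreign. The recode is done
only in the using file, so its .a can no longer match the . in the
master data:

```stata
* equality is what drives the match: . equals . but .a does not equal .
display . == .      // should display 1
display .a == .     // should display 0

sysuse auto, clear
sort price
keep in 1/15
replace foreign=. in 1/5
preserve
collapse (mean) PRICE=price, by(foreign)
* recode system missing to the extended missing value .a (using file only)
replace foreign=.a if foreign==.
sort foreign
tempfile m
save `m'
restore
sort foreign
merge foreign using `m'
* _merge shows which observations were matched
list foreign PRICE _merge
```

With this change, the observations with foreign==. in the master data
should come out with _merge==1 and a missing PRICE, and the .a row from
the using file should be added with _merge==2. If you do not need that
row, "drop if _merge==2" removes it; alternatively, a
"drop if missing(foreign)" after the -collapse- avoids creating it.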
On 11/3/07, Philipp Rehm <[email protected]> wrote:
I am regularly puzzled by a particular feature of -merge-, namely that
it matches observations with missing values to each other. Here is an
example:
sysuse auto, clear
sort price
keep in 1/15
replace foreign=. in 1/5
preserve
collapse (mean) PRICE=price, by(foreign)
sort foreign
list
tempfile m
save `m'
restore
sort foreign
merge foreign using `m'
list foreign PRICE
I can avoid this problem in various ways (a "drop if foreign==." after
the -collapse- would be one option). I also understand that Stata treats
missing values as very large numbers (i.e., all nonmissing numbers < . <
.a < .b < ... < .z). I do not understand, however, why it matches
missing values with each other. Moreover, the same behavior persists
when I specify the -merge- option "uniqusing".
Let me add that this behavior does not seem so strange in the example
above. However, I usually -merge- data from entirely different
data sources. There is no logical pattern to the missing values, and no
reason to match them.
Am I missing something? Clarifications are appreciated.
Thanks,
Philipp
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/