st: merge question
Amadou has a randomness problem:
I am trying to merge 2 datasets.
But every time, I get different results
(_m==3 has 83 observations the
first time, 97 the second, 100 the
third, 96 the fourth, and so on).
I tried to set the seed and made my sort stable,
with no success. I also tried to recast my merging
identifier to double. No success. I tried to
tostring it. No success either.
Any hints why I obtain these various results?
I verified in both Stata and Excel.
I do not understand why Stata marked some observations
that belong to both datasets as 3 in the first run
but not in the second.
Best regards.
Amadou.
PS: When I work interactively, I do not have that problem.
I have 96 observations that matched. So what am I doing
wrong in my Stata do-file?
This is almost surely the result of a many-to-many merge, which will
produce exactly what he finds: a do-file that, when rerun, yields
different results (in terms of the number of obs.) every time.
Use the unique, uniqmaster or uniqusing options on merge, whichever
is appropriate. The merge key should be unique in one file or the
other, if not both.
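
The advice above could be sketched roughly as follows. This is only an illustration: the file names master_data and lookup_data and the key variable id are hypothetical, and it assumes the key is meant to be unique in the using file (hence uniqusing; substitute unique or uniqmaster as the data require).

```stata
* Sketch only: master_data, lookup_data, and id are hypothetical names.

* 1. Check the key in the using file and leave the file sorted by it.
use lookup_data, clear
duplicates report id     // shows how many ids occur once, twice, ...
isid id                  // stops with an error if id is not unique
sort id
save lookup_data, replace

* 2. Merge, asserting that the key is unique in the using file.
use master_data, clear
sort id
merge id using lookup_data, uniqusing
tabulate _merge

* In Stata 11 or later the same assertion is written as:
*     merge m:1 id using lookup_data
```

If the key turns out not to be unique where it should be, isid or the merge itself stops with an error instead of silently producing a many-to-many result, so the do-file either fails or gives the same counts on every run.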
"The dangers of many-to-many merges", p. 58 of the book cited below.
Kit Baum, Boston College Economics
http://ideas.repec.org/e/pba1.html
An Introduction to Modern Econometrics Using Stata:
http://www.stata-press.com/books/imeus.html