There's no easy and failsafe solution here.
-merge- doesn't know about meanings or approximate matches. It's entirely literal.
I can think of two strategies.
1. Work on one or indeed both datasets to produce cleaned-up variables that agree exactly and so will merge. There's detailed advice in

SJ-8-3  dm0039  . . Stata tip 64: Cleaning up user-entered string variables
        . . . . . . . . . . . . . . . . . . . . . .  J. Herrin and E. Poen
        Q3/08   SJ 8(3):444--445                             (no commands)
        tip on how to clean up user-entered string variables

2. You could try soundex or similar tricks. Your example doesn't look encouraging for that strategy.
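
To make the first strategy concrete, here is a minimal sketch, not a complete solution. The variable name agencyname comes from the question; key_agency and sdx_agency are names made up for illustration, and the cleaning rules are only guesses based on the three example pairs. The last line shows the second strategy for comparison.

```stata
* Run the same steps in BOTH datasets before merging.
* key_agency is a hypothetical name; the rules are guesses from the example.
gen key_agency = lower(trim(itrim(agencyname)))

* remove punctuation that differs between the files
replace key_agency = subinstr(key_agency, ",", "", .)
replace key_agency = subinstr(key_agency, ".", "", .)

* drop a trailing "inc" so "... council inc" and "... council, inc." agree
replace key_agency = regexr(key_agency, " inc$", "")
replace key_agency = trim(itrim(key_agency))

* strategy 2, for comparison: a soundex code of the original name
gen sdx_agency = soundex(agencyname)
```

You would then merge on key_agency (and a similarly cleaned sitename) in place of the raw strings. Rules like these should reconcile the first pair in the example, but not the second or third, where the names differ in substance ("-ps 106", "child day care and family center" versus "early childhood educational center"). Those need hand editing or a lookup table of equivalent names. A soundex code is short, so for long multiword names it mostly reflects the start of the string and is likely to match too much as well as too little.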
Nick
Meryle Weinstein, Ph.D.
I have two datasets that I'm trying to merge by the following string
variables: agencyname sitename siteaddress. There are slight differences
in the datasets, particularly in the agencyname and sitename variables so
I'm having trouble merging the two datasets. The problem seems to be that
the agencyname differs slightly in each of the datasets. For example

Dataset1: 68th precinct youth council inc
Dataset2: 68th precinct youth council, inc.

Dataset1: action center for education and community development, inc
Dataset2: action center for education and community development-ps 106

Dataset1: amistad child day care and family center inc
Dataset2: amistad early childhood educational center inc

Any suggestions on a way to merge by these three variables would be
appreciated.