Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: nearmrg for strings (titles)
From: Eric Booth <[email protected]>
To: "<[email protected]>" <[email protected]>
Subject: Re: st: nearmrg for strings (titles)
Date: Tue, 30 Aug 2011 19:20:28 +0000
I tried -nearmrg- (from SSC) using a string variable in both datasets for the merge var and got the same error as Michaela. It works if both variables are numeric, but when both are string, I get the error. I had never noticed that -nearmrg- worked (or is supposed to work) with string matching variables. Based on the help file, I thought it matched on numeric vars only (I've used it to match the nearest date), but I do now see the passing reference to string vars and the lower/upper options in the help file. With trace turned on, this code:
. nearmrg name using sample.dta, nearvar(name) lower //ref. to example below
produces this error:
= if "lower"!="" gen double __000004=cond(name!=__000002,__000001,__000002)
type mismatch
so there's probably some quotes missing in this line (around the temp vars?). I get the same error using the 'upper' option.
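Another possible reading of that trace line, offered only as a guess since I have not gone through the -nearmrg- source: if the temp vars hold string values, then -cond()- returns a string, and -gen double- cannot store a string. A minimal sketch that reproduces the same message outside -nearmrg- (the variables a, b, c, and d are made up for illustration):

```stata
* Sketch only: a string-valued cond() cannot be stored in a double.
* a, b, c, d are hypothetical names, not taken from -nearmrg-.
clear
input str10 a str10 b
"x" "y"
end

* -capture noisily- shows the error without stopping the do-file
capture noisily generate double c = cond(a != b, a, b)
display _rc        // return code of the failed -generate-

* Letting -generate- pick the storage type works for a string result
generate d = cond(a != b, a, b)
describe d
```

If that is what is happening, the fix inside -nearmrg- would be about the storage type of the generated variable rather than about quotes, but that is an assumption on my part.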
__
Instead I usually use -reclink- (from SSC) for this kind of matching. I haven't tried Dan's -imatch- for this purpose.
Here's an example using -reclink-:
*****************!
clear
inp str20 name
"manuela Hech"
"Chris Mueller"
"Fanzisa Haller "
"Ulrike Loerr"
end
g x = 1
g idusing = _n
replace name = trim(lower(name))
sa "sample.dta", replace
clear
inp str20 name
"manuela Hecher"
"Christian Mueller"
"Fanzisa Haller "
"Ulrike Loerr"
end
g y = 0
g idmaster = _n
replace name = trim(lower(name))
//nearmrg: produces "type mismatch" error
*nearmrg name using sample.dta, nearvar(name) lower
//reclink
reclink name using sample.dta, idmaster(idmaster) ///
idusing(idusing) gen(_match) minscore(.75)
li name Uname _match
*****************!
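To check the fuzzy matches by eye afterwards, something like the following could be run right after the -reclink- call above. It is a rough sketch and assumes that -reclink- stores the using-side name in Uname, puts the match score in _match (1 for an exact match), and leaves _match missing for master records it could not link:

```stata
* Sketch: review the -reclink- results from the example above.
* Assumes name, Uname, and _match exist as described in the lead-in.
gsort -_match                               // best matches first
list name Uname _match if _match < 1, noobs // inexact matches only
count if missing(_match)                    // master records left unmatched
```

Raising or lowering minscore() and re-running this listing is a quick way to see which pairs appear or drop out at a given threshold.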
- Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
Office: +979.845.6754
On Aug 30, 2011, at 2:43 AM, Nick Cox wrote:
> Perhaps -name- is string in one dataset and a numeric variable with
> value labels in the other. Alternatively, there is some such clash
> between datasets.
>
> -nearmrg- is a user-written program from SSC. Please remember to
> specify where user-written programs you refer to come from.
>
> Nick
>
> On Tue, Aug 30, 2011 at 7:57 AM, Hoecher, Michaela (0613xxx)
> <[email protected]> wrote:
>
>> I would like to merge two datasets (variables: title, date, publisher).
>> The problem is that strings (titles of books) that are not exactly the same should be merged/matched.
>> - Does it make sense to use nearmrg for this?
>> - In which way are strings merged/matched?
>> - What would you recommend?
>>
>> - I wanted to test nearmrg, but I got an error message "type mismatch":
>>
>> string_masterfile.dta
>> +--------------------------------------+
>> | id name gender age
>> |---------------------------------------
>> | 5 franzi 1 23
>> | 1 meli 1 32
>> | 2 michaela 1 20
>> | 6 ali 2 25
>> | 3 christ 2 20
>> | 4 martin 2 44
>> +--------------------------------------+
>>
>> string_matchfile2.dta
>> +---------------------------------------+
>> | id name gender age
>> |----------------------------------------
>> | 5 franzi 1 13
>> | 1 michi 1 15
>> | 2 susi 1 22
>> | 4 ali 2 25
>> | 3 chris 2 20
>> | 5 felix 2 43
>> +---------------------------------------+
>>
>> When I use the command:
>> nearmrg gender using string_matchfile2.dta, nearvar(name) lower genmatch(samename)
>> or
>> nearmrg gender using string_matchfile2.dta, nearvar(name) lower force genmatch(samename)
>>
>> I get the error message: "type mismatch"
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/