Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Reclink: high matching score, but no match
From: Dorothy Bridges <[email protected]>
To: [email protected]
Subject: Re: st: Reclink: high matching score, but no match
Date: Fri, 24 Jan 2014 10:43:51 -0800
Hello everyone (especially Devra and Michael): Was this ever resolved?
I'm having the exact same problem. Code and (partial) output copied
below. I usually use reclink without any problems.
reclink entidad municipio using "ESDATA/dta/MunicipioLevel/ESDATAMun_Jan2014.dta", ///
    idu(idu) idm(idm) gen(match) required(entidad)
listing the output:
entidad        municipio                    Umunicipio            match
ZULIA          VALMORE RODRIGUEZ            SAN FRANCISCO         0.9961
NUEVA ESPARTA  ANTOLIN DEL CAMPO            SOTILLO               0.9961
YARACUY        JOSE ANTONIO PAEZ            BRUZUAL               0.9961
ZULIA          LA CANADA DE URDANETA        FRANCISCO J PULG      0.9974
ARAGUA         MARIO BRICENO IRAGORRY       M.OCUMARE D LA COSTA  0.9977
ZULIA          ROSARIO DE PERIJA            MARA                  0.9982
ARAGUA         FRANCISCO LINARES ALCANTARA  FRANCISCO LINARES A.  0.9992
BARINAS        ALBERTO ARVELO TORREALBA     ZAMORA                0.9995
SUCRE          RIBERO                       MEJIA                 1.0000
CARABOBO       BEJUMA                       SIFONTES              1.0000
MIRANDA        URDANETA                     RIVAS DAVILA          1.0000
YARACUY        MANUEL MONGE                 INDEPENDENCIA         1.0000
DELTA AMACURO  CASACOIMA                    TINACO                1.0000
SUCRE          SUCRE                        MONTES                1.0000
NUEVA ESPARTA  GOMEZ                        GARCIA                1.0000
ARAGUA         JOSE ANGEL LAMAS             JOSE ANGEL LAMAS      1.0000
BARINAS        CRUZ PAREDES                 BARINAS               1.0000
MIRANDA        SUCRE                        RANGEL                1.0000
MIRANDA        SIMON BOLIVAR                PUEBLO LLANO          1.0000
ZULIA          PAEZ                         MACHIQUES DE P        1.0000
On Wed, Dec 28, 2011 at 10:04 AM, Devra Golbe <[email protected]> wrote:
> Michael,
>
> student_name is non-numeric. After some additional data cleaning and the
> resulting reduction of the set that needed a fuzzy match reclink succeeded
> with student_name as the idusing variable, so my original problem is solved.
>
> But working with a smaller data set, I have an example where the non-numeric
> identifier and a numeric identifier fail, but a different numeric identifier
> succeeds. I'll send those data and the do-file to you off-list.
>
> Thanks and happy new year.
>
> Devra
>
>
> On 12/28/2011 11:49 AM Michael Blasnik wrote:
>>
>> It looks like this is a bug -- is student_name numeric? If not, you
>> may want to try encoding it and trying again. If that isn't the
>> problem, it might be best if you either send me the data or a trace
>> log off-list to see if I can figure it out, but I may not get a chance
>> to figure it out until after the holidays.
>>
>> Michael
>>
>> On Sat, Dec 24, 2011 at 4:10 PM, Devra Golbe<[email protected]> wrote:
>>>
>>> I am using Michael Blasnik's reclink (from SSC) to match records. I get
>>> extremely high matching scores, and yet the records do not match. Can
>>> anyone help? My code and relevant output are pasted below.
>>>
>>> Thanks and happy holidays,
>>> Devra
>>>
>>> ******
>>> . sort lname fname
>>> . gen idmaster=_n
>>> . tempfile ps1a
>>> . save `ps1a', replace
>>> . clear
>>> . use roster100f11Sep7.dta
>>> . sort lname fname
>>> . save, replace
>>> . clear
>>> . use `ps1a'
>>>
>>> . reclink lname fname using roster100f11Sep7.dta, ///
>>>     idmaster(idmaster) idusing(student_name) gen(link)
>>>
>>> 0 perfect matches found
>>>
>>>
>>> Added: student_name= identifier from roster100f11Sep7.dta link = matching score
>>> Observations: Master N = 26 roster100f11Sep7.dta N= 182
>>> Unique Master Cases: matched = 0 (exact = 0), unmatched = 26
>>>
>>> . list link _merge in 1/5, clean
>>>
>>> link _merge
>>> 1. 0.9933 1
>>> 2. 0.9933 1
>>> 3. . 1
>>> 4. 0.6420 1
>>> 5. 0.9988 1
>>>
>>> _______
>>> Devra Golbe
>>> Professor of Economics
>>> Hunter College, CUNY
>>> NY, NY
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/