
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Reclink: high matching score, but no match


From   Dorothy Bridges <[email protected]>
To   [email protected]
Subject   Re: st: Reclink: high matching score, but no match
Date   Fri, 24 Jan 2014 10:43:51 -0800

Hello everyone (especially Devra and Michael): Was this ever resolved?
I'm having the exact same problem. Code and (partial) output copied
below. I usually use reclink without any problems.

reclink entidad municipio ///
        using "ESDATA/dta/MunicipioLevel/ESDATAMun_Jan2014.dta", ///
        idu(idu) idm(idm) gen(match) required(entidad)

listing the output:

      entidad                    municipio            Umunicipio   match
        ZULIA            VALMORE RODRIGUEZ         SAN FRANCISCO  0.9961
NUEVA ESPARTA            ANTOLIN DEL CAMPO               SOTILLO  0.9961
      YARACUY            JOSE ANTONIO PAEZ               BRUZUAL  0.9961
        ZULIA        LA CANADA DE URDANETA      FRANCISCO J PULG  0.9974
       ARAGUA       MARIO BRICENO IRAGORRY  M.OCUMARE D LA COSTA  0.9977
        ZULIA            ROSARIO DE PERIJA                  MARA  0.9982
       ARAGUA  FRANCISCO LINARES ALCANTARA  FRANCISCO LINARES A.  0.9992
      BARINAS     ALBERTO ARVELO TORREALBA                ZAMORA  0.9995
        SUCRE                       RIBERO                 MEJIA  1.0000
     CARABOBO                       BEJUMA              SIFONTES  1.0000
      MIRANDA                     URDANETA          RIVAS DAVILA  1.0000
      YARACUY                 MANUEL MONGE         INDEPENDENCIA  1.0000
DELTA AMACURO                    CASACOIMA                TINACO  1.0000
        SUCRE                        SUCRE                MONTES  1.0000
NUEVA ESPARTA                        GOMEZ                GARCIA  1.0000
       ARAGUA             JOSE ANGEL LAMAS      JOSE ANGEL LAMAS  1.0000
      BARINAS                 CRUZ PAREDES               BARINAS  1.0000
      MIRANDA                        SUCRE                RANGEL  1.0000
      MIRANDA                SIMON BOLIVAR          PUEBLO LLANO  1.0000
        ZULIA                         PAEZ        MACHIQUES DE P  1.0000
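
One thing that might be worth trying, following Michael's suggestion in the quoted thread below that a non-numeric ID variable could be involved. This is only a sketch, not a confirmed fix: it assumes the master data are in memory and have not yet been through reclink, that idm and idu are the ID variables from the call above, and that it is run from a do-file. The names idu_num and ESDATAMun_numid.dta are made up for illustration. Devra reports below that one numeric identifier also failed for her, so this may not resolve it.

```stata
* Sketch only. Assumes the master file is in memory and unmodified.
* idu_num and ESDATAMun_numid.dta are hypothetical names.
describe idm                  // storage type of the master ID
isid idm                      // stops with an error if idm is not unique

preserve
use "ESDATA/dta/MunicipioLevel/ESDATAMun_Jan2014.dta", clear
describe idu                  // storage type of the using ID
isid idu                      // stops with an error if idu is not unique
gen long idu_num = _n         // numeric stand-in for the using ID
save "ESDATAMun_numid.dta", replace
restore

reclink entidad municipio using "ESDATAMun_numid.dta", ///
        idm(idm) idu(idu_num) gen(match) required(entidad)
tabulate _merge               // how many master records were actually linked
```

If tabulate still shows no linked records despite high scores, the ID type is probably not the cause, and a trace log would be the next thing to look at.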

On Wed, Dec 28, 2011 at 10:04 AM, Devra Golbe <[email protected]> wrote:
> Michael,
>
> student_name is non-numeric.  After some additional data cleaning and the
> resulting reduction of the set that needed a fuzzy match, reclink succeeded
> with student_name as the idusing variable, so my original problem is solved.
>
> But working with a smaller data set, I have an example where the non-numeric
> identifier and a numeric identifier fail, but a different numeric identifier
> succeeds.  I'll send those data and the do-file to you off-list.
>
> Thanks and happy new year.
>
> Devra
>
>
> On 12/28/2011 11:49 AM Michael Blasnik wrote:
>>
>> It looks like this is a bug -- is student_name numeric?  If not, you
>> may want to try encoding it and trying again.  If that isn't the
>> problem, it might be best if you either send me the data or a trace
>> log off-list to see if I can figure it out, but I may not get a chance
>> to figure it out until after the holidays.
>>
>> Michael
>>
>> On Sat, Dec 24, 2011 at 4:10 PM, Devra Golbe<[email protected]>  wrote:
>>>
>>> I am using  Michael Blasnik's reclink (from SSC) to match records.  I get
>>> extremely high matching scores, and yet the records do not match.  Can
>>> anyone help?    My code and relevant output are pasted below.
>>>
>>> Thanks and happy holidays,
>>> Devra
>>>
>>> ******
>>> . sort lname fname
>>> . gen idmaster=_n
>>> . tempfile ps1a
>>> . save `ps1a', replace
>>> . clear
>>> . use roster100f11Sep7.dta
>>> . sort lname fname
>>> . save, replace
>>> . clear
>>> . use `ps1a'
>>>
>>> . reclink lname fname using roster100f11Sep7.dta, ///
>>>   idmaster(idmaster) idusing(student_name) gen(link)
>>>
>>> 0 perfect matches found
>>>
>>>
>>> Added: student_name= identifier from roster100f11Sep7.dta   link = matching score
>>> Observations:  Master N = 26    roster100f11Sep7.dta N= 182
>>>   Unique Master Cases: matched = 0 (exact = 0), unmatched = 26
>>>
>>> . list link _merge in 1/5, clean
>>>
>>>          link   _merge
>>>   1.   0.9933        1
>>>   2.   0.9933        1
>>>   3.        .        1
>>>   4.   0.6420        1
>>>   5.   0.9988        1
>>>
>>> _______
>>> Devra Golbe
>>> Professor of Economics
>>> Hunter College, CUNY
>>> NY, NY
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

