Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Devra Golbe <dgolbe@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Reclink: high matching score, but no match |
Date | Fri, 24 Jan 2014 17:45:45 -0500 |
Dorothy and all,

My apologies for not closing this thread properly. Michael solved my problem in private correspondence, and I failed to report back to the list.
The problem was that, over successive runs of my do-file, I had managed to save the idusing variable to the master dataset. Once I dropped it from the master file, reclink ran fine.
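For readers who hit the same symptom, the fix described above can be sketched as a guard placed just before the reclink call. This is only an illustration, not the do-file that was actually exchanged off-list: it reuses the file and variable names from the original post quoted further down (roster100f11Sep7.dta, idmaster, student_name, link), and it assumes that the master data are already in memory and that the leftover variable has the same name as the one given in idusing().

```stata
* Sketch only: names come from the quoted post; adjust to your own data.
* Assumes the master dataset is already in memory.

* reclink adds the idusing() variable to the master data, so a copy
* saved by an earlier run has to be removed before matching again.
capture confirm variable student_name
if _rc == 0 {
    drop student_name
}

* The gen() score variable and _merge can also be left over from an
* earlier run; capture ignores the error if they are not present.
capture drop link
capture drop _merge

reclink lname fname using roster100f11Sep7.dta, ///
    idmaster(idmaster) idusing(student_name) gen(link)
```

Running the guard at the start of every run keeps the do-file safe to re-run even if the master file was saved after an earlier match.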
Devra

On 1/24/2014 1:43 PM Dorothy Bridges wrote:
Hello everyone (especially Devra and Michael):

Was this ever resolved? I'm having the exact same problem. Code and (partial) output copied below. I usually use reclink without any problems.

    reclink entidad municipio using "ESDATA/dta/MunicipioLevel/ESDATAMun_Jan2014.dta", ///
        idu(idu) idm(idm) gen(match) required(entidad)

listing the output:

    entidad        municipio                    Umunicipio            match
    ZULIA          VALMORE RODRIGUEZ            SAN FRANCISCO         0.9961
    NUEVA ESPARTA  ANTOLIN DEL CAMPO            SOTILLO               0.9961
    YARACUY        JOSE ANTONIO PAEZ            BRUZUAL               0.9961
    ZULIA          LA CANADA DE URDANETA        FRANCISCO J PULG      0.9974
    ARAGUA         MARIO BRICENO IRAGORRY       M.OCUMARE D LA COSTA  0.9977
    ZULIA          ROSARIO DE PERIJA            MARA                  0.9982
    ARAGUA         FRANCISCO LINARES ALCANTARA  FRANCISCO LINARES A.  0.9992
    BARINAS        ALBERTO ARVELO TORREALBA     ZAMORA                0.9995
    SUCRE          RIBERO                       MEJIA                 1.0000
    CARABOBO       BEJUMA                       SIFONTES              1.0000
    MIRANDA        URDANETA                     RIVAS DAVILA          1.0000
    YARACUY        MANUEL MONGE                 INDEPENDENCIA         1.0000
    DELTA AMACURO  CASACOIMA                    TINACO                1.0000
    SUCRE          SUCRE                        MONTES                1.0000
    NUEVA ESPARTA  GOMEZ                        GARCIA                1.0000
    ARAGUA         JOSE ANGEL LAMAS             JOSE ANGEL LAMAS      1.0000
    BARINAS        CRUZ PAREDES                 BARINAS               1.0000
    MIRANDA        SUCRE                        RANGEL                1.0000
    MIRANDA        SIMON BOLIVAR                PUEBLO LLANO          1.0000
    ZULIA          PAEZ                         MACHIQUES DE P        1.0000

On Wed, Dec 28, 2011 at 10:04 AM, Devra Golbe <dgolbe@gmail.com> wrote:

Michael,

student_name is non-numeric. After some additional data cleaning and the resulting reduction of the set that needed a fuzzy match reclink succeeded with student_name as the idusing variable, so my original problem is solved. But working with a smaller data set, I have an example where the non-numeric identifier and a numeric identifier fail, but a different numeric identifier succeeds. I'll send those data and the do-file to you off-list.

Thanks and happy new year.

Devra

On 12/28/2011 11:49 AM Michael Blasnik wrote:

It looks like this is a bug -- is student_name numeric? If not, you may want to try encoding it and trying again.

If that isn't the problem, it might be best if you either send me the data or a trace log off-list to see if I can figure it out, but I may not get a chance to figure it out until after the holidays.

Michael

On Sat, Dec 24, 2011 at 4:10 PM, Devra Golbe <dgolbe@gmail.com> wrote:

I am using Michael Blasnik's reclink (from SSC) to match records. I get extremely high matching scores, and yet the records do not match. Can anyone help? My code and relevant output are pasted below.

Thanks and happy holidays,

Devra

******
    . sort lname fname
    . gen idmaster=_n
    . tempfile ps1a
    . save `ps1a', replace
    . clear
    . use roster100f11Sep7.dta
    . sort lname fname
    . save, replace
    . clear
    . use `ps1a'
    . reclink lname fname using roster100f11Sep7.dta, ///
          idmaster(idmaster) idusing(student_name) gen(link)

    0 perfect matches found

    Added: student_name= identifier from roster100f11Sep7.dta   link = matching score

    Observations:  Master N = 26    roster100f11Sep7.dta N= 182
      Unique Master Cases: matched = 0 (exact = 0), unmatched = 26

    . list link _merge in 1/5, clean

             link   _merge
      1.   0.9933        1
      2.   0.9933        1
      3.        .        1
      4.   0.6420        1
      5.   0.9988        1

_______
Devra Golbe
Professor of Economics
Hunter College, CUNY
NY, NY

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/