Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: identifying strings that differ on one or two letters
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: RE: st: identifying strings that differ on one or two letters
Date: Fri, 19 Nov 2010 16:16:12 +0000
On -strgroup-, the pertinent information appears to be in the help file:
"strgroup is implemented as a plugin in order to minimize memory requirements and to maximize speed. Unfortunately, plugins are specific to the hardware
architecture and software framework of your computer, i.e., plugins are not cross-platform. Define a platform by two characteristics: machine type and operating
system. Stata stores these characteristics in c(machine_type) and c(os), respectively. strgroup supports the following platforms at this time:
    Machine type                Operating system
    PC                          Windows
    PC (64-bit x86-64)          Unix
    Macintosh                   MacOSX
    Macintosh (Intel 64-bit)    MacOSX"
The message appears to imply that your platform is not supported: WIN64A is 64-bit Windows, which is not among the platforms listed.
On -soundex()-, evidently that function classifies more coarsely than you need.
These string matching problems are very difficult to automate in the sense of replicating what a knowledgeable human would do.
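One common way to make "differ on one or two letters" concrete is Levenshtein edit distance: the number of single-character insertions, deletions, or substitutions needed to turn one string into another. The sketch below is in Python rather than Stata, purely to illustrate the idea. The helper names edit_distance and near_pairs are hypothetical, the misspelled variants in the list are invented for the example, and nothing here is claimed to be what -strgroup- does internally.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def near_pairs(names, max_dist=2):
    """Return (name1, name2, distance) for pairs that differ by 1..max_dist edits."""
    found = []
    for a, b in combinations(names, 2):
        d = edit_distance(a, b)
        if 0 < d <= max_dist:
            found.append((a, b, d))
    return found


# "syndicat bank" and "suntek realty" are invented misspellings for illustration.
names = [
    "syndicate bank",
    "syndicat bank",
    "sunteck realty",
    "suntek realty",
    "suniti commercials",
]

for a, b, d in near_pairs(names):
    print(f"{a!r} ~ {b!r} (distance {d})")
```

Comparing every pair is quadratic in the number of names, so for a long list it would be usual to compare only within blocks (for example, names sharing a first letter), and a human check of the flagged pairs remains advisable.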
Nick
[email protected]
Dalhia
I tried both techniques suggested by the list (thank you Dmitry and Scott). But neither seems to work, and I am hoping you can tell me what is wrong.
I can't seem to load "strgroup." When I try to install it on Stata 11, it gives me the following message:
"package does not contain strgroup.plugin for WIN64A platform
could not load strgroup.pkg from http://fmwww.bc.edu/RePEc/bocode/s/"
I'm sure there is a simple fix, but my knowledge of Stata code is very basic, and I'm not sure how to fix this problem.
I also tried Soundex, but it identifies completely different companies as the same. For example, suniti commercials ltd, sunnytex investments pvt ltd, sunteck realty & infrastructure ltd, syndicate bank, all get the same soundex code S532. And soundex does not seem to allow any options that might limit matches to names that are very similar.
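The collision is easy to see from how Soundex works: it keeps the first letter and then only the first three consonant classes, so anything after "s-n-t/d-c/x" is ignored. Below is a minimal Python sketch of standard American Soundex, assuming that non-letters are skipped and that this matches what Stata's soundex() does for these inputs; the function name and the code are illustrative, not Stata's implementation.

```python
def soundex(name: str) -> str:
    """Standard American Soundex: first letter plus three digits."""
    codes = {}
    for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
        for ch in group:
            codes[ch] = digit

    # Assumption: spaces, digits, and punctuation are simply ignored.
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""

    result = letters[0].upper()
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        digit = codes.get(ch, "")  # vowels and y have no code
        if digit and digit != prev:
            result += digit
            if len(result) == 4:
                break
        prev = digit
    return result.ljust(4, "0")


companies = [
    "suniti commercials ltd",
    "sunnytex investments pvt ltd",
    "sunteck realty & infrastructure ltd",
    "syndicate bank",
]
for company in companies:
    print(soundex(company), company)
```

All four names come out as S532 under this sketch, in line with the codes reported above: the code is exhausted within the first six or seven letters, so the rest of each name never contributes.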