Hi, Is there some kind of "sounds like" function in Stata? I have a list of companies, but the names sometimes differ slightly. For example, AOL Time Warner also appears as AOL, Time Warner, and Time Warner Inc. I need a method to work out that all of these are the same entity, and none of the string functions in Stata seems to do what I want. Does anyone have any suggestions? Here is what the data look like:
Name
AOL
AOL Time Warner
Time Warner Inc
Microsoft
Microsoft Inc
Microsft
Ideally, what I would like is some way to indicate which names are similar. For example:
Name, Similarity
AOL, 1
AOL Time Warner, 1
Time Warner Inc, 1
Microsoft, 2
Microsoft Inc, 2
Microsft, 2
Any help will be much appreciated.
Thanks
Dalhia
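
[Editor's sketch, not part of the original message.] The grouping asked for above can be illustrated outside Stata. The following is a minimal Python sketch, not a definitive method. It assumes that two names belong together if they share a word after corporate suffixes are dropped, or if their spellings are close according to the standard-library difflib.SequenceMatcher. The suffix list, the 0.85 threshold, and the function names normalize, similar, and group_names are all hypothetical choices made for this example.

```python
import re
from difflib import SequenceMatcher

# Assumption: words treated as noise when comparing company names.
SUFFIXES = {"inc", "corp", "co", "ltd", "llc"}


def normalize(name):
    """Lowercase, strip punctuation, and drop corporate suffixes."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return [t for t in tokens if t not in SUFFIXES]


def similar(a, b, threshold=0.85):
    """True if the names share a word or are spelled almost alike."""
    ta, tb = normalize(a), normalize(b)
    if not ta or not tb:
        return False  # nothing left to compare
    if set(ta) & set(tb):
        return True  # shared word, e.g. "time" or "aol"
    ratio = SequenceMatcher(None, " ".join(ta), " ".join(tb)).ratio()
    return ratio >= threshold  # catches typos such as "Microsft"


def group_names(names, threshold=0.85):
    """Return (name, group id) pairs; ids are numbered by first appearance."""
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair judged similar, so similarity chains merge groups.
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similar(names[i], names[j], threshold):
                parent[find(j)] = find(i)

    ids = {}
    result = []
    for i, name in enumerate(names):
        root = find(i)
        ids.setdefault(root, len(ids) + 1)
        result.append((name, ids[root]))
    return result


names = ["AOL", "AOL Time Warner", "Time Warner Inc",
         "Microsoft", "Microsoft Inc", "Microsft"]
for name, group in group_names(names):
    print(f"{name}, {group}")
```

On the six example names this assigns group 1 to the three AOL/Time Warner variants and group 2 to the three Microsoft variants. Linking on any shared word is deliberately crude: a common word such as "American" or "General" would chain unrelated companies together, so real data would likely need a stop-word list or manual review of the groups.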
--- On Fri, 6/5/09, Nick Cox <[email protected]> wrote:
> From: Nick Cox <[email protected]>
> Subject: st: RE: applying string functions across observations
> To: [email protected]
> Date: Friday, June 5, 2009, 3:00 PM
> Check out -fndmtch2- or -fndmtch- from SSC. At first sight they don't
> address this problem, but there are at least two ways forward:
>
> It sounds as if you have surnames and full names (or the equivalent in
> your area). -split- the full names and work with the separate variables.
>
> Clone one of the programs above but modify the code to look for string
> inclusion rather than strict equality.
>
> Nick
> [email protected]
>
>
> Dalhia
>
> I have two variables: name1 and name2. I need to check if name2 occurs
> in any of the name1s. The regexm() function in Stata is perfect for what
> I want to do, but it checks only one string at a time, and I need it to
> somehow loop over a whole list of names.
>
> Here is what I have:
>
> name1
> ram solanki
> goel mehta
> ashish gupta
>
> name2
> solanki
> mehta
>
> I need to be able to figure out that "solanki" and "mehta" in name2
> occur in name1 observation 1 and observation 2.
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
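
[Editor's sketch, not part of the original thread.] The quoted question, checking whether each name2 occurs in any name1, amounts to looping a string-inclusion test over every pair of observations. Here is a minimal Python sketch of that idea, assuming whole-word, case-insensitive matching is what is wanted. The function name find_matches is hypothetical, and this illustrates the logic only; it is not the code of -fndmtch- or -fndmtch2-.

```python
import re


def find_matches(name1_list, name2_list):
    """Map each name2 to the 1-based observation numbers of name1 containing it."""
    matches = {}
    for needle in name2_list:
        # \b anchors restrict the match to whole words; re.escape guards
        # against names that contain regex metacharacters.
        pattern = re.compile(r"\b" + re.escape(needle) + r"\b", re.IGNORECASE)
        matches[needle] = [
            obs for obs, hay in enumerate(name1_list, start=1)
            if pattern.search(hay)
        ]
    return matches


name1 = ["ram solanki", "goel mehta", "ashish gupta"]
name2 = ["solanki", "mehta"]
print(find_matches(name1, name2))
```

For the example data this reports "solanki" in observation 1 and "mehta" in observation 2. The nested loop compares every name2 with every name1, which is fine for short lists but grows with the product of the two list lengths.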