st: how to identify strings among which some are abbreviated and group strings which have the same keywords

From	Nina <[email protected]>
To	statalist <[email protected]>
Subject	st: how to identify strings among which some are abbreviated and group strings which have the same keywords
Date	Wed, 9 Nov 2011 16:02:26 +0100

Dear all,

I have two questions to ask for your help.
The first one:
My dataset has a string variable, applicant, that records the applicant of each patent. I want to identify applicants uniquely, so I use -encode applicant, gen(firm)- to generate a numeric identifier. However, the same applicant sometimes appears under its full name and sometimes under an abbreviation. For example,

application number   applicant
1                    Mcneil consumer
2                    Mcneil cons

When I use -encode-, two different identifiers are generated for the same applicant, "Mcneil consumer". Do you have any suggestions for dealing with this case?
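To make the problem reproducible, here is a minimal sketch built from the two example rows above. The variable name appno is just a placeholder for the application number, not a name from my real dataset.

```stata
* Toy data: the same applicant written in full and abbreviated
clear
input appno str20 applicant
1 "Mcneil consumer"
2 "Mcneil cons"
end

* -encode- assigns one integer code per distinct string value,
* so the two spellings end up with two different codes
encode applicant, gen(firm)

* -nolabel- shows the underlying numeric codes instead of the labels
list appno applicant firm, nolabel
```

Because -encode- treats every distinct string as its own category, the abbreviated and full spellings are counted as two firms.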

The second one:
The dataset is similar to the one above. In this case, I want to generate a group id that assigns a single id to all applicants that are subsidiaries of the same company. For example, in the following data, I want an id equal to 1 for applications 1 and 2 because the applicants belong to "Mcneil", and an id equal to 2 for applications 3 and 4 because they belong to the Mylan group.
application number   applicant
1                    MCNEIL PEDIATRICS
2                    MCNEIL CONSUMER HEALTHCARE DIV MCNEIL PPC INC
3                    MYLAN LABORATORIES INC
4                    MYLAN PHARMACEUTICALS INC
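To make the desired result concrete, here is a rough sketch that produces the group id for these four rows. It assumes that the parent company is always the first word of the applicant name, which holds in this example but certainly not in general. The names appno, keyword and groupid are placeholders chosen for illustration.

```stata
* Toy data taken from the four example rows
clear
input appno str60 applicant
1 "MCNEIL PEDIATRICS"
2 "MCNEIL CONSUMER HEALTHCARE DIV MCNEIL PPC INC"
3 "MYLAN LABORATORIES INC"
4 "MYLAN PHARMACEUTICALS INC"
end

* Assumption: the first word of the name identifies the parent company.
* upper() removes differences in capitalization before comparing.
gen keyword = word(upper(applicant), 1)

* One integer id per distinct keyword
egen groupid = group(keyword)

list appno applicant groupid
```

This first-word rule would fail for names in which the parent company is not the leading word, so I am looking for something more general.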

Any suggestions and comments are more than welcome!
Thank you very much!

Best,
Nina


Follow-Ups:
- Re: st: how to identify strings among which some are abbreviated and group strings which have the same keywords
  - From: "Dimitriy V. Masterov" <[email protected]>
- st: RE: how to identify strings among which some are abbreviated and group strings which have the same keywords
  - From: Nick Cox <[email protected]>
