On Fri, Sep 26, 2008 at 9:08 AM, Diana Eastman <[email protected]> wrote:
>
> Thank you for the responses. This is incredibly helpful.
>
> -----Original Message-----
> From: [email protected] on behalf of Matt Spittal
> Sent: Thu 9/25/2008 10:18 PM
> To: [email protected]
> Subject: st: RE: Regexr Stata
>
> Diana,
>
> One way of searching a string for a certain match (regardless of its position) is to do the following
>
> generate str15 grp = "A.I.F" if regexm(name, "A.I.F")
The same can also be achieved with the simpler strpos() function:
generate str15 grp = "A.I.F" if strpos(name, "A.I.F")
since regular expressions are hardly needed in the example above.
Regards, Sergiy Radyakin
>
> This will create a string variable called 'grp' which will equal A.I.F if A.I.F appears anywhere within the variable 'name'. This works because -regexm(name, "A.I.F")- returns 1 if the pattern is matched and 0 if it is not. So Stata will create a variable called 'grp' and assign it the value A.I.F if there is a match and missing (an empty string) if there is not.
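One caveat that may be worth checking: inside a regular expression an unescaped "." matches any single character, so the pattern "A.I.F" would also match a name such as "AXIXF". The sketch below compares the unescaped pattern, an escaped pattern, and strpos(), which treats the dots literally. The toy data (including the made-up name "AXIXF HOLDINGS") and the variable names loose, strict and plain are assumptions for illustration only, not part of the original data.

```stata
* Toy data; "AXIXF HOLDINGS" is a hypothetical name used only to show the difference
clear
input str20 name
"A.I.F. GMBH"
"AXIXF HOLDINGS"
"QANTAS"
end

* Unescaped dots act as wildcards: matches the first two names
generate byte loose = regexm(name, "A.I.F")

* Escaped dots match literal periods only: matches the first name
generate byte strict = regexm(name, "A\.I\.F")

* strpos() does a plain substring search: matches the first name
generate byte plain = strpos(name, "A.I.F") > 0

list
```

If the names never contain look-alikes of this kind, the three approaches should give the same result.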
>
> As an extension to this, if your data looks like this
>
> A.I.F. GMBH
> A.I.F. COMPANY
> QANTAS
> AIR NEW ZEALAND
>
> and you also want to identify, say, QANTAS flights, then you can add this line to your code
>
> replace grp = "QANTAS" if regexm(name, "QANTAS")
>
> You'll have to be careful if your data looks something like this
>
> A.I.F. GMBH
> A.I.F. COMPANY
> QANTAS A.I.F
> AIR NEW ZEALAND
>
> because the value A.I.F, which was created with the -generate- statement, will be replaced with QANTAS in the -replace- statement.
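One possible way around the overwriting problem, sketched here under the assumption that the first group assigned should take priority, is to let -replace- touch only observations where grp is still empty. The toy data simply reuse the four names listed above.

```stata
* Toy data taken from the example names above
clear
input str20 name
"A.I.F. GMBH"
"A.I.F. COMPANY"
"QANTAS A.I.F"
"AIR NEW ZEALAND"
end

generate str15 grp = "A.I.F" if regexm(name, "A.I.F")

* Only fill in QANTAS where no group has been assigned yet,
* so "QANTAS A.I.F" keeps the value A.I.F from -generate-
replace grp = "QANTAS" if regexm(name, "QANTAS") & grp == ""

list
```

If QANTAS should win instead, the order of the two statements can be swapped, keeping the grp == "" condition on whichever assignment runs second.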
>
> -- Matt
> [email protected]
>
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Diana Eastman
> Sent: Friday, 26 September 2008 7:36 AM
> To: [email protected]
> Subject: st: Regexr Stata
>
>
> Hi all,
>
> I have a variable called "name" which lists several different airlines.
> I need to write some code that will identify a regular expression within
> these names and assign them a value in the new variable "group_name"
>
> For instance, for the two names:
>
> A.I.F. GMBH
> A.I.F. COMPANY
>
> I would want the group_name to be only "A.I.F." (the part of the string
> they both share). The identifying string does not occur in the same
> position across the names.
>
> Any help is greatly appreciated.
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/