st: RE: Regexr Stata

From	"Frank de Libero" <[email protected]>
To	<[email protected]>
Subject	st: RE: Regexr Stata
Date	Thu, 25 Sep 2008 22:27:05 -0700

Diana:

Building on Matt Spittal's response, the function _regexs_ gives you
additional flexibility. For example,

input str15 name
"A.I.F. GMBH"
"A.I.F. COMPANY"
"QANTAS A.I.F."
"AIR NEW ZEALAND"
"Aero Dienst"
"Aer Arann"
end

generate str1 group_name = ""

replace group_name = regexs(1) if regexm(name,"[A-Za-z]*(A\.I\.F\.)")
replace group_name = regexs(1) if regexm(name,"[A-Za-z]*(AIR)")
replace group_name = regexs(1) if regexm(name,"[A-Za-z]*(Aero*)")

list
     +----------------------------+
     |            name   group_~e |
     |----------------------------|
  1. |     A.I.F. GMBH     A.I.F. |
  2. |  A.I.F. COMPANY     A.I.F. |
  3. |   QANTAS A.I.F.     A.I.F. |
  4. | AIR NEW ZEALAND        AIR |
  5. |     Aero Dienst       Aero |
  6. |       Aer Arann        Aer |
     +----------------------------+

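Here the * in [A-Za-z]* and in (Aero*) means zero or more occurrences of
the preceding element: any letter in the first case, the letter "o" in the
second, which is why (Aero*) also matches "Aer". Note that an unescaped .
matches any single character; write \. to match a literal period. The
_replace_ commands could be incorporated in a _foreach_ or _forvalues_
loop, or written as nested _cond_ functions. Capitalization and punctuation
could also be controlled for. But the above is the basic idea.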
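
A minimal sketch of the _foreach_ version, assuming the example data above
are in memory. The variable name group_loop and the local macro pat are
only illustrative, and the pattern list is just the three patterns used
above, with the periods escaped:

```stata
* Hypothetical variable name, chosen so it does not clash with group_name
generate str1 group_loop = ""

* Loop over the patterns; later patterns overwrite earlier matches
foreach pat in "A\.I\.F\." "AIR" "Aero*" {
    * Wrap each pattern in ( ) so that regexs(1) returns the matched part
    replace group_loop = regexs(1) if regexm(name, "(`pat')")
}

list name group_loop
```

Because regexm() looks for a match anywhere in the string, the leading
[A-Za-z]* is left out of this sketch. Adding a new group then only means
adding a pattern to the list.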

..Frank

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Diana Eastman
Sent: Thursday, September 25, 2008 2:36 PM
To: [email protected]
Subject: st: Regexr Stata

Hi all, 

I have a variable called "name" which lists several different airlines.
I need to write some code that will identify a regular expression within
these names and assign them a value in the new variable "group_name"

For instance, for the two names:

A.I.F. GMBH
A.I.F. COMPANY

I would want the group_name to be only "A.I.F." (the part of the string
they both share). The identifying string does not occur in the same
position across the names. 

Any help is greatly appreciated. 



