st: RE: Using regex to identify strings with capital letters

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: Using regex to identify strings with capital letters
Date	Wed, 26 May 2010 19:07:54 +0100

You don't need regex for this. 

... if inrange(substr(myvar,1,2), "AA", "ZZ") 

should be enough, or even "AK" to "WY", or whatever the first and last
state abbreviations are. (Remember this is an international list!)
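
For comparison, here is a minimal sketch that puts the range test next to a regex version. The variable name myvar comes from the line above; the example rows, including the made-up "Ab initio" row, and the indicator names by_range and by_regex are assumptions for illustration only. Stata compares strings character by character, so the range test also accepts a prefix such as "Ab", which lies between "AA" and "ZZ". The anchored pattern requires both leading characters to be capitals.

```stata
* Hypothetical example data; only myvar is taken from the thread
clear
input str40 myvar
"text1"
"text2"
"VA department of Social Services"
"text4"
"Ab initio"
end

* Range test from the reply: first two characters between "AA" and "ZZ"
generate byte by_range = inrange(substr(myvar, 1, 2), "AA", "ZZ")

* Regex alternative: ^ anchors at the start, then two capital letters
generate byte by_regex = regexm(myvar, "^[A-Z][A-Z]")

* Compare the two indicators side by side
list myvar by_range by_regex

* Keep only the observations that begin with two capital letters
keep if by_regex
```

If every unwanted line starts with a lowercase letter or a digit, as in the posted example, the two tests should select the same observations and the simpler range test is sufficient.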

Nick 

Beecroft, Erik (VDSS)

I need to extract certain observations from a series of text files.
Each file contains only one variable, which is a string variable.  The
observations I want all begin with two capital letters. (They are state
abbreviations, such as VA or AK).  The other observations do not begin
with two capital letters.

Is there a way to tell Stata to keep only observations for which the
variable begins with two capital letters?

It seems like the regex function might work, but I have never worked
with regular expression syntax before.  

For example, a portion of a text file might look like:
	text1
	text2
	VA department of Social Services
	text4
	text5

I want to keep only the third observation above.

I am using Stata 10.1 for Windows.
