Re: st: extract string portion
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: extract string portion
Date: Tue, 27 Nov 2012 10:26:50 +0000
You allude to functions -regexm()- etc. There is no -regex- command.
Regular expressions are fine when they are the best tool, but plainer
string functions often work as well or better.
In your examples, naming your variable -myvar-,

    gen lastpart = substr(myvar, strpos(myvar, ":") + 1, .)

would extract

    EXC))
    TTEC))

after which

    replace lastpart = subinstr(lastpart, ")", "", .)

would remove any ")".
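As a check, here is a minimal sketch that types in the two example strings from the question and runs the two steps above. The -str70- storage type and the -assert- lines are my additions for illustration only.

```stata
clear
input str70 myvar
"OnlineChoice.com, Inc. (Exelon Corporation (NYSE:EXC))"
"Peppers & Rogers Group (TeleTech Holdings Inc. (NasdaqGS:TTEC))"
end

* everything after the first colon
gen lastpart = substr(myvar, strpos(myvar, ":") + 1, .)

* strip the trailing parentheses
replace lastpart = subinstr(lastpart, ")", "", .)

list lastpart
assert lastpart == "EXC" in 1
assert lastpart == "TTEC" in 2
```

Note that if a value contains no colon at all, -strpos()- returns 0, so -substr()- starts at position 1 and you get the whole string back (minus any ")"), which you may want to trap separately.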
What could go wrong with this? Perhaps colons ":" could occur earlier.
In that case

    gen lastpart = reverse(myvar)
    replace lastpart = substr(lastpart, 1, strpos(lastpart, ":") - 1)
    replace lastpart = subinstr(lastpart, ")", "", .)
    replace lastpart = reverse(lastpart)

looks for the last colon, etc.
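For comparison, here is a hedged sketch of two other ways to pick up the text after the last colon. The first assumes Stata 14 or later, where -strrpos()- returns the position of the last occurrence of a substring, so no reversing is needed. The second uses -regexm()- and -regexs()-; the pattern is my guess at what the data look like (a colon, then characters that are neither colons nor ")", then only ")" up to the end of the string) and should be checked against the real data. The third observation is made up to show an earlier colon.

```stata
clear
input str70 myvar
"OnlineChoice.com, Inc. (Exelon Corporation (NYSE:EXC))"
"Peppers & Rogers Group (TeleTech Holdings Inc. (NasdaqGS:TTEC))"
"Note: made-up name (Hypothetical Parent Inc. (NYSE:ABC))"
end

* position of the LAST colon (strrpos() needs Stata 14 or later)
gen ticker = substr(myvar, strrpos(myvar, ":") + 1, .)
replace ticker = subinstr(ticker, ")", "", .)

* regular-expression version: capture what sits between the last
* colon and the closing parentheses at the end of the string
gen ticker2 = regexs(1) if regexm(myvar, ":([^:)]+)\)*$")

list ticker ticker2
```

With these three strings both variables should come out as EXC, TTEC and ABC. The plain string functions remain easier to read and to debug, which is the point made at the top.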
On Tue, Nov 27, 2012 at 10:14 AM, thomas bourveau
<[email protected]> wrote:
> I am working on a dataset where I need to extract an identifier for
> the parent company of my observations. Here are two examples:
>
> OnlineChoice.com, Inc. (Exelon Corporation (NYSE:EXC))
> Peppers & Rogers Group (TeleTech Holdings Inc. (NasdaqGS:TTEC))
>
> As you can see the dataset provides the name of the company, its stock
> exchange and its identifier (the Ticker) in the same row.
>
> I'm interested in retrieving the Ticker (EXC and TTEC in the
> abovementioned examples).
>
> I have read some things on using the regex command to extract figures
> from a string expression, but I did not find how to select only a
> string portion of it.
>
> If anyone has an idea, I will be grateful and try to implement it.