RE: st: regular expression or some simpler data extraction method
From: "Ben Hoen" <[email protected]>
To: <[email protected]>
Subject: RE: st: regular expression or some simpler data extraction method
Date: Wed, 16 Nov 2011 15:27:25 -0500
Thanks again, Matthew & Brendan.
I realized that I had since renamed the variable to
"phase_description", which is what was causing the type mismatch error.
This syntax worked great:
gen vi_tnum = regexs(1) if regexm(phase_description, "([0-9]+) WT$")
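For readers following along, here is a minimal sketch of that working line on a toy dataset. The three phase_description values are invented for illustration and are not from the original data. The confirm line is an added guard, on the assumption that regexm() was being handed something other than a string variable when the type mismatch appeared.

```stata
* Hypothetical sample data; these values are invented for illustration
clear
input str40 phase_description
"Phase I, 12 WT"
"Phase II 8 WT"
"Substation only"
end

* Guard: regexm() needs a string argument, so check the variable first
confirm string variable phase_description

* regexm() returns 1 on a match; regexs(1) then returns the first
* parenthesized group, here the digits just before " WT" at the end
gen vi_tnum = regexs(1) if regexm(phase_description, "([0-9]+) WT$")

* regexs() returns a string, so convert the result to numeric
destring vi_tnum, replace

list
```

Rows that do not end in " WT" are left missing, because the if qualifier skips them.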
Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589
-----Original Message-----
From: Ben Hoen [mailto:[email protected]]
Sent: Wednesday, November 16, 2011 3:22 PM
To: [email protected]
Subject: Re: st: regular expression or some simpler data extraction method
Thanks, Matthew.
That didn't seem to work. I am getting a "type mismatch" error.
Based on your first response I also tried:
gen vi_tnum = regexs(1) if regexm(phase, "[\, ]?([0-9]+) WT$")
and
gen vi_tnum = regexs(1) if regexm(phase, "[\, ]?([0-9]+)[ WT]$")
and got the same "type mismatch" error, so maybe they are related.
I tried these because WT is always the end of the string, therefore any
comma would necessarily precede the digits and the WT. Maybe that was not
clear originally.
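As a side note on the two patterns above, and separate from the type mismatch error: " WT$" matches the literal three characters space, W, T at the end of the string, while "[ WT]$" is a character class that matches only one character (a space, a W, or a T). A small sketch with an invented value shows the difference; the variable names m_literal and m_class are hypothetical.

```stata
* Hypothetical one-row example; the value is invented for illustration
clear
input str20 phase
"Phase I, 12 WT"
end

* Literal " WT" at the end of the string
gen byte m_literal = regexm(phase, "([0-9]+) WT$")

* Character class: digits followed by exactly one of space, W, or T
gen byte m_class = regexm(phase, "([0-9]+)[ WT]$")

list
```

On this row the literal pattern matches and the character-class pattern does not, because the final "T" is preceded by "W" rather than by a digit.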
Ben
Ben Hoen
Principal Research Associate
Lawrence Berkeley National Laboratory
Office: 845-758-1896
Cell: 718-812-7589
[email protected]
http://eetd.lbl.gov/ea/emp/staff/hoen.html
Re: st: regular expression or some simpler data extraction method
________________________________________
From: Matthew White <[email protected]>
To: [email protected]
Subject: Re: st: regular expression or some simpler data extraction method
Date: Wed, 16 Nov 2011 15:01:27 -0500
________________________________________
Hi Ben,
Scratch that; the "[ ,]?" isn't a good idea. The following should work
as long as there aren't codes other than "WT" that start with "WT":
gen vi_tnum = substr(regexs(0), 1, strpos(regexs(0), " ") - 1) if regexm(phase, "[0-9]+ WT[ ,]?")
destring vi_tnum, replace
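A minimal sketch of this approach on toy data may help show what each piece does. The phase values below are invented for illustration, and the /// line continuation assumes the code is run from a do-file rather than typed interactively.

```stata
* Hypothetical sample data; these values are invented for illustration
clear
input str40 phase
"Phase I, 12 WT"
"8 WT, Phase II"
"Substation only"
end

* regexs(0) returns the whole matched text, e.g. "12 WT" or "8 WT,".
* strpos() finds the first space in it, and substr() keeps the digits
* that come before that space.
gen vi_tnum = substr(regexs(0), 1, strpos(regexs(0), " ") - 1) ///
    if regexm(phase, "[0-9]+ WT[ ,]?")

* The extracted digits are still a string; convert them to numeric
destring vi_tnum, replace

list
```

Unlike a pattern anchored with $, this one does not require "WT" to be at the end of the string, which is why it would also pick up other codes that begin with "WT".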