st: puzzling string conversion
From: Dimitri Szerman <[email protected]>
To: statalist <[email protected]>
Subject: st: puzzling string conversion
Date: Thu, 10 Feb 2011 14:57:32 +0000
Hi again,
I got a puzzling result. I have a string variable, mystring, which
contains both numeric and non-numeric characters. I'd like to extract
only the numeric ones and build a numeric variable from them (in fact,
it's going to be an id). I'm using regular expressions, and this is
what I'm doing:
input str30 mystring
"111.aaa.22.2/33-33"
"011.xyz.22.2/33-33"
"101.abc.22.2/33-33"
"222.foo.22.2/33-33"
"111.bla.22.2/33-33"
end
gen id = mystring
while regexm(id, "[^0-9]") {
    replace id = regexr(id, "[^0-9]", "")
}
destring id, gen(numid)
And it works fine. However, if mystring has an observation that
contains far fewer non-numeric characters than the other observations,
this seems to break down:
clear
input str30 mystring
"A"
"011.xyz.22.2/33-33"
"101.abc.22.2/33-33"
"222.foo.22.2/33-33"
"111.bla.22.2/33-33"
end
gen id = mystring
while regexm(id, "[^0-9]") {
    replace id = regexr(id, "[^0-9]", "")
}
destring id, gen(numid)
Am I missing something? Why doesn't this work? Any suggestions?
Thanks,
Dimitri
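
A possible explanation, offered as a sketch rather than a confirmed diagnosis: when the condition of a `while` command refers to a variable, Stata evaluates it for the first observation only, so `while regexm(id, "[^0-9]")` behaves like `while regexm(id[1], "[^0-9]")`. With "A" in the first observation, the loop would stop after a single pass, leaving non-numeric characters in the other rows, and destring would then decline to convert id. Assuming that reading is right, one way around it is to drive the loop with a count over all observations:

```stata
clear
input str30 mystring
"A"
"011.xyz.22.2/33-33"
"101.abc.22.2/33-33"
end

gen id = mystring

* Count the observations that still contain a non-digit character.
count if regexm(id, "[^0-9]")

* Keep stripping one non-digit per pass until no observation has any left.
while r(N) > 0 {
    replace id = regexr(id, "[^0-9]", "")
    count if regexm(id, "[^0-9]")
}

* Observations left empty (such as the original "A") become missing.
destring id, gen(numid)
```

In Stata 14 or later, a single `replace id = ustrregexra(id, "[^0-9]", "")` should remove every match in one step, with no loop needed.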