Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: Extract a letter between numbers
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: Extract a letter between numbers
Date: Mon, 22 Nov 2010 17:19:38 +0000
Here's one strategy. Another would exploit regex machinery.
Let's first suppose that you have -split- off the number field as the first word of the address.
You can loop over letters a-z (but what else might they type?), digits 0-9, or positions in the string. The last is going to be the shortest loop.
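* split address into words: address1, address2, ...
split address
* start with an empty number string
gen number = ""
* the longest first word sets the number of loop iterations
gen length = length(address1)
su length, meanonly
forval i = 1/`r(max)' {
    * append character i if it is a digit
    replace number = number + substr(address1,`i',1) ///
        if inrange(substr(address1,`i',1),"0","9")
}
* every address should have yielded at least one digit
assert number != ""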
In words:
Initialise the number string at empty "".
Looking at each character {
Append the character if it is between "0" and "9"
}
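The regex route mentioned at the top could look something like the sketch below. It assumes Stata 14 or later, where the Unicode regex function ustrregexra() is available (it postdates this 2010 post). The toy addresses are made up for illustration; address, address1 and number follow the names used above.

```stata
* Hypothetical toy data, for illustration only
clear
input str20 address
"12e3 Main St"
"4b5c6 Oak Ave"
"789 Pine Rd"
end

* First word of the address, as above (creates address1, address2, ...)
split address

* Assumes Stata 14+: delete every character that is not a digit 0-9
gen number = ustrregexra(address1, "[^0-9]", "")

list address number
```

In older versions, regexr() replaces only the first match, so it would have to be applied repeatedly; the loop over positions avoids that.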
See also, if desired,
SJ-6-4 dm0026. Stata tip 39: In a list or out? In a range or out?
N. J. Cox. Q4/06, SJ 6(4):593--595 (no commands).
Tip for use of inlist() and inrange().
Nick
[email protected]
Patrick McNamara
I'm new to Stata coding (I've been using the drop-down menus for a few years),
and I'm working on an address parser to pull people's real addresses
out of the mess they enter online and put them back together
:) Right now I'm trying to figure out a way to take out any letters
that people have accidentally typed between two numbers in their
house address field (e.g. for 123 Main St, they typed 12e3 Main St).
The letters are not always in the same position, and there can be
more than one. I've tried strpos(), but it won't allow me to use a
range such as [A-Z] or [0-9].
Any help would be greatly appreciated!