Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: regular expression -split string an unknown number of times
From: "Rodini, Mark" <[email protected]>
To: <[email protected]>
Subject: st: regular expression -split string an unknown number of times
Date: Thu, 4 Aug 2011 10:54:21 -0700
Greetings,
I have a simple question. I have a list of strings representing names that lack any spaces, and I'm trying to insert a space in the correct place or places to split out the names.
For example, I might have:
JohnPaulJones
which I'd like to turn into:
John Paul Jones
The rule is to insert a space before any uppercase letter that is followed by a lowercase letter (other than the first letter of the string).
gen teststring = regexs(1) if regexm(var,"^([A-Z][a-z]+)")
gives the first word. I think I could do the following to get "John Paul":
gen teststring = regexs(1) + " " + regexs(2) if regexm(var,"^([A-Z][a-z]+)([A-Z][a-z]+)")
The difficulty I'm having is that the number of subnames in a string is variable. The example above has three subnames, but I might have one with two or four, etc. I'm not sure how to program that.
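One way the variable count might be handled is to stop capturing each subname separately and instead insert one space per pass, repeating until no string has a split point left. The following is only a sketch under assumptions: the variable names (name, spaced) and the sample data are made up for illustration, and it relies on regexm() matching greedily, so each pass splits at the last remaining boundary.

```stata
* Hypothetical sample data; "name" stands in for the real variable.
clear
input str40 name
"JohnPaulJones"
"MaryAnnSmith"
"Cher"
end

gen spaced = name

* A split point is an uppercase+lowercase pair preceded by a non-space character.
quietly count if regexm(spaced, "[^ ][A-Z][a-z]")
local more = r(N) > 0

while `more' {
    // Greedy .* picks the last remaining split point in each string
    // and one space is inserted there per pass.
    quietly replace spaced = regexs(1) + " " + regexs(2) ///
        if regexm(spaced, "^(.*[^ ])([A-Z][a-z].*)$")

    // Keep looping only while some observation still has a split point.
    quietly count if regexm(spaced, "[^ ][A-Z][a-z]")
    local more = r(N) > 0
}

list name spaced
```

Because a position stops matching once it is preceded by a space, the loop should end after as many passes as the longest name has internal boundaries. In Stata 14 or later, something like trim(ustrregexra(name, "([A-Z][a-z])", " $1")) may do the same in a single call, assuming the $1 group reference behaves as expected in the replacement string.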
Thanks for any help.
Mark
----------------------------------------------
Mark Rodini
COMPASS LEXECON
1111 Broadway, Suite 1500
Oakland, CA 94607
510-285-1258 (direct)
510-285-1240 (main)
510-285-1245 (fax)
[email protected]