Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: How do I split a string variable without spaces by capital letters?
From: "Eric A. Booth" <[email protected]>
To: [email protected]
Subject: Re: st: How do I split a string variable without spaces by capital letters?
Date: Mon, 19 Aug 2013 10:36:13 -0500
Agreed, -moss- is great for this, but you can also do it with
built-in string functions if you are interested. An example:
*****************!
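clear all
* Example data from the original post
inp str13(v1)
"TestOne"
"ThisistestTwo"
"AndThree"
end
* Reverse each string so that strpos() finds the last occurrence of a letter
g v2 = reverse(v1)
g pos = .
g l = length(v1)
* Try each capital letter A-Z; keep searching while pos is unset (.),
* not found (0), or only the leading capital (position l in the reversed string)
foreach x in `c(ALPHA)' {
replace pos = strpos(v2, "`x'") if inlist(pos, ., 0, l)
}
drop v2
* Everything before the capital letter found above
g first = substr(v1, 1, l-pos)
* From that capital letter onward (substr() stops at the end of the string)
g second = substr(v1, l-pos+1, l)
list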
*****************!
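A regular-expression variant is also possible with built-in functions alone. This is only a sketch: it assumes regexm() and regexs() are available in your Stata (they should be in Stata 10) and that every string ends in a capitalized word, as in the example data. The greedy (.*) pushes the second group to the last capital letter in the string.

```stata
*****************!
* Sketch only: split at the LAST capital letter using built-in regex functions.
* Assumes the example data from this thread.
clear all
inp str13(v1)
"TestOne"
"ThisistestTwo"
"AndThree"
end
* Group 1 = everything before the last capital letter
g str13 first = regexs(1) if regexm(v1, "^(.*)([A-Z][^A-Z]*)$")
* Group 2 = the last capital letter and the non-capitals that follow it
g str13 second = regexs(2) if regexm(v1, "^(.*)([A-Z][^A-Z]*)$")
list
*****************!
```

For the three example strings I would expect the same first/second values as the loop-based code above. Strings with no capital letter at all are left missing, because regexm() does not match them.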
EAB
On Mon, Aug 19, 2013 at 10:31 AM, Robert Picard <[email protected]> wrote:
> You can use -moss- (available from SSC) to handle this problem. The
> following works with your example:
>
> moss v1, match("([A-Z][^A-Z]*)") regex
>
> The pattern indicates that you are looking for substrings that start
> with a capital letter (i.e. [A-Z]) followed by zero or more characters
> that are not capital letters (i.e. [^A-Z]*).
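
For anyone who wants to try the -moss- route end to end, here is a minimal sketch. It assumes internet access for -ssc install- and reuses the example data from the original post. As far as I recall, -moss- stores its results in new variables such as _count, _match1, _match2 and so on, but check -help moss- to be sure.

```stata
* Sketch only; assumes internet access so that -ssc install- can fetch -moss-
ssc install moss
clear all
inp str13(v1)
"TestOne"
"ThisistestTwo"
"AndThree"
end
* Each match is a capital letter plus the non-capital characters that follow it
moss v1, match("([A-Z][^A-Z]*)") regex
* -list- with no varlist shows v1 and whatever variables -moss- created
list
```

Unlike the built-in approach, this returns one match per capitalized chunk, so it also covers strings with more than two words.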
>
> On Mon, Aug 19, 2013 at 10:06 AM, Andrew Dickens <[email protected]> wrote:
>> Hi all,
>>
>> I'm currently running Stata 10, and I'm having a problem splitting a string
>> variable by capital letters. Elena Vidal posted something under a similar
>> title, http://www.stata.com/statalist/archive/2011-11/msg01195.html, but
>> her problem is somewhat different from mine, and I was unable to
>> troubleshoot.
>>
>> An example of my data is as follows:
>>
>> clear all
>> inp str13(v1)
>> "TestOne"
>> "ThisistestTwo"
>> "AndThree"
>> end
>>
>> The problem is that the capital letter I wish to split each cell on is
>> not at a consistent position.
>>
>> I tried splitting using this code:
>>
>> split v1, p(upper(a-z))
>> or
>> split v1, p(upper(.))
>>
>> but this just generates an identical variable to v1.
>>
>> What I would like to do is create two new variables, so the first
>> observation of my example would have "Test" in the first new variable and
>> "One" in the second new variable. Suggestions would be greatly appreciated.
>>
>> Thank you for your consideration.
>>
>> Andrew
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/