
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: How do I split a string variable without spaces by capital letters?


From   "Eric A. Booth" <[email protected]>
To   [email protected]
Subject   Re: st: How do I split a string variable without spaces by capital letters?
Date   Mon, 19 Aug 2013 10:36:13 -0500

Agreed, -moss- is great for this, but you can also do it with
built-in string functions if you are interested. For example:

*****************!
clear all
inp str13(v1)
"TestOne"
"ThisistestTwo"
"AndThree"
end

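* search the reversed string so that strpos() counts from the end of v1
g v2 = reverse(v1)
g pos = .
g l = length(v1)
* loop over A-Z; keep searching while no capital has been found (. or 0)
* or only the leading capital has been found (pos == l)
foreach x in `c(ALPHA)' {
   replace pos = strpos(v2, "`x'") if inlist(pos, ., 0, l)
  }
drop v2
* split just before the capital letter that was found
g first = substr(v1, 1, l-pos)
g second = substr(v1, l-pos+1, l)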
list
*****************!
EAB
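
As a further option that is not part of the original reply, the same split can be sketched with Stata's built-in regular expression functions regexm() and regexs(), which avoids the loop over letters. This is only a minimal sketch, assuming the goal is to split at the last capital letter in the string; the variable names first2 and second2 are made up for illustration. Note one difference from the loop above: a string whose only capital is its first letter ends up entirely in second2.

```stata
*****************!
clear all
inp str13(v1)
"TestOne"
"ThisistestTwo"
"AndThree"
end

* pattern: anything (greedy), then a capital followed only by
* non-capitals up to the end of the string, i.e. the last capital
* regexs(n) returns the n-th parenthesized subexpression of the
* most recent successful regexm() match
g first2  = regexs(1) if regexm(v1, "^(.*)([A-Z][^A-Z]*)$")
g second2 = regexs(2) if regexm(v1, "^(.*)([A-Z][^A-Z]*)$")
list
*****************!
```

Observations with no capital letter at all do not match the pattern, so first2 and second2 are left empty for them.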



On Mon, Aug 19, 2013 at 10:31 AM, Robert Picard <[email protected]> wrote:
> You can use -moss- (available from SSC) to handle this problem. The
> following works with your example:
>
> moss v1, match("([A-Z][^A-Z]*)") regex
>
> The pattern indicates that you are looking for substrings that start
> with a capital letter (i.e. [A-Z]) followed by zero or more non-capital
> letters (i.e. [^A-Z]*).
>
> On Mon, Aug 19, 2013 at 10:06 AM, Andrew Dickens <[email protected]> wrote:
>> Hi all,
>>
>> I'm currently running Stata 10, and I'm having a problem splitting a string
>> variable by capital letters. Elena Vidal posted something under a similar
>> title, http://www.stata.com/statalist/archive/2011-11/msg01195.html, but
>> her problem is somewhat different from mine and I was unable to
>> troubleshoot.
>>
>> An example of my data is as follows:
>>
>> clear all
>> inp str13(v1)
>> "TestOne"
>> "ThisistestTwo"
>> "AndThree"
>> end
>>
>> The problem is the capital letter I wish to split each cell by is not
>> consistently placed.
>>
>> I tried splitting using this code:
>>
>> split v1, p(upper(a-z))
>> or
>> split v1, p(upper(.))
>>
>> but this just generates an identical variable to v1.
>>
>> What I would like to do is create two new variables, so the first
>> observation of my example would have "Test" in the first new variable and
>> "One" in the second new variable. Suggestions would be greatly appreciated.
>>
>> Thank you for your consideration.
>>
>> Andrew
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/

