Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Escaping left quote as argument of parse in split command
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Escaping left quote as argument of parse in split command
Date: Wed, 4 Apr 2012 09:23:54 +0100
I agree with Eric. This particular character is awkward for -split-
and a work-around like that shown in its help for the tab character
doesn't help.
A first-principles solution is also easy in this case:
. gen part1 = substr(id, 1, strpos(id, "`") - 1)
. gen part2 = substr(id, strpos(id, "`") + 1, .)
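The same first-principles idea can be sketched without typing the left quote at all, since char(96) returns that character. What follows is only an illustration under assumptions: the two example observations, the helper variable pos, and the choice to put the whole id in part1 when no separator is present are not from the original post.

```stata
* Sketch only: hypothetical example data; id is assumed to be a string variable
clear
set obs 2
gen id = "15" + char(96) + "32" in 1    // char(96) is the left single quote
replace id = "7" in 2                    // an id with no separator

* pos is a hypothetical helper: position of the separator, 0 if absent
gen pos = strpos(id, char(96))

* With no separator, keep the whole id in part1 and leave part2 empty
* (an assumption about the desired behaviour, not part of the original code)
gen part1 = cond(pos > 0, substr(id, 1, pos - 1), id)
gen part2 = cond(pos > 0, substr(id, pos + 1, .), "")

drop pos
list
```

The guard on pos matters only if some ids lack the separator. Without it, strpos() returns 0 for those ids and part2 ends up holding the whole string.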
Nick
On Wed, Apr 4, 2012 at 5:29 AM, Eric Booth <[email protected]> wrote:
> I'm not sure how to get the parse option of -split- to accept the single left quote, but you can get around it by using a function like -subinstr()- to replace the left quote with something else and then -split-, so:
>
> **
> replace id = subinstr(id, "`", "@", .)
> split id, parse("@")
> **
>
>
> P.S.
> The part of the split.ado file that is choking is (using trace):
>
> - if `"`parse'"' == `""' | `"`parse'"' == `""""' {
> = if `""`""' == `""' | `""`""' == `""""' {
> { required
>
> In trying to escape the single left quote in -split-, I had a copy/paste error and accidentally ran:
>
> split id, parse("\`\`"'')
>
> which curiously gave the error:
>
> "no room to add more variables
> Up to 32,000 variables are currently allowed, although you could reset the maximum using set maxvar; see help
> memory.
> r(900); t=14.37 23:21:53"
>
>
> Since string functions like subinstr() can work with the single left quote, it seems that -split- should be able to work with it as well (I say that without knowing anything about the internal mechanics of string functions like subinstr()).
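The -subinstr()- work-around quoted above relies on the stand-in character not already occurring in id. A minimal sketch of that check follows, under assumptions: the example data, the temporary variable id_tmp, and the stub part are hypothetical names chosen for illustration, and char(96) is used in place of a typed left quote.

```stata
* Sketch only: hypothetical example data; id is assumed to be a string variable
clear
set obs 2
gen id = "15" + char(96) + "32" in 1
replace id = "8" + char(96) + "41" in 2

* Stop with an error if the stand-in "@" already occurs anywhere in id
assert strpos(id, "@") == 0

* Work on a copy (id_tmp, a hypothetical name) so the original id is untouched
gen id_tmp = subinstr(id, char(96), "@", .)
split id_tmp, parse("@") generate(part)   // creates part1 and part2 here
drop id_tmp
list
```

Working on a copy rather than replacing id in place keeps the original values available if the split needs to be redone with a different stand-in.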
On Apr 3, 2012, at 10:45 PM, Florian Kuhn wrote:
>> in my dataset, I have a string variable “id” in which the left single quote ` is used to separate a first and second part of the id (the creators of the dataset were clearly not using Stata). For example, a typical entry in the column "id" would be 15`32. I am trying to recover both parts of the id as separate variables using the "split" command.
>>
>> However, escaping the backtick does not seem to work:
>> split id, parse("\`")
>> gives the error message:
>> { required
>> r(100);
>>
>> Am I missing something obvious here?