Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Splitting string by parse option
From
Sebastian Say <[email protected]>
To
[email protected]
Subject
Re: st: Splitting string by parse option
Date
Wed, 5 Mar 2014 14:20:29 -0600
Thanks, that was helpful. I guess that in the raw data each cell
contained multiple lines. These were created by the person pressing
Alt-Enter (i.e., inserting a line break) after each entity.
I found a workaround: applying this formula to the raw Excel data.
=SUBSTITUTE(I2,CHAR(13),"")
This removes the line breaks, and the rest can then be done with
Stata's split command.
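The same cleanup can probably be done inside Stata without editing the spreadsheet. What follows is only a sketch under assumptions: the variable is named cofiler as in the original post, and the embedded line breaks are line feeds (character 10) or carriage returns (character 13). Which of the two is present depends on how the file was created, so the sketch checks for both and strips both.

```stata
* Count observations whose string contains a line feed or a carriage return
count if strpos(cofiler, char(10)) | strpos(cofiler, char(13))

* Remove every occurrence of both characters (the "." means replace all)
replace cofiler = subinstr(cofiler, char(10), "", .)
replace cofiler = subinstr(cofiler, char(13), "", .)

* With the line breaks gone, split on the separator as before
split cofiler, parse(XOOD_)
```

If the count is zero, the invisible characters are something else, and a diagnostic such as -charlist- (SSC) would be the next thing to try.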
Sebastian.
On Wed, Mar 5, 2014 at 1:00 PM, Nick Cox <[email protected]> wrote:
> Depends what the "spaces" really are. As before, -charlist- (SSC)
> might tell you.
>
> See also the help for -split- on parsing on slightly awkward characters.
>
> Nick
> [email protected]
>
>
> On 5 March 2014 16:59, Sebastian Say <[email protected]> wrote:
>> Hi Nick, I have another observation. In the raw Excel data, I
>> noticed that each cell appears to contain multiple lines.
>> See below
>>
>> frogXOOD_
>> toadXOOD_
>> tadpoleXOOD_
>>
>> When I deleted the spaces and made them frogXOOD_toadXOOD_tadpoleXOOD_,
>> the split var, parse (XOOD_) command worked.
>>
>> Now I have to figure out how I can get the data from the former to the
>> latter in a less tedious way.
>>
>> Any ideas?
>>
>>
>> Best,
>> Sebastian
>>
>> On Wed, Mar 5, 2014 at 3:23 AM, Nick Cox <[email protected]> wrote:
>>> Works for me
>>>
>>> . gen test = "frogXOOD_toadXOOD_newt"
>>>
>>> . split test, parse(XOOD_)
>>> variables created as string:
>>> test1 test2 test3
>>>
>>> . l test* in 1
>>>
>>>      +------------------------------------------------+
>>>      |                   test   test1   test2   test3 |
>>>      |------------------------------------------------|
>>>   1. | frogXOOD_toadXOOD_newt    frog    toad    newt |
>>>      +------------------------------------------------+
>>>
>>> Have you got strange characters in your variable? -charlist- (SSC) as
>>> recently discussed here is one diagnostic tool.
>>> Nick
>>> [email protected]
>>>
>>>
>>> On 5 March 2014 09:12, Sebastian Say <[email protected]> wrote:
>>>> Hi, I have a string variable called cofiler, in which each cell contains
>>>> names of organizations, all separated by "XOOD_".
>>>>
>>>> When I tried to use the split var, parse (XOOD_) option, it showed
>>>> cofiler1, cofiler2....cofiler42 generated.
>>>>
>>>> My guess is at least 1 cell has 42 names, each separated by XOOD_
>>>>
>>>> However, when I looked at these generated variables, it seems that
>>>> they are all empty.
>>>>
>>>> When I browsed the data, the original string variable that was to
>>>> be split showed only 1 organization's name in any cell. However,
>>>> when I click on each of these cells, I can see multiple names in the
>>>> display bar at the top.
>>>>
>>>> My question is, what's causing this problem? And what have I done
>>>> wrong such that when I tried to parse the string variable, nothing
>>>> shows up but variables were indeed generated?
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/