Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Re: problem with split command
From: Prakash Singh <[email protected]>
To: [email protected]
Subject: Re: st: Re: problem with split command
Date: Wed, 29 Feb 2012 15:30:38 +0530
Thanks again.
This is what I did after Joseph's suggestion:
split state_name, p(2) gen(statename)
drop statename2
gen year = substr(state_name, -4, 4)
Prakash
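
For readers who want to reproduce the three commands above, here is a minimal sketch on toy data. The five observations are the examples quoted later in this thread. The -rename- to `state`, the `real()` conversion (so that `year` is numeric rather than a string), and the -assert- check are illustrative additions, not part of the original commands.

```stata
* Sketch only: toy data copied from the examples in the original question
clear
input str30 state_name
"Andhra2006"
"Arunachal2005"
"Assam2006"
"Bihar2007"
"UttarPradesh2009"
end

* -split- parses on "2" and discards it, so statename2 holds "006", "005", ...
split state_name, parse(2) generate(statename)

* keep only the text before the "2"; the new name `state` is an assumption
drop statename2
rename statename1 state

* take the last four characters as the year; real() makes it numeric
generate int year = real(substr(state_name, -4, 4))

* stop with an error if any year failed to convert
assert !missing(year)

list state year, noobs
```

This relies on every value ending in a four-digit year and on no state name containing the character "2".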
On Wed, Feb 29, 2012 at 2:50 PM, Nick Cox <[email protected]> wrote:
> Joseph is naturally right. In addition,
>
> 1. The help for -split- gives an example in which parsing is on ")"
> but it is desired to keep the ")" and the answer is simply that if you
> use -split- in this way you must put them back yourself. This is
> similar to your problem.
>
> 2. The main point is that -split- is not designed directly for this
> kind of problem because when it was introduced there were already
> several ways to use existing string functions [N.B., not commands] to
> solve that kind of problem easily. Joseph has mentioned one. Here's
> another
>
> gen numeral = real(substr(state_name, -4, 4))
> gen state = substr(state_name, 1, length(state_name) - 4)
>
> Once -numeral- exists,
>
> gen state = subinstr(state_name, string(numeral), "", .)
>
> is another way to do it.
>
> Here's another
>
> gen numeral = substr(state_name, strpos(state_name, "2"), .)
> gen state = substr(state_name, 1, strpos(state_name, "2") - 1)
>
> Nick
>
> On Wed, Feb 29, 2012 at 3:49 AM, Joseph Coveney <[email protected]> wrote:
>
>> Forgot to mention: for this year's survey and afterward, try the alternative below. You can use Stata's regular expressions, too.
>>
>>
>> . input str30 state_name
>>
>> state_name
>> 1. "Andhra2012"
>> 2. "Arunachal2012"
>> 3. "Assam2012"
>> 4. "Bihar2012"
>> 5. "UttarPradesh2012"
>> 6. end
>>
>> .
>> . generate byte first_numeral = indexnot(state_name, "`c(alpha)'`c(ALPHA)'")
>>
>> . generate long year = real(substr(state_name, first_numeral, .))
>>
>> . replace state_name = substr(state_name, 1, first_numeral - 1)
>> (5 real changes made)
>>
>> .
>> . list, noobs separator(0) abbreviate(20)
>>
>> +-------------------------------------+
>> | state_name first_numeral year |
>> |-------------------------------------|
>> | Andhra 7 2012 |
>> | Arunachal 10 2012 |
>> | Assam 6 2012 |
>> | Bihar 6 2012 |
>> | UttarPradesh 13 2012 |
>> +-------------------------------------+
>>
>> .
>> . exit
>>
>> end of do-file
>
> Joseph Coveney
>
> You're almost there: finish the job by concatenating "2" and statename2:
>
> generate int year = real("2" + statename2)
>
>
> Prakash Singh wrote:
>
> I need help using the -split- command. I am working with Stata 10.
> I am working with survey data on Indian states. In the survey data, the
> variable state_name is joined with the year in which the state was
> surveyed, in this case 2005 to 2009. So the state_name variable looks
> like...
> Andhra2006
> Arunachal2005
> Assam2006
> Bihar2007
> UttarPradesh2009
>
> and so on.
> Now I would like to create two separate variables from it, i.e.
> state_name and year_survey.
>
> I have used the following command:
> split state_name, parse(2) gen(statename)
>
> But the problem I am facing is that the statename2 variable, which is
> actually the year variable, comes out without the 2, i.e. 005, 006 etc.
>
> Please advise, as I have read the -split- help and Statalist postings
> on -split- but could not work it out.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
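
Joseph's remark that regular expressions can be used too is not illustrated anywhere in the thread. The following is a hedged sketch of one way it might look with `regexm()` and `regexs()`. The pattern, the toy data, and the variable names `state` and `year` are assumptions for illustration. The pattern expects unaccented letters followed by digits and nothing else.

```stata
* Sketch only: toy data copied from the examples in the original question
clear
input str30 state_name
"Andhra2006"
"Arunachal2005"
"Assam2006"
"Bihar2007"
"UttarPradesh2009"
end

* group 1 captures the leading letters, group 2 the trailing digits
generate state = regexs(1) if regexm(state_name, "^([A-Za-z]+)([0-9]+)$")
generate int year = real(regexs(2)) if regexm(state_name, "^([A-Za-z]+)([0-9]+)$")

* observations that do not match the pattern are left missing in both variables
list state_name state year, noobs
```

Unlike parsing on "2", this does not depend on which digit the year starts with, but names containing spaces or punctuation would need a wider character class.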