Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Regular expressions
From: Joe Canner <[email protected]>
To: "[email protected]" <[email protected]>
Subject: RE: st: Regular expressions
Date: Fri, 7 Mar 2014 13:41:05 +0000
Good point about the first digit :)
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
Sent: Friday, March 07, 2014 8:38 AM
To: [email protected]
Subject: Re: st: Regular expressions
Unsurprisingly, this is almost identical to Joe's.
I feel confident that the first digit must be 1 or 2.
Nick
[email protected]
On 7 March 2014 13:35, Nick Cox <[email protected]> wrote:
> clear
> set obs 1
> gen test = "Robin Hood (2000)"
> gen test2 = trim(regexr(test, "(\([1-2][0-9][0-9][0-9]\))", ""))
> list
> Nick
> [email protected]
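
A possible variation on the pattern above, offered only as a sketch: anchoring the expression with $ so that only a year at the very end of the title is removed, and absorbing the space before the bracket inside the pattern. The variable names (movie, hasyear, title) and the sample titles are made up for illustration.

```stata
clear
input str40 movie
"Robin Hood (2010)"
"(500) Days of Summer (2009)"
"Robin Hood"
end

* Flag titles that end in a four-digit year in parentheses
gen byte hasyear = regexm(movie, "\([12][0-9][0-9][0-9]\) *$")

* Remove the trailing year together with any spaces around it
gen title = regexr(movie, " *\([12][0-9][0-9][0-9]\) *$", "")
list
```

Titles without a trailing year are left unchanged, because regexr() returns the original string when nothing matches.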
>
>
> On 7 March 2014 13:28, Marco Savegnago <[email protected]> wrote:
>> Dear all,
>> as regards point 1), this might work:
>>
>> gen movie2 = rtrim(substr(movie, 1, index(movie, "(") - 1))
>>
>> I think it works as long as the title of the movie does not contain
>> any round brackets other than those around the year.
>>
>> What do you think?
>> best,
>> Marco
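
To make the caveat concrete, here is a small sketch of the same substr() idea with an explicit guard for titles that contain no "(" at all. The sample titles and the helper variable pos are assumptions for illustration, not part of the original data.

```stata
clear
input str40 movie
"Robin Hood (2010)"
"Robin Hood"
"(500) Days of Summer (2009)"
end

* Position of the first "(" (0 if there is none)
gen pos = index(movie, "(")

* Start from the full title, then cut at the first "(" only where one exists
gen movie2 = movie
replace movie2 = rtrim(substr(movie, 1, pos - 1)) if pos > 0
list
```

The third title shows the limitation: its first "(" is the opening character, so nothing precedes the cut point.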
>>
>> 2014-03-07 12:49 GMT+01:00 Nick Cox <[email protected]>:
>>> Your second problem sounds like a job for -split-. I wouldn't reach for
>>> regular expressions there.
>>> Nick
>>> [email protected]
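
A minimal sketch of the -split- approach, assuming the variable is called country and the entries are separated by a comma followed by a space; the sample values are invented.

```stata
clear
input str60 country
"UK, Germany, Canada, Switzerland"
"USA"
"France, Italy"
end

* Split on commas; creates country1, country2, ... as needed
split country, parse(",")

* Each piece after the first keeps the blank that followed the comma
foreach v of varlist country? {
    replace `v' = trim(`v')
}
list
```

The wildcard country? picks up the new variables with a one-character suffix, which is enough here because the number of countries never exceeds 4.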
>>>
>>>
>>> On 7 March 2014 11:40, Estrella Gomez <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> I would like to make two modifications to two string variables using
>>>> regular expressions:
>>>>
>>>> 1) I have a list of movie titles with a year included; for instance:
>>>> "Robin Hood (2010)". I would like to drop the year and the
>>>> parentheses, so the final value should be "Robin Hood". The number of
>>>> words in the title varies a lot across movies.
>>>>
>>>> 2) I have a variable indicating where the movie was produced. In some
>>>> cases there are several countries, for instance "UK, Germany, Canada,
>>>> Switzerland". I would like to generate one variable per country (the 1st
>>>> variable takes the value UK, the 2nd Germany, and so on). Again, the number
>>>> of countries per movie is not fixed; it varies from 1 to 4.
>>>>
>>>> Any suggestions?
>>>>
>>>> Thanks a lot,
>>>> Estrella
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> * http://www.ats.ucla.edu/stat/stata/