Re: st: RE: extracting a specific portion of a string
From: Eric Booth <[email protected]>
To: "<[email protected]>" <[email protected]>
Subject: Re: st: RE: extracting a specific portion of a string
Date: Thu, 17 Mar 2011 14:15:36 +0000
On Mar 17, 2011, at 9:10 AM, Travis Coan wrote:
> This is actually Jorge's string variable, not mine. Though, I have
> certainly enjoyed your and Nick's posts.
Ah yes, sorry about that Travis.
- Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Eric Booth
> Sent: Thursday, March 17, 2011 9:57 AM
> To: <[email protected]>
> Subject: Re: st: RE: extracting a specific portion of a string
>
> On Mar 17, 2011, at 5:42 AM, <[email protected]> wrote:
>
>> Wouldn't it be simpler to rename "MOTHER'SBLOOD" "BLOOD,MATERNAL"?
>>
>> reg
>>
>
> I doubt it. It's probably a safe bet that Travis's string variable
> takes more than just the 9 values he showed us in the example.
> If there were hundreds or thousands of values where "BLOOD" or "SERUM"
> were located later in the string, would you really want to write some
> form of -replace v1 = "BLOOD,MATERNAL" if v1 == "MOTHER'SBLOOD"- for
> every possible instance? Also, Travis may be interested in extracting
> more than just "BLOOD" and "SERUM" (such as "LIPEMIC", "1ST",
> "SPECIMEN", etc.), which could become problematic if you start jumbling
> up the variable just to get "BLOOD" to the front of it. Better to leave
> the string variable in place and use string functions to flag
> observations that contain some substring of interest or to extract
> substrings into other variables.
>
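A minimal sketch of the flag-in-place approach described above, assuming the string variable is named v1 as in the example data in this thread. The has_* indicator names are hypothetical, chosen only for illustration; the substrings are the ones mentioned in the paragraph.

```stata
* Sketch only: the has_* variable names are hypothetical, not from the thread
clear
input str20 v1
"BLOOD(LIPEMIC)"
"MOTHER'SBLOOD"
"SERUM,1STSPECIMEN"
end

* One 0/1 indicator per substring of interest; v1 itself is never modified
foreach s in BLOOD SERUM LIPEMIC 1ST SPECIMEN {
    generate byte has_`s' = strpos(v1, "`s'") > 0
}
list
```

Because v1 is left untouched, more substrings can be added to the loop later without disturbing the indicators already created.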
>
> On Mar 17, 2011, at 4:04 AM, Nick Cox wrote:
>> <snip>
>> On a detail that might confuse: Eric used -index()- and -strpos()-. In
>> essence, -index()- is the old name that still works, while -strpos()-
>> is the new name. It's the same function underneath the names.
>
> This is (at least) the second time Nick has been kind enough to remind
> me that strpos() is the modern version of index() -- old habits die
> hard. (http://www.stata.com/statalist/archive/2011-02/msg01111.html)
>
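A small check of the equivalence described above, assuming a Stata release that accepts both names (as the Stata 11 used in this thread does). The variable names p_index and p_strpos are hypothetical.

```stata
* Sketch only: p_index and p_strpos are hypothetical variable names
clear
input str20 v1
"BLOOD"
"MOTHER'SBLOOD"
"SERUM,1STSPECIMEN"
end

* Position of "BLOOD" within v1 under each name (0 when it is absent)
generate byte p_index  = index(v1, "BLOOD")
generate byte p_strpos = strpos(v1, "BLOOD")

* Stops with an error if the two functions ever disagree
assert p_index == p_strpos
list
```

For these three values both functions return 1, 9, and 0 respectively.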
> - Eric
> __
> Eric A. Booth
> Public Policy Research Institute
> Texas A&M University
> [email protected]
> Office: +979.845.6754
>
>
>>
>> -----Original Message-----
>>> From: Eric Booth <[email protected]>
>>> Sent: Mar 17, 2011 12:15 AM
>>> To: "<[email protected]>" <[email protected]>
>>> Subject: Re: st: RE: extracting a specific portion of a string
>>>
>>>
>>> On Mar 16, 2011, at 10:43 PM, Travis Coan wrote:
>>>>
>>>> I would take a look at the -substr- function -- typing 'help substr'
>>>> should get you there.
>>>>
>>>
>>> You should probably look at all the functions available in
>>> -help string_functions-.
>>> Note that -substr- alone wouldn't return the desired result in this
>>> example, e.g.:
>>>
>>> **********************!
>>> clear
>>> inp str20(v1)
>>> "BLOOD"
>>> "BLOOD(LIPEMIC)"
>>> "BLOOD(MODERATELYLY"
>>> "BLOOD, 2ND SPECIMEN"
>>> "BLOOD,1STSPECIMEN"
>>> "BLOOD,2NDSPECIMEN"
>>> "MOTHER'SBLOOD"
>>> "SERUM,1STSPECIMEN"
>>> "SERUM,2NDSPECIMEN"
>>> end
>>>
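>>> // substr() alone: take the first 5 characters
>>> g v2 = substr(v1, 1, 5)
>>> // note obs 7: "MOTHER'SBLOOD" gives "MOTHE", not "BLOOD"
>>>
>>> // using the strpos() and substr() string functions
>>> g str10 v4 = ""
>>> foreach x in "BLOOD" "SERUM" {
>>>     // position of the substring within v1 (0 if absent)
>>>     g v`x' = strpos(v1, "`x'")
>>>     // extract the 5 characters starting at that position
>>>     replace v4 = substr(v1, v`x', 5) if v`x' > 0
>>> }
>>>
>>> // using index(), the older name for strpos()
>>> g ind = 0
>>> replace ind = 1 if index(v1, "BLOOD")
>>> replace ind = 2 if index(v1, "SERUM")
>>> la def ii 1 "Blood" 2 "Serum", modify
>>> lab val ind ii
>>> li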
>>> **********************!
>>>
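As an alternative to the strpos()/substr() loop, the same extraction can be sketched with Stata's regular-expression functions regexm() and regexs(). This is a different technique from the code posted in this thread, shown only for comparison; the variable name v5 is hypothetical.

```stata
* Sketch only: an alternative technique, not the code posted in this thread
clear
input str20 v1
"BLOOD, 2ND SPECIMEN"
"MOTHER'SBLOOD"
"SERUM,1STSPECIMEN"
end

* regexm() returns 1 when the pattern matches; regexs(1) then returns
* the text matched by the first parenthesized subexpression
generate str5 v5 = regexs(1) if regexm(v1, "(BLOOD|SERUM)")
list
```

Observations matching neither alternative are left with an empty v5, so they are easy to spot afterwards.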
>>> - Eric
>>> __
>>> Eric A. Booth
>>> Public Policy Research Institute
>>> Texas A&M University
>>> [email protected]
>>>
>>>
>>>>
>>>>
>>>> From: [email protected]
>>>> [mailto:[email protected]] On Behalf Of Mendoza
>>>> Aldana, Dr Jorge Antonio (WPRO)
>>>> Sent: Wednesday, March 16, 2011 10:36 PM
>>>> To: [email protected]
>>>> Subject: st: extracting a specific portion of a string
>>>>
>>>> Dear all,
>>>> My dataset has a string variable, from which I need a specific
>>>> portion of it. The content of the variable is like:
>>>>
>>>> BLOOD
>>>> BLOOD(LIPEMIC)
>>>> BLOOD(MODERATELYLY
>>>> BLOOD, 2ND SPECIMEN
>>>> BLOOD,1STSPECIMEN
>>>> BLOOD,2NDSPECIMEN
>>>> MOTHER'SBLOOD
>>>> SERUM,1STSPECIMEN
>>>> SERUM,2NDSPECIMEN
>>>>
>>>> and I need to generate a new variable containing either "BLOOD" or
>>>> "SERUM"
>>>> I would appreciate very much if you can give me some hints on
>>>> solving this.
>>>> I'm using Stata 11.1 on a Windows XP machine
>>>> Kind regards,
>>>> Jorge
>>>>
>>>>
>