Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Extracting substrings from variables.
From
Amal Khanolkar <[email protected]>
To
"[email protected]" <[email protected]>
Subject
RE: st: Extracting substrings from variables.
Date
Fri, 25 May 2012 14:26:01 +0000
Hi again,
It works now! I forgot to specify the '=1' in the gen command.
However, doing this the two ways (gen with inlist versus gen with regexm/regexs), I get slightly different counts, which shouldn't be the case:
. gen ht=1 if inlist(substr(mdiag1, 1, 3), "637", "642") | substr(mdiag1,1, 2) == "O1"
(2951413 missing values generated)
. tab ht
         ht |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |     40,043      100.00      100.00
------------+-----------------------------------
      Total |     40,043      100.00
. gen preght1 = regexs(0) if regexm(mdiag1, "^637|642|O1")
. tab preght1
    preght1 |      Freq.     Percent        Cum.
------------+-----------------------------------
        637 |      8,314       20.62       20.62
        642 |     21,537       53.42       74.05
         O1 |     10,462       25.95      100.00
------------+-----------------------------------
      Total |     40,313      100.00
ht and preght1 above should identify the same observations, but I'm not sure what's causing the difference of 270 (40,313 versus 40,043).
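A likely cause of the gap, offered as an assumption since the thread does not confirm it: in "^637|642|O1" the caret belongs only to the first alternative, so 642 and O1 can match anywhere in the string, whereas the inlist/substr version looks only at the leading characters. The sketch below uses made-up codes (not the real data) to show the two approaches disagreeing, and a grouped pattern that anchors every alternative. The variable names loose and strict are illustrative only.

```stata
clear
input str10 mdiag1
"637 asdf"
"8642 asdf"
"XO1 asdf"
"O1 asdf"
"638 asdf"
end

* Prefix test, as in the inlist/substr version (coded 0/1 here for easy comparison)
gen byte ht = inlist(substr(mdiag1, 1, 3), "637", "642") | substr(mdiag1, 1, 2) == "O1"

* Caret applies to 637 only; 642 and O1 may match anywhere in the string
gen byte loose = regexm(mdiag1, "^637|642|O1")

* Parentheses make the caret apply to all three alternatives
gen byte strict = regexm(mdiag1, "^(637|642|O1)")

* Observations where the prefix test and the loose pattern disagree
list mdiag1 ht loose strict if ht != loose
```

On the real data, something like `list mdiag1 if missing(ht) & !missing(preght1)` should show which 270 observations are picked up by the regular expression but not by the prefix test.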
I also tried to combine the many preght variables I created, which all use the same diagnostic codes but come from different time periods (named preght1, preght2, preght3, and so on), using egen with the concat() function, but it doesn't give me the right numbers. Is there another command that would do the job better?
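One possible approach, sketched under the assumption that the goal is a single indicator for "any of these codes in any period": egen's concat() glues the strings together, so an observation with hits in two periods ends up with a value such as "642O1" and is tabulated as its own category. Counting non-missing values across the variables with rownonmiss() avoids that. The data and the names nhits, anyht and allcodes below are hypothetical.

```stata
clear
input str3 preght1 str3 preght2 str3 preght3
"637" ""    ""
""    "642" "O1"
""    ""    ""
end

* Count non-empty codes across periods; strok permits string variables
egen nhits = rownonmiss(preght1 preght2 preght3), strok

* 1 if a code was found in at least one period, 0 otherwise
gen byte anyht = nhits > 0

* For comparison: concat() joins the codes into one string per observation
egen allcodes = concat(preght1 preght2 preght3)

list
```

Tabulating anyht then counts each observation once, however many periods it appears in.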
/Amal.
______
From: [email protected] [[email protected]] on behalf of Nick Cox [[email protected]]
Sent: 25 May 2012 16:04
To: [email protected]
Subject: Re: st: Extracting substrings from variables.
Yes; my idea is that one of your parentheses ( or ) was missing! I've
rechecked my example and it looks OK.
if inlist(substr(m1diagx, 1, 3), "637", "642") | substr(m1diagx, 1, 2) == "O1"
Stata is just like elementary algebra: parentheses () brackets [] and
braces { } must all occur in pairs. You don't show us your code, and
so you need to count for yourself.
Nick
On Fri, May 25, 2012 at 2:31 PM, Amal Khanolkar <[email protected]> wrote:
> Thanks Brendan - it worked like a charm! :)
>
> Nick - I tried your way using 'inlist' however I kept getting an error message that one bracket was missing - I tried several ways to try and solve the issue - but was unable to do so - any ideas?
>
> I agree with both of you - regexs can be annoying esp for me who came across it for the first time today :)
>
>
> Thanks!
>
> /Amal.
>
>
> ________________________________________
> From: [email protected] [[email protected]] on behalf of Brendan Halpin [[email protected]]
> Sent: 25 May 2012 14:07
> To: [email protected]
> Subject: Re: st: Extracting substrings from variables.
>
> On Fri, May 25 2012, Brendan Halpin wrote:
>
>> On Fri, May 25 2012, Amal Khanolkar wrote:
>>
>>> gen preght = regexs(0) if regexm(mdiag1x, "[^637] | [^642] | [^O1]")
>>
>> A quick and untested suggestion:
>>
>> . gen preght = regexs(0) if regexm(mdiag1x, "^(637)|(642)|(O1)")
>
> On testing, it seems the grouping parentheses are not necessary:
>
> ...................................................................
> . input str10 mdiag1x
>
> mdiag1x
> 1. "637 asdf"
> 2. "638 asdf"
> 3. "8637 asdf"
> 4. "642 asdf"
> 5. "O1 asdf"
> 6. end
>
> . gen preght = regexs(0) if regexm(mdiag1x, "^637|642|O1")
> (2 missing values generated)
>
> . gen hasdiag = regexm(mdiag1x, "^637|642|O1")
>
> . list
>
>      +------------------------------+
>      |   mdiag1x   preght   hasdiag |
>      |------------------------------|
>   1. |  637 asdf      637         1 |
>   2. |  638 asdf                  0 |
>   3. | 8637 asdf                  0 |
>   4. |  642 asdf      642         1 |
>   5. |   O1 asdf       O1         1 |
>      +------------------------------+
> ...................................................................
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/