Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Extracting substrings from variable and combining variables.
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: RE: st: Extracting substrings from variable and combining variables.
Date: Thu, 31 May 2012 16:34:58 +0100
-egen, concat()- "didn't work": this cannot be discussed without reference to exactly (a) what you want to do, (b) what you tried and (c) what happened.
Nick
[email protected]
Amal Khanolkar
Hi Nick & Brendan,
Thanks so much for your help with the 'regex' commands in retrieving subjects with a common diagnosis from my dataset.
I now have 12 such 'diagnostic' variables (preght1-preght12), all for, say, hypertension (12 because a patient might have received this diagnosis as the 1st, 7th or 12th diagnosis when admitted to hospital).
I need to combine these 12 variables into one. I tried doing this using the 'egen' command with the concat function but it didn't work. Any tips on other commands I could try?
The variables look like this and most of the 12 variables have the same 3 categories, but some have just 2 or 1:
. tab preght1

    preght1 |      Freq.     Percent        Cum.
------------+-----------------------------------
        637 |      8,314       20.76       20.76
        642 |     21,268       53.11       73.88
         O1 |     10,461       26.12      100.00
------------+-----------------------------------
      Total |     40,043      100.00

. tab preght2

    preght2 |      Freq.     Percent        Cum.
------------+-----------------------------------
        637 |     11,202       33.51       33.51
        642 |     15,191       45.44       78.95
         O1 |      7,036       21.05      100.00
------------+-----------------------------------
      Total |     33,429      100.00

. tab preght4

    preght4 |      Freq.     Percent        Cum.
------------+-----------------------------------
        637 |        797       18.02       18.02
        642 |      1,747       39.51       57.53
         O1 |      1,878       42.47      100.00
------------+-----------------------------------
      Total |      4,422      100.00

. des preght1

              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------
preght1         str3    %9s

Thanks,
/Amal.
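
Since the post does not say what the combined variable should contain, the following is only a minimal sketch under one assumed reading: keep the first non-empty code found across the diagnosis variables, plus an indicator for whether any code was recorded. The toy data, the new variable names preght_any and anyht, and the use of three variables instead of twelve are all illustrative assumptions, not part of the original data.

```stata
* Toy data standing in for the real file (values are made up for illustration)
clear
input str3 preght1 str3 preght2 str3 preght3
"642" ""    ""
""    "637" "O1"
""    ""    ""
end

* Assumed goal: keep the first non-empty code across the variables
gen preght_any = ""
forvalues i = 1/3 {
    replace preght_any = preght`i' if preght_any == "" & preght`i' != ""
}

* Indicator: 1 if any hypertension code was recorded, 0 otherwise
gen byte anyht = preght_any != ""

list
```

With the real data the loop would run over 1/12. If the aim is instead to string all codes together, egen's concat() function with its punct() option may be the more natural route, but which of these is wanted depends on the details Nick asks for.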
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
From: [email protected] [[email protected]] on behalf of Nick Cox [[email protected]]
Sent: 25 May 2012 20:22
To: [email protected]
Subject: Re: st: Extracting substrings from variables.
As any leading spaces surely don't matter, consider using
regexm(ltrim(mdiag1x), "^(637|642|O1)")
Nick
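
A short check of why both the parentheses and ltrim() matter here; the test strings other than the one quoted in the thread are made up for illustration.

```stata
* Without parentheses "^" anchors only 637, so "O1" anywhere matches (returns 1)
display regexm("Stata rules OK O1", "^637|642|O1")

* With parentheses the anchor applies to every alternative (returns 0)
display regexm("Stata rules OK O1", "^(637|642|O1)")

* Leading blanks defeat the anchor (returns 0) unless trimmed first (returns 1)
display regexm("  642", "^(637|642|O1)")
display regexm(ltrim("  642"), "^(637|642|O1)")
```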
On Fri, May 25, 2012 at 7:17 PM, Brendan Halpin <[email protected]> wrote:
> On Fri, May 25 2012, Nick Cox wrote:
>
>> . di regexm("Stata rules OK O1", "^637|642|O1")
>> 1
>
> OK, I was wrong that the grouping parentheses were unnecessary. However,
> the way I used them first was also wrong.
>
> Something like this is needed:
>
> . gen pright = regexs(0) if regexm(mdiag1x, "^(637|642|O1)")
>
> More evidence that Nick's reluctance about regexp is not unwise.
>