The Stata listserver

RE: st: gen new variable from string variables


From   "King, Belinda L" <Belinda.King2@mh.org.au>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: gen new variable from string variables
Date   Wed, 27 Apr 2005 08:59:46 +1000

Thank you to Eric Wruck, Rafal Raciborski, Joseph Coveney and Nick Cox for their help. All of the solutions were great and greatly appreciated!

Cheers, 
Belinda


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Tuesday, 26 April 2005 11:20 PM
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: gen new variable from string variables


Eric Wruck, Rafal Raciborski and Joseph Coveney made 
overlapping suggestions that all solved the problem, 
so there's almost nothing left to be said, but I'll
say it anyway. 

For the general problem, the main ideas that make 
things easier are 

* looping over variables with -foreach- 

* using -substr()- or -index()- to look for substrings.
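As a rough illustration of how the two functions differ, here is a
small sketch using codes from the data in the original question.
-index()- reports where a substring first occurs anywhere in a string
(0 if it does not occur), whereas -substr()- extracts a fixed stretch
of characters that can then be compared with "E11".

```stata
* index(s1, s2): position at which s2 first occurs in s1, or 0 if absent
display index("E11.65", "E11")           // shows 1
display index("K59.0", "E11")            // shows 0

* substr(s, n1, n2): n2 characters of s, starting at position n1
display substr("E11.65", 1, 3)           // shows E11
display substr("E11.65", 1, 3) == "E11"  // shows 1 (true)
```

The -index()- test would also fire on "E11" appearing later in a
string, so the -substr()- test is the stricter reading of "starting
with E11".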

My main comment is to emphasise a point tacit in most 
answers: you can exploit Stata's treatment of true and 
false directly in such problems, and not just indirectly. 

For example, 

gen OK = 0 

qui foreach v of var icd* { 
	replace OK = OK + index(`v', "E11") 
} 

keep if OK 

works because any non-zero sum in -OK- 
is treated as true. Similarly, if the 
innermost statement is instead 

	replace OK = OK + (substr(`v',1,3) == "E11") 

the parenthesised expression evaluates
to 0 or 1, and so the same trick will 
work: the criterion is whether the sum is 
non-zero (here, positive) or not. 

Combining the conditions with the "or" 
operator | leads to the same conclusion, 
as the result is again 0 or 1. 
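A minimal sketch of the | variant, assuming three rows of the data
from the original question are typed in with -input- (the str6
storage types and the byte type for -OK- are assumptions made for
the illustration):

```stata
* Rows 1, 3 and 9 of the example data in the original question
clear
input id str6 icd10_1 str6 icd10_2 str6 icd10_3
1 "K61.3"  "Z86.43" "F05.9"
3 "E11.69" "R40.2"  "E11.9"
9 "E78.0"  "E11.65" "D50.9"
end

gen byte OK = 0

* OK becomes 1 once any variable starts with E11, and then stays 1
quietly foreach v of var icd10_* {
	replace OK = OK | (substr(`v', 1, 3) == "E11")
}

keep if OK
list
```

With | the indicator never exceeds 1, whereas the summing version
counts how many variables match. For -keep if OK- the two are
equivalent, and here ids 3 and 9 should be the ones kept.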

It is largely a matter of taste whether 
you do it this way or using lots of -if- 
conditions. The -if- route perhaps is
closer to the way people actually 
think about the question, but doing 
it this way has a habit of growing on you. 
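For comparison, the -if- route might look something like the sketch
below. It assumes the same variable names as in the original
question and types in two rows of that data so that it runs on its
own; the storage types are assumptions.

```stata
* Rows 5 and 8 of the example data in the original question
clear
input id str6 icd10_1 str6 icd10_2 str6 icd10_3
5 "K59.0" "K59.0" "Z93.1"
8 "E11.9" "E78.0" "K63.50"
end

gen byte OK = 0

* Set the flag only for observations satisfying the condition
quietly foreach v of var icd10_* {
	replace OK = 1 if substr(`v', 1, 3) == "E11"
}

keep if OK
```

Here the condition selects which observations are changed, rather
than supplying the value that is stored.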

There is background at 

What is true and false in Stata? 
http://www.stata.com/support/faqs/data/trueorfalse.html
 
King, Belinda L

> > I am wanting to create a variable that tells me whether the patient has an
> > ICD10 starting with E11. I am not interested in the numbers after the dots,
> > nor am I interested in any of the other ICD10 codes and I am wanting to drop
> > patients who do not have E11 in one of the ICD10 columns. I have tried
> > playing with foreach, but this side of things is new for me and I just keep
> > getting messages telling me the syntax is incorrect. I would greatly
> > appreciate any hints someone could give me, thank you in advance.
> >
> >      +----------------------------------+
> >      | id   icd10_1   icd10_2   icd10_3 |
> >      |----------------------------------|
> >   1. |  1     K61.3    Z86.43     F05.9 |
> >   2. |  2     B95.8     Z06.2    Z86.43 |
> >   3. |  3    E11.69     R40.2     E11.9 |
> >   4. |  4    Z86.43     Z95.8     E87.6 |
> >   5. |  5     K59.0     K59.0     Z93.1 |
> >      |----------------------------------|
> >   6. |  6    E11.65    E11.66     R63.4 |
> >   7. |  7    E11.22     E66.9    E11.23 |
> >   8. |  8     E11.9     E78.0    K63.50 |
> >   9. |  9     E78.0    E11.65     D50.9 |
> >  10. | 10    E11.65     K59.0     Z93.0 |
> >      +----------------------------------+
> >

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
