Title | Checking a variable for a range of ICD codes
Author | Rebecca Pope, StataCorp |
You can check whether a given variable contains ICD-9-CM diagnosis codes, ICD-9-CM procedure codes, or ICD-10 diagnosis codes within a specified range by using, respectively, the icd9, icd9p, or icd10 command with the generate subcommand and the range() option.
For example, if you were analyzing ICD-9-CM diagnosis codes, you might have data that look like
    recid     dx1     dx2     dx3
       84    4414   99811    4275
      105   25013    3572   25063
      255   51909    1489    V146
      651    9678   E8528
      696    V271   64421   65641
      779    5409   V1582   V1062
      814   27651   V1087   V4364
      826    9951     462    2724
      833   42789    5409   27801
      863    5770   29181    4255
where dx1 records the primary diagnosis and dx2 and dx3 record secondary diagnoses.
Suppose you want to determine which records have a primary diagnosis of diabetes, indicated by codes starting with 250. You need only type the following; the new variable diabetes is 1 when dx1 falls in the range and 0 otherwise.
    . icd9 generate diabetes = dx1, range(250*)

    . list, clean noobs

        recid     dx1     dx2     dx3   diabetes
           84    4414   99811    4275          0
          105   25013    3572   25063          1
          255   51909    1489    V146          0
          651    9678   E8528                  0
          696    V271   64421   65641          0
          779    5409   V1582   V1062          0
          814   27651   V1087   V4364          0
          826    9951     462    2724          0
          833   42789    5409   27801          0
          863    5770   29181    4255          0
You might want to check all diagnosis fields. For example, suppose your study protocol calls for excluding records for patients with a history of malignant cancer (codes starting with V10) or who came to the hospital to give birth (codes starting with V27). While there are different ways to handle multiple diagnosis codes, the fastest way, especially for large datasets, is to use a loop.
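Here we loop through the three diagnosis variables and generate one indicator per variable for whether the code corresponds to malignant cancer or giving birth. The indicators are named excl_dx1, excl_dx2, and excl_dx3. When a diagnosis field is empty, as dx3 is for record 651, the indicator is missing (.).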
    . foreach dxnum of varlist dx1 dx2 dx3 {
      2.     icd9 generate excl_`dxnum' = `dxnum', range(V10* V27*)
      3. }

    . list, clean noobs

        recid     dx1     dx2     dx3   diabetes   excl_dx1   excl_dx2   excl_dx3
           84    4414   99811    4275          0          0          0          0
          105   25013    3572   25063          1          0          0          0
          255   51909    1489    V146          0          0          0          0
          651    9678   E8528                  0          0          0          .
          696    V271   64421   65641          0          1          0          0
          779    5409   V1582   V1062          0          0          0          1
          814   27651   V1087   V4364          0          0          1          0
          826    9951     462    2724          0          0          0          0
          833   42789    5409   27801          0          0          0          0
          863    5770   29181    4255          0          0          0          0
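You can then sum the excl_dx# indicators across each patient record to get a single exclusion variable. The egen function rowtotal() treats missing values as 0, so record 651 receives a total of 0. Because the result is a count, a record that matched in more than one diagnosis field would have a value greater than 1; any nonzero value marks the record for exclusion.
Dropping the excl_dx# variables afterward is not strictly necessary, but they are no longer needed, and removing them saves some space.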
    . egen exclude = rowtotal(excl_dx*)

    . drop excl_dx*

    . list, clean noobs

        recid     dx1     dx2     dx3   diabetes   exclude
           84    4414   99811    4275          0         0
          105   25013    3572   25063          1         0
          255   51909    1489    V146          0         0
          651    9678   E8528                  0         0
          696    V271   64421   65641          0         1
          779    5409   V1582   V1062          0         1
          814   27651   V1087   V4364          0         1
          826    9951     462    2724          0         0
          833   42789    5409   27801          0         0
          863    5770   29181    4255          0         0
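If the next step in the protocol is to remove the flagged records, one possible follow-up is sketched below. It is a minimal illustration, not part of the original example: the three-record dataset is made up so that record 1 matches in two diagnosis fields, and the variable name excl_any is a hypothetical choice. The sketch collapses the row total to a 0/1 flag before keeping the records that pass.

```stata
* Hypothetical three-record dataset (not the data shown above)
clear
input recid str5 dx1 str5 dx2 str5 dx3
1 "V271"  "V1087" "4275"
2 "25013" "3572"  ""
3 "5409"  "V1582" "V1062"
end

* Same loop as above: one indicator per diagnosis field
foreach dxnum of varlist dx1 dx2 dx3 {
    icd9 generate excl_`dxnum' = `dxnum', range(V10* V27*)
}

* rowtotal() counts matches per record; record 1 matches twice here
egen exclude = rowtotal(excl_dx*)
drop excl_dx*

* Collapse the count to a 0/1 flag (excl_any is an assumed name)
generate byte excl_any = exclude > 0
list, clean noobs

* Keep only the records that pass the exclusion criteria
keep if excl_any == 0
```

Testing exclude > 0 rather than exclude == 1 is the design choice that matters here, because a record with two qualifying codes would otherwise slip through.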
The same principles apply to ICD-9-CM procedure codes and to ICD-10 diagnosis codes, so choose the command that is appropriate for the codes that you have.
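As a rough sketch of what that looks like for the other two commands, the example below uses a small made-up dataset. The variable names pr1 and dx10, the new variable names appendix and t2dm, and the sample codes are all assumptions for illustration: codes starting with 47 are ICD-9-CM procedures on the appendix, and codes starting with E11 are ICD-10 diagnoses of type 2 diabetes mellitus.

```stata
* Hypothetical data: pr1 holds ICD-9-CM procedure codes and
* dx10 holds ICD-10 diagnosis codes (names invented for this sketch)
clear
input recid str5 pr1 str5 dx10
1 "4701" "E119"
2 "8154" "I10"
3 "4709" "K358"
end

* Flag procedures on the appendix (codes starting with 47)
icd9p generate appendix = pr1, range(47*)

* Flag type 2 diabetes diagnoses (codes starting with E11)
icd10 generate t2dm = dx10, range(E11*)

list, clean noobs
```

The calls differ from the icd9 example only in the command name and the code list passed to range().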