Stata | FAQ: Labeling ICD codes with their descriptions

How do I label my diagnosis or procedure codes with their descriptions?

Title		Labeling ICD codes with their descriptions
Author		Rebecca Pope, StataCorp

ICD-9-CM and ICD-10 codes are stored as strings, and Stata value labels can be attached only to numeric variables, so you cannot label the codes directly. You can still display their descriptions in one of two ways:

1. Store the descriptions in a new string variable.
2. Create a corresponding numeric variable and label its values.

Suppose you have data containing patient record IDs and ICD-9-CM diagnosis codes that look like this:

     recid      dx  
    150781   9110   
    150913   4241   
    151088   4254   
    151125   9033   
    151154   78650  
    151165   8028   
    151207   51881  
    151344   3051   
    151415   4321   
    151487   V140
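
To follow along without the original file, the listing above can be typed in directly. This is a minimal sketch: the variable labels are taken from the describe output shown later in this FAQ, and recid is left at input's default float type to match it. Note that clear discards any data currently in memory.

```stata
* Recreate the ten example records shown above
* (clear discards the data in memory; save your work first)
clear
input recid str5 dx
150781 "9110"
150913 "4241"
151088 "4254"
151125 "9033"
151154 "78650"
151165 "8028"
151207 "51881"
151344 "3051"
151415 "4321"
151487 "V140"
end

* Variable labels as they appear in the describe output
label variable recid "Patient record ID"
label variable dx "Diagnosis"

list, clean noobs
```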

Option 1: Store the descriptions in a new string variable

Stata's icd9 generate, icd9p generate, and icd10 generate commands, when given the description option, create a new string variable holding the description of each code.

. icd9 generate descr = dx, description

. list, clean noobs

     recid      dx                      descr  
    150781   9110              abrasion trunk  
    150913   4241       aortic valve disorder  
    151088   4254     prim cardiomyopathy nec  
    151125   9033        injury ulnar vessels  
    151154   78650             chest pain nos  
    151165   8028    fx facial bone nec-close  
    151207   51881   acute respiratry failure  
    151344   3051        tobacco use disorder  
    151415   4321         subdural hemorrhage  
    151487   V140       hx-penicillin allergy  

. describe

Contains data from icd9exdata.dta
  obs:            10                          
 vars:             3                          20 Oct 2015 18:02
 size:           330                          (_dta has notes)
-------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
recid           float   %9.0g                 Patient record ID
dx              str5    %9s                   Diagnosis
descr           str24   %24s                  label for dx
-------------------------------------------------------------------------------
Sorted by: recid
     Note: Dataset has changed since last saved.

With the descriptions added, the size of the dataset is 330 bytes, of which the str24 descriptions account for 240 (24 bytes for each of the 10 observations). We may be able to reduce the size of the dataset using encode.
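
The same pattern applies to procedure codes with icd9p generate. The following is a sketch only: the variable name proc and the two codes (37.22 and 88.56, entered without the decimal point) are made up for illustration and are not part of the dataset above. The description text returned depends on the ICD-9-CM edition bundled with your version of Stata, so no output is shown.

```stata
* Hypothetical toy dataset of ICD-9-CM procedure codes
* (clear discards the data in memory; save your work first)
clear
input str4 proc
"3722"
"8856"
end

* Same syntax as icd9 generate, using the procedure-code command
icd9p generate prdescr = proc, description

list, clean noobs
```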

Option 2: Create a corresponding numeric variable and label its values

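To get a labeled numeric variable, first create a string variable with the diagnosis description, then convert it with encode. Here the long option prefixes each description with its code. If descr already exists from Option 1, drop it before running icd9 generate again.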

. icd9 generate descr = dx, description long

. encode descr, generate(dxlabeled) label(descrip)

The new variable is long by default, but we can use compress to make sure it is stored in the smallest possible numeric type.

. compress 
  variable dxlabeled was long now byte
  (30 bytes saved)
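
To see what encode and compress do here, the following sketch uses three of the descriptions from this example. They are typed in by hand rather than produced by icd9 generate, so the sketch does not depend on the ICD-9 tables. encode numbers the distinct strings in alphabetical order and stores the mapping in the value label descrip.

```stata
* Three records typed by hand (clear discards the data in memory)
clear
input str5 dx str30 descr
"9110" "911.0 abrasion trunk"
"4241" "424.1 aortic valve disorder"
"V140" "V14.0 hx-penicillin allergy"
end

* Create a labeled numeric variable from the description strings
encode descr, generate(dxlabeled) label(descrip)

* Show which integer was assigned to which description
label list descrip

* dxlabeled starts as long; compress demotes it to byte here
compress dxlabeled
describe dxlabeled

* The underlying numeric values, without their labels
list dx dxlabeled, clean noobs nolabel
```

Because the integers reflect alphabetical order within one dataset, the same description can receive a different integer in another dataset that contains a different set of codes.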

Finally, drop the string variable descr, which is no longer needed now that dxlabeled carries the descriptions as value labels.

. drop descr

While you could also remove the original, unencoded diagnosis variable, you should keep it if you plan to do data manipulation based on the codes or if you might need to combine your dataset with new data in the future. Our dataset now looks like this:

. list, clean noobs

     recid      dx                         dxlabeled  
    150781   9110              911.0  abrasion trunk  
    150913   4241       424.1  aortic valve disorder  
    151088   4254     425.4  prim cardiomyopathy nec  
    151125   9033        903.3  injury ulnar vessels  
    151154   78650             786.50 chest pain nos  
    151165   8028    802.8  fx facial bone nec-close  
    151207   51881   518.81 acute respiratry failure  
    151344   3051        305.1  tobacco use disorder  
    151415   4321         432.1  subdural hemorrhage  
    151487   V140       V14.0  hx-penicillin allergy

In general, using encode results in a smaller dataset than adding a variable that contains the descriptions.

. describe

Contains data from icd9exdata.dta
  obs:            10                          
 vars:             3                          20 Oct 2015 18:02
 size:           100                          (_dta has notes)
-------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
recid           float   %9.0g                 Patient record ID
dx              str5    %9s                   Diagnosis
dxlabeled       byte    %32.0g     descrip    label for dx
-------------------------------------------------------------------------------
Sorted by: recid
     Note: Dataset has changed since last saved.

The version of our dataset after using encode is only 100 bytes.
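
The two sizes reported by describe can be checked by hand: in both cases the figure equals the number of observations times the combined width of the variables. A quick check using display, with the byte widths implied by the storage types shown above:

```stata
* Option 1: float (4) + str5 (5) + str24 (24) bytes per observation
display 10 * (4 + 5 + 24)

* Option 2: float (4) + str5 (5) + byte (1) bytes per observation
display 10 * (4 + 5 + 1)
```

These evaluate to 330 and 100, matching the describe output for each option.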
