Receiver operating characteristics (ROC)

Stata’s suite for ROC analysis consists of: roctab, roccomp, rocfit, rocgold, rocreg, and rocregplot.

Stata’s roctab provides nonparametric estimation of the ROC curve, and produces Bamber and Hanley confidence intervals for the area under the ROC curve.
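
As a minimal sketch of how roctab might be called, the following uses the neonatal audiology data that appear later on this page (status d, classifier y1). Applying roctab to these data is an assumption made here for illustration only; unlike the rocreg example below, no cluster() adjustment is made.

```stata
* Sketch only: nonparametric ROC analysis of classifier y1
webuse nnhs, clear

* summary reports the ROC area, its standard error, and a confidence interval
roctab d y1, summary

* bamber requests Bamber standard errors in place of the default
roctab d y1, summary bamber

* graph draws the empirical ROC curve
roctab d y1, graph
```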

Stata’s roccomp provides tests of equality of ROC areas. It can estimate nonparametric and parametric binormal ROC curves.
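
A hedged sketch of an roccomp call, again assuming the nnhs data used later on this page, where the three classifiers y1, y2, and y3 are measured on the same subjects. The choice of data and options is illustrative, not taken from the original examples.

```stata
* Sketch only: test equality of the ROC areas of three classifiers
webuse nnhs, clear

* summary lists each ROC area and reports a chi-squared test of equality
roccomp d y1 y2 y3, summary

* graph overlays the empirical ROC curves for the classifiers being compared
roccomp d y1 y2 y3, graph summary
```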

rocfit fits maximum likelihood ROC models for a single classifier, treating the classifier as an indicator of a latent binormal variable for the true status.
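
A minimal sketch of rocfit follows. It assumes the hanley example dataset (variables disease and rating, an ordinal classifier), which is not otherwise used on this page; any binary status with an ordinal classifier could be substituted.

```stata
* Sketch only: binormal ROC model fit by maximum likelihood
* Assumption: the hanley example data, with status disease and ordinal rating
webuse hanley, clear
rocfit disease rating

* Plot the fitted ROC curve with a confidence band
rocplot, confband
```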

rocgold performs tests of equality of ROC area against a “gold standard” ROC curve and can adjust significance levels for multiple tests across classifiers via Sidak’s correction.

rocreg performs ROC regression; that is, it can adjust both sensitivity and specificity for prognostic factors such as age and gender. It is by far the most general of all the ROC commands.

rocregplot draws ROC curves as modeled by rocreg. ROC curves may be drawn across covariate values, across classifiers, or both.

Let's see it work

Norton et al. (2000) examined a neonatal audiology study on hearing impairment. A hearing test was applied to children aged 30 to 53 months. It is believed that the classifier y1 (DPOAE 65 at 2kHz) becomes more accurate at older ages.

We use rocreg to fit a maximum likelihood model for this situation. The extra effect of current age on y1 when the child has hearing impairment is estimated by specifying roccov(). The control-population effects of the child's current age and gender are estimated with the ctrlcov() option.

. webuse nnhs, clear
(Norton - neonatal audiology data)

. rocreg d y1, roccov(currage) ctrlcov(currage male) cluster(id) probit ml nolog

Covariate control      : linear regression
Control variables      : currage male
Control standardization: normal
ROC method             : parametric               Link: probit

  Status     : d
  Classifiers: y1

  Classifier : y1
  Covariate control adjustment model:
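                                 (Std. err. adjusted for 2,741 clusters in id)
------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
casecov      |
     currage |    .494211   .2463657     2.01   0.045     .0113431     .977079
       _cons |  -15.00403   9.384911    -1.60   0.110    -33.39812    3.390058
-------------+----------------------------------------------------------------
casesd       |
       _cons |    8.49794   .5366836    15.83   0.000      7.44606    9.549821
-------------+----------------------------------------------------------------
ctrlcov      |
     currage |  -.2032048   .0388917    -5.22   0.000     -.279431   -.1269785
        male |   .2369359   .2573664     0.92   0.357     -.267493    .7413648
       _cons |   -1.23534   1.487668    -0.83   0.406    -4.151116    1.680436
-------------+----------------------------------------------------------------
ctrlsd       |
       _cons |   7.749156   .1113006    69.62   0.000     7.531011    7.967301
------------------------------------------------------------------------------

   Status    : d
   ROC Model :
                                 (Std. err. adjusted for 2,741 clusters in id)
------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
y1           |
      i_cons |  -1.765608   1.105393    -1.60   0.110    -3.932138    .4009225
     currage |   .0581566   .0290177     2.00   0.045     .0012828    .1150303
      s_cons |   .9118864   .0586884    15.54   0.000     .7968593    1.026913
------------------------------------------------------------------------------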

The results show that current age has a borderline significant positive effect on the ROC curve (p-value = 0.045). We now use rocregplot to draw the ROC curves for ages of 40 and 50 months, and add some graph options to tidy the legend and place it inside the graph.

. rocregplot, at1(currage=40) at2(currage=50) legend(order(3 "reference" 1 "40 mos." 2 "50 mos.") ring(0) rows(3) pos(5)) title("ROC, by age") xsize(4) ysize(4)

The graph indicates that the area under the curve (AUC) for 50 months is clearly larger than that for 40 months, and this can be formally verified by using testnl after rocreg; see [R] rocregplot for a related example.
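
The AUC comparison can also be sketched by hand. Assuming the probit-link (binormal) form ROC(u) = normal(a + b*invnormal(u)), with a = i_cons + currage*age and b = s_cons, the AUC is normal(a/sqrt(1 + b^2)). The snippet below plugs in the point estimates copied from the ROC Model table above. The scalar names are illustrative, and the result is a pair of point estimates only, not the formal testnl comparison.

```stata
* Point estimates copied from the ROC Model table (rounded as displayed)
scalar roc_icons   = -1.765608
scalar roc_currage = .0581566
scalar roc_scons   = .9118864

* Binormal AUC = normal(a / sqrt(1 + b^2)),
* where a is the intercept at the given age and b is the slope
scalar auc40 = normal((roc_icons + 40*roc_currage) / sqrt(1 + roc_scons^2))
scalar auc50 = normal((roc_icons + 50*roc_currage) / sqrt(1 + roc_scons^2))

display "AUC at 40 months: " %6.4f auc40
display "AUC at 50 months: " %6.4f auc50
```

Under these assumptions, the two values come out at roughly 0.66 and 0.80, in line with the ordering seen in the graph.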

Two other classifiers were examined in the study, y2 (TEOAE 80 at 2kHz) and y3 (ABR). We will use rocgold to compare the ROC areas of y2 and y3, assuming a “gold standard” classifier of y1 (DPOAE 65 at 2kHz). The sidak option provides adjusted p-values, reflecting the two tests that are being performed.

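. rocgold d y1 y2 y3, sidak graph summary aspectratio(1)

-------------------------------------------------------------------------------
                       ROC                                                Sidak
                      area     Std. err.       chi2    df   Pr>chi2     Pr>chi2
-------------------------------------------------------------------------------
y1 (standard)       0.6306       0.0240
y2                  0.6006       0.0250      2.0759     1    0.1496      0.2769
y3                  0.6081       0.0259      0.4931     1    0.4826      0.7323
-------------------------------------------------------------------------------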

We cannot reject the hypotheses that y2 and y3 have the same area as y1. Both the adjusted and unadjusted p-values support this.
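
The Sidak-adjusted p-values can be checked by hand. Assuming the usual Sidak formula for k tests, p_adj = 1 - (1 - p)^k, with k = 2 here, the following sketch recomputes the last column of the table from the unadjusted p-values as displayed.

```stata
* Sidak adjustment for k = 2 comparisons against the gold standard
* Inputs are the rounded Pr>chi2 values from the rocgold table
display "y2: " %6.4f (1 - (1 - .1496)^2)
display "y3: " %6.4f (1 - (1 - .4826)^2)
```

Because the inputs are the rounded p-values from the table, the results can differ from the reported 0.2769 and 0.7323 in the last digit.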

Wieand et al. (1989) examined a pancreatic cancer study. No covariates were recorded, and the study was a case–control study.

We use rocreg to estimate the ROC curve for the classifier y2 (CA 125) that was examined. A nonparametric estimate is used, and we bootstrap to obtain standard errors. We estimate the sensitivity at a specificity of .6 through the roc() option, which takes 1-specificity as its argument, so we specify roc(.4). The partial area under the curve (pAUC), the area under the ROC curve up to a given 1-specificity value, is estimated up to a 1-specificity of .6 (a specificity of .4) with the pauc() option. The case–control sampling of the study is indicated to rocreg via the bootcc option.

. use https://research.fredhutch.org/content/dam/stripe/diagnostic-biomarkers-statistical-center/files/wiedat2b.dta, clear
(S. Wieand - Pancreatic cancer diagnostic marker data)

. rocreg d y2, roc(.4) pauc(.6) bseed(8378923) bootcc nodots

Bootstrap results

Number of strata = 2                            Number of obs     =        141
                                                Replications      =      1,000

Nonparametric ROC estimation

Control standardization: empirical
ROC method             : empirical

ROC curve

   Status    : d
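   Classifier: y2
------------------------------------------------------------------------------
             |    Observed               Bootstrap
         ROC | coefficient       Bias    std. err.     [95% conf. interval]
-------------+----------------------------------------------------------------
          .4 |    .7555556  -.0118111    .0767123     .6052022   .9059089  (N)
             |                                        .5666667   .8666667  (P)
             |                                        .5555556   .8555555 (BC)
------------------------------------------------------------------------------

Partial area under the ROC curve

   Status    : d
   Classifier: y2
------------------------------------------------------------------------------
             |    Observed               Bootstrap
        pAUC | coefficient       Bias    std. err.     [95% conf. interval]
-------------+----------------------------------------------------------------
          .6 |    .3326797   .0033456    .0393666     .2555227   .4098368  (N)
             |                                        .2583878   .4101961  (P)
             |                                        .2419608   .3976471 (BC)
------------------------------------------------------------------------------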

We can use rocregplot to see the ROC curve for y2 (CA 125). We also ask for a normal-based confidence band for the ROC value at a specificity of .6.

. rocregplot, plot1opts(msymbol(i)) legend(order(2 "reference" 1 "CA 125") ring(0) rows(2) pos(5)) xsize(4) ysize(4) title("ROC, CA 125")

References

Norton, S. J., M. P. Gorga, J. E. Widen, R. C. Folsom, Y. Sininger, B. Cone-Wesson, B. R. Vohr, K. Mascher, and K. Fletcher. 2000. Identification of neonatal hearing impairment: Evaluation of transient evoked otoacoustic emission, distortion product otoacoustic emission, and auditory brain stem response test performance. Ear and Hearing 21: 508–528.

Wieand, S., M. H. Gail, B. R. James, and K. L. James. 1989. A family of nonparametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika 76: 585–592.

Pepe, M. S. 2003. The Statistical Evaluation of Medical Tests for Classification and Prediction. New York: Oxford University Press.
