Stata’s suite for ROC analysis consists of: roctab, roccomp, rocfit, rocgold, rocreg, and rocregplot.
Stata’s roctab provides nonparametric estimation of the ROC curve, and produces Bamber and Hanley confidence intervals for the area under the ROC curve.
Stata’s roccomp provides tests of equality of ROC areas. It can estimate nonparametric and parametric binormal ROC curves.
rocfit fits maximum likelihood models for a single classifier, an indicator of the latent binormal variable for the true status.
rocgold performs tests of equality of ROC area against a “gold standard” ROC curve and can adjust significance levels for multiple tests across classifiers via Sidak’s correction.
rocreg performs ROC regression; that is, it can adjust both sensitivity and specificity for prognostic factors such as age and gender. It is by far the most general of all the ROC commands.
rocregplot draws ROC curves as modeled by rocreg. ROC curves may be drawn across covariate values, across classifiers, or both.
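As a rough orientation to the simpler commands in the list above, the sketch below runs roctab and roccomp on the audiology dataset that appears later on this page. It is illustrative only: it assumes the variables d, y1, y2, and y3 from that dataset, and it ignores the clustering of observations within child, so the reported standard errors should not be taken at face value.

```stata
* Illustrative sketch; dataset and variable names are those used later on this page
webuse nnhs, clear

* Nonparametric ROC area for y1 with Bamber standard errors
roctab d y1, summary bamber

* Same area with Hanley standard errors
roctab d y1, summary hanley

* Test whether y1, y2, and y3 share a common ROC area
roccomp d y1 y2 y3, summary
```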
Norton et al. (2000) examined a neonatal audiology study on hearing impairment. A hearing test was applied to children aged 30 to 53 months. It is believed that the classifier y1 (DPOAE 65 at 2kHz) becomes more accurate at older ages.
We use rocreg to fit a maximum likelihood model for this situation. The extra effect of current age on y1 when the child has hearing impairment is estimated by specifying roccov(). The control population effect of current age and gender of the child is estimated with the ctrlcov() option.
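. webuse nnhs, clear
(Norton - neonatal audiology data)

. rocreg d y1, roccov(currage) ctrlcov(currage male) cluster(id) probit ml nolog

Covariate control      : linear regression
Control variables      : currage male
Control standardization: normal
ROC method             : parametric                  Link: probit

Status     : d
Classifiers: y1

Classifier : y1
Covariate control adjustment model:

                                 (Std. err. adjusted for 2,741 clusters in id)
------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
casecov      |
     currage |    .494211   .2463657     2.01   0.045     .0113431     .977079
       _cons |  -15.00403   9.384911    -1.60   0.110    -33.39812    3.390058
-------------+----------------------------------------------------------------
casesd       |
       _cons |    8.49794   .5366836    15.83   0.000      7.44606    9.549821
-------------+----------------------------------------------------------------
ctrlcov      |
     currage |  -.2032048   .0388917    -5.22   0.000     -.279431   -.1269785
        male |   .2369359   .2573664     0.92   0.357     -.267493    .7413648
       _cons |   -1.23534   1.487668    -0.83   0.406    -4.151116    1.680436
-------------+----------------------------------------------------------------
ctrlsd       |
       _cons |   7.749156   .1113006    69.62   0.000     7.531011    7.967301
------------------------------------------------------------------------------

------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
y1           |
      i_cons |  -1.765608   1.105393    -1.60   0.110    -3.932138    .4009225
     currage |   .0581566   .0290177     2.00   0.045     .0012828    .1150303
      s_cons |   .9118864   .0586884    15.54   0.000     .7968593    1.026913
------------------------------------------------------------------------------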
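The two tables appear to be linked by simple ratios: dividing the case-model coefficients and the control standard deviation by the case standard deviation gives values that line up with the y1 ROC coefficients. The sketch below checks this arithmetic with numbers copied from the output. The scalar names are made up for illustration, and the relationship is inferred from the reported values rather than stated in the text.

```stata
* Values copied from the covariate control adjustment table
* (scalar names are hypothetical, chosen for this illustration)
scalar case_age  = .494211
scalar case_cons = -15.00403
scalar case_sd   = 8.49794
scalar ctrl_sd   = 7.749156

* Ratios to compare with i_cons, currage, and s_cons in the y1 table
scalar r_icons = case_cons / case_sd
scalar r_age   = case_age  / case_sd
scalar r_scons = ctrl_sd   / case_sd

display "compare with i_cons : " r_icons
display "compare with currage: " r_age
display "compare with s_cons : " r_scons
```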
The results show us that current age has a borderline significant positive effect on the ROC curve (p-value = 0.045). We now use rocregplot to draw the ROC curves for ages of 50 and 40 months, and add some graph options to make the legend pretty and place it inside the graph.
. rocregplot, at1(currage=40) at2(currage=50) legend(order(3 "reference" 1 "40 mos." 2 "50 mos.") ring(0) rows(3) pos(5)) title("ROC, by age") xsize(4) ysize(4)
The graph indicates that the area under the curve (AUC) for 50 months is clearly larger than that for 40 months, and this can be formally verified by using testnl after rocreg; see [R] rocregplot for a related example.
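To make the age comparison concrete, the following sketch plugs the reported y1 coefficients into the usual binormal (probit-link) form, ROC(u) = Φ(a + b·Φ⁻¹(u)), where a is the intercept at a given age and b is the slope, and uses the standard binormal result AUC = Φ(a / √(1 + b²)). It is a point-estimate calculation only, under the assumption that i_cons and s_cons play the roles of intercept and slope; it is not a substitute for the formal test mentioned above. The scalar names are illustrative.

```stata
* ROC coefficients for y1, copied from the rocreg output above
scalar i_cons = -1.765608
scalar b_age  = .0581566
scalar s_cons = .9118864

* ROC value (sensitivity) at a false-positive rate of .2, by age
display "ROC(.2), 40 mos.: " normal(i_cons + b_age*40 + s_cons*invnormal(.2))
display "ROC(.2), 50 mos.: " normal(i_cons + b_age*50 + s_cons*invnormal(.2))

* Binormal AUC = Phi(a / sqrt(1 + b^2)), with a the intercept at that age
scalar auc40 = normal((i_cons + b_age*40) / sqrt(1 + s_cons^2))
scalar auc50 = normal((i_cons + b_age*50) / sqrt(1 + s_cons^2))
display "AUC, 40 mos.: " auc40
display "AUC, 50 mos.: " auc50
```

Because the age coefficient is positive, the implied AUC at 50 months exceeds that at 40 months, in line with the graph.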
Two other classifiers were examined in the study, y2 (TEOAE 80 at 2kHz) and y3 (ABR). We will use rocgold to compare the ROC areas of y2 and y3, assuming a “gold standard” classifier of y1 (DPOAE 65 at 2kHz). The sidak option provides adjusted p-values, reflecting the two tests that are being performed.
. rocgold d y1 y2 y3, sidak graph summary aspectratio(1)
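-------------------------------------------------------------------------------
                       ROC                                                Sidak
                      area    Std. err.        chi2    df   Pr>chi2     Pr>chi2
-------------------------------------------------------------------------------
y1 (standard)       0.6306       0.0240
y2                  0.6006       0.0250      2.0759     1    0.1496      0.2769
y3                  0.6081       0.0259      0.4931     1    0.4826      0.7323
-------------------------------------------------------------------------------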
We cannot reject the hypotheses that y2 and y3 have the same area as y1. Both the adjusted and unadjusted p-values support this.
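The Sidak adjustment itself is simple arithmetic: with k tests, an unadjusted p-value p becomes 1 − (1 − p)^k. As a quick check, this sketch applies that formula to the unadjusted p-values in the table with k = 2. The scalar names are illustrative, and because the inputs are the rounded values shown above, the results should agree with the Sidak column only up to rounding.

```stata
* Unadjusted p-values copied from the rocgold table
scalar p_y2 = 0.1496
scalar p_y3 = 0.4826

* Number of comparisons against the gold standard
scalar k = 2

* Sidak adjustment: 1 - (1 - p)^k
scalar sidak_y2 = 1 - (1 - p_y2)^k
scalar sidak_y3 = 1 - (1 - p_y3)^k
display "Sidak-adjusted p, y2: " sidak_y2
display "Sidak-adjusted p, y3: " sidak_y3
```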
Wieand et al. (1989) examined a pancreatic cancer study. No covariates were recorded, and the study was a case–control study.
We use rocreg to estimate the ROC curve for the classifier y2 (CA 125) that was examined. A nonparametric estimate is used, and we bootstrap to obtain standard errors. We estimate the sensitivity at a specificity of .6 through the roc() option, which takes 1 - specificity as its argument. The partial area under the curve (pAUC), the area under the ROC curve up to a given 1 - specificity value, is estimated for a specificity of .4 with the pauc() option. The case–control sampling of the study is indicated to rocreg via the bootcc option.
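. use https://research.fredhutch.org/content/dam/stripe/diagnostic-biomarkers-statistical-center/files/wiedat2b.dta, clear
(S. Wieand - Pancreatic cancer diagnostic marker data)

. rocreg d y2, roc(.4) pauc(.6) bseed(8378923) bootcc nodots

Bootstrap results
                                                Number of strata =     2
                                                Number of obs    =   141
                                                Replications     = 1,000

Nonparametric ROC estimation

Control standardization: empirical
ROC method             : empirical

ROC curve

   Status    : d
   Classifier: y2

---------------------------------------------------------------------------
             |    Observed               Bootstrap
         ROC | coefficient       Bias   std. err.     [95% conf. interval]
-------------+-------------------------------------------------------------
          .4 |    .7555556  -.0118111    .0767123      .6052022    .9059089  (N)
             |                                         .5666667    .8666667  (P)
             |                                         .5555556    .8555555 (BC)
---------------------------------------------------------------------------

---------------------------------------------------------------------------
             |    Observed               Bootstrap
        pAUC | coefficient       Bias   std. err.     [95% conf. interval]
-------------+-------------------------------------------------------------
          .6 |    .3326797   .0033456    .0393666      .2555227    .4098368  (N)
             |                                         .2583878    .4101961  (P)
             |                                         .2419608    .3976471 (BC)
---------------------------------------------------------------------------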
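The rows marked (N) are normal-based intervals, which can be reproduced from the observed coefficient and the bootstrap standard error as estimate ± z × std. err., with z the .975 normal quantile. The sketch below does this for both the ROC value and the pAUC, using numbers copied from the output; the scalar names are illustrative.

```stata
* .975 quantile of the standard normal distribution
scalar z = invnormal(.975)

* ROC value at 1 - specificity = .4: observed coefficient and bootstrap std. err.
scalar roc_lo = .7555556 - z*.0767123
scalar roc_hi = .7555556 + z*.0767123
display "ROC(.4)  normal 95% CI: " roc_lo "  " roc_hi

* pAUC up to 1 - specificity = .6
scalar pauc_lo = .3326797 - z*.0393666
scalar pauc_hi = .3326797 + z*.0393666
display "pAUC(.6) normal 95% CI: " pauc_lo "  " pauc_hi
```

The percentile (P) and bias-corrected (BC) intervals depend on the bootstrap replicates themselves and cannot be recovered from the summary numbers alone.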
We can use rocregplot to see the ROC curve for y2 (CA 125). We also ask for a normal-based confidence band for the ROC value at the specificity of .6.
. rocregplot, plot1opts(msymbol(i)) legend(order(2 "reference" 1 "CA 125") ring(0) rows(2) pos(5)) xsize(4) ysize(4) title("ROC, CA 125")
Norton, S. J., M. P. Gorga, J. E. Widen, R. C. Folsom, Y. Sininger, B. Cone-Wesson, B. R. Vohr, K. Mascher, and K. Fletcher. 2000. Identification of neonatal hearing impairment: Evaluation of transient evoked otoacoustic emission, distortion product otoacoustic emission, and auditory brain stem response test performance. Ear and Hearing 21: 508–528.
Wieand, S., M. H. Gail, B. R. James, and K. L. James. 1989. A family of nonparametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika 76: 585–592.
Pepe, M. S. 2003. The Statistical Evaluation of Medical Tests for Classification and Prediction. New York: Oxford University Press.