I have generated a "new" predictive model (with multiple variables)
and am interested in comparing the ROC curve from my new model to one
generated using a single "old" variable (which is not incorporated in
my new model). Understanding the downsides of stepwise variable
reduction in prediction models, I have nonetheless chosen to use it.
I've bootstrapped the "optimism" (internal validation) in the AUC from
my new model with the following code:
gen freq = .    // will hold the bootstrap frequency weights
program define optimism, rclass
        version 9
        bsample 1023, weight(freq)    // bootstrap draw stored as frequency weights
        stepwise, pe(0.049) pr(0.05) lr : logit y x1 x2 x3 x4 x5 x6.....x15 [fw=freq]
        lroc, nograph                 // apparent AUC on the bootstrap sample
        return scalar area1 = r(area)
        local a1 = r(area)
        predict p
        roctab y p    /* AUC on the full data using the model derived
                         on the bootstrap sample */
        return scalar area2 = r(area)
        local a2 = r(area)
        return scalar dif = `a1' - `a2'    // optimism for this replication
        drop p
end
simulate area1=r(area1) area2=r(area2) dif=r(dif), reps(200) seed(1234): optimism
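After -simulate- the 200 replication results (area1, area2, dif) are the
data in memory, so I take the mean of dif as the optimism estimate,
with something like:

summarize dif
display "optimism = " %5.3f r(mean)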
I then report the optimism-corrected AUC as the apparent AUC from the
model fitted to the original data, minus this bootstrap optimism:
stepwise, pe(0.049) pr(0.05) lr : logit y x1 x2 x3 x4 x5 x6.....x15
lroc, nograph
The apparent AUC for this model equals 0.77 and the optimism generated
by the above bootstrap = 0.03, so my optimism-corrected estimate is
0.77 - 0.03 = 0.74.
The AUC generated by using the alternative covariate on the same data = 0.70.
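(That figure comes from something like the line below, where oldx is
just a placeholder name for the old variable, which serves as its own
score:

roctab y oldx    // nonparametric AUC of the single old variable
)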
I would like to statistically compare the two AUCs, but the problem is
that by manipulating the AUC after estimation (subtracting the
bootstrap optimism) I can no longer perform the standard statistical
tests of comparison between the two models. Does anyone know of a way
to perform tests of comparison between ROC curves on the same data
after the areas have been manipulated like they have been here?
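For context, if both areas were left uncorrected I would simply compare
the correlated ROC curves with -roccomp- (DeLong et al.) on the same
data, along these lines (p_new is the prediction from the final
stepwise model and oldx again stands in for the old variable):

quietly stepwise, pe(0.049) pr(0.05) lr : logit y x1 x2 x3 x4 x5 x6.....x15
predict p_new                 // predicted probability from the new model
roccomp y p_new oldx          // DeLong test for two correlated ROC areas

My difficulty is that this test applies to the apparent areas, not to
the optimism-corrected 0.74.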
Thanks in advance.
-CC