Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: ROC-curves
From: "Roger B. Newson" <[email protected]>
To: [email protected]
Subject: Re: st: ROC-curves
Date: Mon, 21 Oct 2013 22:08:40 +0100
The main problem with confidence intervals for the area under a ROC
curve generated from a logistic regression is that, if you estimate the
ROC area from the same data to which you fitted the logistic regression
model, then the estimate will probably be over-optimistic, because the
parameters have been chosen to fit specifically that set of data. If you
want your ROC area to have confidence limits which you can really be
confident about, then it is a good idea to randomize your data into a
training set and a test set, to fit your logistic model to the training
set, and to estimate its ROC area using out-of-sample prediction in the
test set.
Newson (2010) discusses these issues with Cox regression and other
survival models. As stated in the first paragraph of Section 5 of this
reference, the procedure with non-survival models (like logistic
regression) is similar, but simpler.
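The training-set/test-set procedure described above might be sketched
as follows. This is only an illustration under stated assumptions: the
lbw example dataset (loaded with -webuse-, which needs an internet
connection), the predictors age, lwt and smoke, the seed, and the 50/50
split are all arbitrary choices made for the example, not part of the
original question.

```stata
* Sketch only: dataset, predictors, seed and split fraction are assumptions
webuse lbw, clear
set seed 20131021

* Randomly allocate about half of the subjects to the training set
generate byte train = runiform() < 0.5

* Fit the logistic model in the training set only
logistic low age lwt smoke if train

* Out-of-sample predicted probabilities for the test set
predict pred_test if !train

* ROC area with standard error and confidence limits, test set only
roctab low pred_test if !train
```

Because the test-set subjects played no part in choosing the model
parameters, the confidence limits reported by -roctab- here should not
share the optimism of limits computed from the fitting data.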
I hope this helps.
Best wishes
Roger
References
Newson RB. Comparing the predictive power of survival models using
Harrell’s c or Somers’ D. The Stata Journal 2010; 10(3): 339–358.
Download from
http://www.stata-journal.com/article.html?article=st0198
Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/
Opinions expressed are those of the author, not of the institution.
On 18/10/2013 09:23, Seed, Paul wrote:
On 14/10/2013 18:54, Ragnhild Bergene Skråstad wrote:
> Hi!
> I am investigating how different tests, in combination, can predict a
> given outcome.
>
> I have fitted a logistic model with the command "logistic" and plotted
> the ROC curve with the command "lroc". This gave me the ROC curve and
> the AUC. I wonder:
> - How can I get the 95% CI for this AUC?
> and
> - I would like to get the sensitivity at a given fixed false-positive
> rate (FPR). Do I have to get all the coordinates on the ROC curve and
> identify the one at the FPR of interest, and if so, how do I do that,
> or is there a direct way to do this?
> Best wishes
> Ragnhild B Skråstad
The simplest way to get a CI for the area under the ROC curve following
logistic regression is to use -predict- and -roctab-:
* Start Stata commands *
logistic outcome <predictors>
capture drop pred
predict pred
roctab outcome pred
* End Stata commands *
* outcome and <predictors> are replaced as appropriate.
Much quicker and less trouble than bootstrapping.
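To find the cutpoint for a given sensitivity, you can use -centile-
with -if-. Taking a test as positive when pred is at or above the
cutpoint, 90% sensitivity corresponds to the 10th centile of pred among
the cases:
centile pred if outcome == 1, centile(10)
Likewise, 90% specificity corresponds to the 90th centile among the
non-cases:
centile pred if outcome == 0, centile(90)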
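For the original question (sensitivity at a fixed false-positive rate),
one possible sketch combines -centile- with -summarize-. It assumes
that a subject tests positive when the predicted probability is at or
above the cutpoint. The lbw example dataset, the predictors, the 10%
false-positive rate, and the names cut and testpos are illustrative
assumptions only.

```stata
* Sketch only: dataset, predictors and the 10% FPR are assumptions
webuse lbw, clear
logistic low age lwt smoke
predict pred

* Cutpoint = 90th centile of pred among non-cases, so that about 10%
* of non-cases lie above it (false-positive rate of about 10%)
centile pred if low == 0, centile(90)
scalar cut = r(c_1)

* Flag subjects who test positive at that cutpoint
generate byte testpos = pred >= scalar(cut) if !missing(pred)

* Sensitivity = proportion of cases who test positive
summarize testpos if low == 1
display "Sensitivity at FPR of about 10%: " r(mean)
```

With tied predictions or a small number of non-cases, the achieved
false-positive rate may differ a little from the target. Alternatively,
-roctab- with the detail option lists sensitivity and specificity at
every observed cutpoint.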
Best wishes,
Paul T Seed, Women's Health, KCL
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/