Ashwin Ananthakrishnan wrote:
I'm trying to fit a logistic regression of a dichotomous outcome (disease, Y/N)
on several predictors that are continuous variables (such as age, hemoglobin,
etc.). I am also trying to calculate a risk score in which I would like to
include even the continuous predictors as dichotomous variables (for example,
Age > 45).
How do I statistically select the cut-offs for the continuous predictor
variables? I would like to choose cut-offs that maximize the area under the ROC
curve for each of the predictors.
Is the only way to manually create different cut-offs (for example, age > 40,
age > 42, age > 45, age > 50) and find the area under the ROC curve for each one
by using the lroc command?
Is there a way for Stata to do this automatically and report which cut-off
maximizes the area under the ROC curve for each of the continuous variables?
--------------------------------------------------------------------------------
What you want is not difficult to do in Stata--it might already be available in
a module somewhere. But before you get too far along that path, I recommend
that you search R-Help list archives for Frank E. Harrell, Jr.'s postings there
on the wisdom and utility of doing this. There are alternatives to ROC
AUC-maximizing cut-offs that he recommends as more suitable for risk prediction.
If you have trouble locating the relevant posts, a good source would be his
book, _Regression Modeling Strategies_ (NY: Springer-Verlag, 2001).
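
For reference, the manual scan described in the question can be sketched in a
few lines. This is only an illustration, written in Python rather than Stata,
and it assumes made-up data and hypothetical names (auc_for_cutoff,
best_cutoff, ages, disease). It relies on the fact that the ROC curve of a
single dichotomous predictor has one interior point, so its area is
(sensitivity + specificity) / 2. It does nothing to address the concerns about
data-driven cut-offs raised in the postings mentioned above.

```python
def auc_for_cutoff(values, outcomes, cutoff):
    """Area under the one-point ROC curve for the rule `value > cutoff`."""
    tp = fp = fn = tn = 0
    for value, diseased in zip(values, outcomes):
        flagged = value > cutoff
        if diseased and flagged:
            tp += 1
        elif diseased:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Trapezoid through (0,0), (1 - specificity, sensitivity), (1,1).
    return (sensitivity + specificity) / 2


def best_cutoff(values, outcomes, candidates):
    """Candidate cut-off with the largest AUC (first one wins on ties)."""
    return max(candidates, key=lambda c: auc_for_cutoff(values, outcomes, c))


# Made-up data for illustration only: age and disease status (1 = yes).
ages = [30, 35, 41, 43, 48, 46, 47, 52, 55, 60]
disease = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
candidates = (40, 42, 45, 50)

for c in candidates:
    print(f"age > {c}: AUC = {auc_for_cutoff(ages, disease, c):.3f}")
print("AUC-maximizing cut-off:", best_cutoff(ages, disease, candidates))
```

The same loop could be written in Stata around the lroc command mentioned in
the question; the candidate list here is just the one given there.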
Joseph Coveney