Dear Statalist:
Has anyone developed faster versions of roctab or roccomp (using DeLong et
al.'s estimated standard errors)? The versions available in Stata 8 (as
well as in earlier versions of Stata) run very slowly when used with large
samples (e.g., 500-1000 observations). For example, in simulations I have
been performing, it generally takes about 24 hours to complete 1000
repetitions of roccomp in samples consisting of 1000 observations each. I
suspect the problem is related to the fact that all possible pairwise
comparisons are being made between scores for true-positive and
true-negative subjects.
I think James's problem is indeed that the calculation of the ROC curve
uses loops over observations. This is definitely a problem with my
-somersd- package, downloadable from SSC, which calculates Somers' D.
Somers' D is related to the ROC area by the formula D=2A-1 (where D is
Somers' D and A is the ROC area). In the long run I plan to write a
successor package to -somersd- using plugins, but I do not expect this to
happen immediately. I
think ROC curve calculation might equally be improved by using plugins, but
I don't know if StataCorp plan to do this soon.
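
To make the relation D=2A-1 and the cost of pairwise comparisons concrete,
here is a minimal sketch in Python (not Stata, and not how -roctab-,
-roccomp- or -somersd- are actually implemented). The function names and the
example scores are my own illustrative assumptions. It computes the ROC area
once by comparing every true-positive score with every true-negative score,
and once from midranks of the pooled scores (the Mann-Whitney form), which
needs only a sort. It covers the area only, not DeLong et al.'s standard
errors.

```python
from itertools import product


def auc_pairwise(pos, neg):
    """ROC area from all positive/negative pairs: O(n_pos * n_neg) comparisons."""
    total = 0.0
    for p, n in product(pos, neg):
        if p > n:
            total += 1.0   # concordant pair
        elif p == n:
            total += 0.5   # tied pair counts as half
    return total / (len(pos) * len(neg))


def auc_ranks(pos, neg):
    """Same ROC area from midranks of the pooled scores: O(n log n) for the sort."""
    pooled = sorted([(s, 1) for s in pos] + [(s, 0) for s in neg])
    rank_sum_pos = 0.0
    i = 0
    while i < len(pooled):
        # Find the block of tied scores starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += midrank * sum(label for _, label in pooled[i:j])
        i = j
    n_pos, n_neg = len(pos), len(neg)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0  # Mann-Whitney U for positives
    return u / (n_pos * n_neg)


def somers_d(auc):
    """Somers' D from the ROC area, using D = 2A - 1."""
    return 2.0 * auc - 1.0


# Hypothetical scores for true-positive and true-negative subjects.
pos_scores = [0.9, 0.8, 0.4, 0.6]
neg_scores = [0.1, 0.4, 0.35, 0.7]

a_pairwise = auc_pairwise(pos_scores, neg_scores)
a_ranks = auc_ranks(pos_scores, neg_scores)
d = somers_d(a_ranks)
```

Both functions should return the same area on any data, ties included. The
rank-based version avoids the loop over all pairs, which is where I would
expect most of the time to go in samples of 1000 observations.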