Title | Missing standard errors reported by stcox
Author | Mario Cleves, StataCorp
There are two major reasons for missing standard errors in a Cox proportional hazards regression. The first is failure to converge. Although this is rare, if the message “nonconcave function encountered” or “unproductive step attempted” appears in the last step of the iteration log, then the estimation procedure did not converge to the MLE, and the results cannot be trusted.
Missing standard errors in a Cox proportional hazards regression, however, are more often due to one of four types of collinearity:
1) Covariate is collinear with the dead/censor variable.
This results in a hazard ratio of infinity (displayed as a very large number) and a missing standard error if the collinearity is positive, or a hazard ratio of zero (a large negative coefficient, displayed as a number very close to zero) and a missing standard error if the collinearity is negative.
. webuse cancer
(Patient Survival in Drug Trial)

. stset studytime, f(died)

     failure event:  died != 0 & died < .
obs. time interval:  (0, studytime]
 exit on or before:  failure

------------------------------------------------------------------------------
         48  total obs.
          0  exclusions
------------------------------------------------------------------------------
         48  obs. remaining, representing
         31  failures in single record/single failure data
        744  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        39

. generate copy=_d

. stcox age drug copy, exactp nolog

         failure _d:  died
   analysis time _t:  studytime

Cox regression -- exact partial likelihood

No. of subjects =           48                     Number of obs   =        48
No. of failures =           31
Time at risk    =          744
                                                   LR chi2(3)      =     59.38
Log likelihood  =   -62.481243                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.090842    .042937     2.21   0.027     1.009851    1.178328
        drug |   .2851362   .1067876    -3.35   0.001     .1368565    .5940725
        copy |   5.28e+15          .        .       .            .           .
------------------------------------------------------------------------------

. generate negcopy=-_d

. stcox age drug negcopy, exactp nolog

         failure _d:  died
   analysis time _t:  studytime

Cox regression -- exact partial likelihood

No. of subjects =           48                     Number of obs   =        48
No. of failures =           31
Time at risk    =          744
                                                   LR chi2(3)      =     59.38
Log likelihood  =   -62.481243                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.090842    .042937     2.21   0.027     1.009851    1.178328
        drug |   .2851362   .1067876    -3.35   0.001     .1368565    .5940725
     negcopy |   1.89e-16          .        .       .            .           .
------------------------------------------------------------------------------
2) Covariate is collinear with the time variable.
This results in a hazard ratio close to one (coefficient close to zero) and a missing standard error.
. clear

. set obs 1000
obs was 0, now 1000

. generate t=_n

. stset t

     failure event:  (assumed to fail at time=t)
obs. time interval:  (0, t]
 exit on or before:  failure

------------------------------------------------------------------------------
       1000  total obs.
          0  exclusions
------------------------------------------------------------------------------
       1000  obs. remaining, representing
       1000  failures in single record/single failure data
     500500  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =      1000

. generate copy=_t

. stcox copy

         failure _d:  1 (meaning all fail)
   analysis time _t:  t

Iteration 0:   log likelihood = -5912.1282
Iteration 1:   log likelihood = -4537.5754
Iteration 2:   log likelihood = -3821.8484
Iteration 3:   log likelihood = -3430.1547
Iteration 4:   log likelihood = -3427.9073
Iteration 5:   log likelihood = -3344.6335
Refining estimates:
Iteration 0:   log likelihood = -3312.0701
Iteration 1:   log likelihood = -2920.7381
Iteration 2:   log likelihood = -2709.5843
Iteration 3:   log likelihood = -2701.5327

Cox regression -- no ties

No. of subjects =         1000                     Number of obs   =      1000
No. of failures =         1000
Time at risk    =       500500
                                                   LR chi2(1)      =   6421.19
Log likelihood  =   -2701.5327                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        copy |   .9343625          .        .       .            .           .
------------------------------------------------------------------------------
3) Covariate is collinear with the entry-time variable.
This results in a hazard ratio close to one (coefficient close to zero) and a missing standard error.
. clear

. set obs 1000
obs was 0, now 1000

. generate t0=_n-5

. generate t=_n

. stset t, enter(t0)

     failure event:  (assumed to fail at time=t)
obs. time interval:  (0, t]
 enter on or after:  time t0
 exit on or before:  failure

------------------------------------------------------------------------------
       1000  total obs.
          0  exclusions
------------------------------------------------------------------------------
       1000  obs. remaining, representing
       1000  failures in single record/single failure data
       4990  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =      1000

. generate copy=_t0

. stcox copy

         failure _d:  1 (meaning all fail)
   analysis time _t:  t
  enter on or after:  time t0

Iteration 0:   log likelihood = -1606.1782
Iteration 1:   log likelihood = -1545.3983
Iteration 2:   log likelihood = -1540.1655
Refining estimates:
Iteration 0:   log likelihood = -1541.2987
Iteration 1:   log likelihood = -1484.0017
Iteration 2:   log likelihood = -1473.3656
Iteration 3:   log likelihood = -1470.0384
Iteration 4:   log likelihood = -1469.6364
Iteration 5:   log likelihood =  -1463.425

Cox regression -- no ties

No. of subjects =         1000                     Number of obs   =      1000
No. of failures =         1000
Time at risk    =         4990
                                                   LR chi2(1)      =    285.51
Log likelihood  =    -1463.425                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        copy |   .9293137          .        .       .            .           .
------------------------------------------------------------------------------
4) Covariate does not vary within death event risk sets.
This is a complicated form of collinearity wherein a covariate varies overall, but for each death event, it does not vary within the associated risk set.
This results in a hazard ratio of one (coefficient is zero) and a missing standard error.
. clear

. input id t0 t dead x

            id         t0          t       dead          x
  1. 1 0 1 1 6.18
  2. 2 0.5 1 1 6.18
  3. 3 1 6 1 5.55
  4. 4 3 7 0 5.55
  5. end

. stset t, failure(dead) enter(t0)

     failure event:  dead != 0 & dead < .
obs. time interval:  (0, t]
 enter on or after:  time t0
 exit on or before:  failure

------------------------------------------------------------------------------
          4  total obs.
          0  exclusions
------------------------------------------------------------------------------
          4  obs. remaining, representing
          3  failures in single record/single failure data
       10.5  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =         7

. list

     +-------------------------------------------------+
     | id   t0   t   dead      x   _st   _d   _t   _t0 |
     |-------------------------------------------------|
  1. |  1    0   1      1   6.18     1    1    1     0 |
  2. |  2   .5   1      1   6.18     1    1    1    .5 |
  3. |  3    1   6      1   5.55     1    1    6     1 |
  4. |  4    3   7      0   5.55     1    0    7     3 |
     +-------------------------------------------------+

. stcox x

         failure _d:  dead
   analysis time _t:  t
  enter on or after:  time t0

Iteration 0:   log likelihood = -2.0794415
Refining estimates:
Iteration 0:   log likelihood = -2.0794415

Cox regression -- Breslow method for ties

No. of subjects =            4                     Number of obs   =         4
No. of failures =            3
Time at risk    =         10.5
                                                   LR chi2(1)      =      0.00
Log likelihood  =   -2.0794415                     Prob > chi2     =    1.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |          1          .        .       .            .           .
------------------------------------------------------------------------------
Coefficients for the variables that have (any form of) collinearity cannot be estimated. Leaving them in or deleting them from the model results in the same likelihood value and does not alter the results for the noncollinear variables.
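As a rough screen for the first three forms, one option is to correlate the covariates with the variables that stset creates. The following is only a sketch using the cancer data from the first example; the variable name copy is illustrative, and a correlation of 1 or -1 with _d, _t, or _t0 should be read as a warning sign rather than a formal test.

```stata
* Screening sketch (an assumption-laden illustration, not a formal test):
* correlations of +1 or -1 with _d, _t, or _t0 suggest one of the
* first three forms of collinearity.
webuse cancer, clear
stset studytime, failure(died)

* Illustrative covariate that duplicates the failure indicator
generate copy = _d

* _t0 is constant here (no delayed entry), so its correlations are
* displayed as missing
correlate age drug copy _d _t _t0
```

In this dataset, copy is an exact duplicate of _d, so their correlation should be 1, which is the pattern that produced the missing standard error in the first example.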
Although the first three forms of collinearity can be easily assessed, the fourth requires that the appropriate risk sets be formed. This task is facilitated by the program st_rpool, written by Bill Gould, which can be downloaded from Stata’s website.
To obtain st_rpool, type in Stata:
. net from http://www.stata.com
. net cd users/wgould
. net describe st_rpool
. net install st_rpool
Let’s use st_rpool to look at the values of the covariate x in the risk sets:
. clear

. input id t0 t dead x

            id         t0          t       dead          x
  1. 1 0 1 1 6.18
  2. 2 0.5 1 1 6.18
  3. 3 1 6 1 5.55
  4. 4 3 7 0 5.55
  5. end

. stset t, failure(dead) enter(t0)

     failure event:  dead != 0 & dead < .
obs. time interval:  (0, t]
 enter on or after:  time t0
 exit on or before:  failure

------------------------------------------------------------------------------
          4  total obs.
          0  exclusions
------------------------------------------------------------------------------
          4  obs. remaining, representing
          3  failures in single record/single failure data
       10.5  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =         7

. list

     +-------------------------------------------------+
     | id   t0   t   dead      x   _st   _d   _t   _t0 |
     |-------------------------------------------------|
  1. |  1    0   1      1   6.18     1    1    1     0 |
  2. |  2   .5   1      1   6.18     1    1    1    .5 |
  3. |  3    1   6      1   5.55     1    1    6     1 |
  4. |  4    3   7      0   5.55     1    0    7     3 |
     +-------------------------------------------------+

. st_rpool set

. sort set id

. list, sepby(set)

     +--------------------------------------+
     | id   t0      x   _d   _t   _t0   set |
     |--------------------------------------|
  1. |  1    0   6.18    1    1     0     1 |
  2. |  2   .5   6.18    1    1    .5     1 |
     |--------------------------------------|
  3. |  3    1   5.55    1    6     1     2 |
  4. |  4    3   5.55    0    7     3     2 |
     +--------------------------------------+
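With larger datasets, inspecting the listing by eye is impractical. A minimal sketch of an automated check follows, assuming st_rpool has just been run as above so that the variable set identifies the risk sets. The variable name varies is hypothetical and is introduced only for this illustration.

```stata
* Assumes st_rpool has already created the risk-set identifier "set"
* and that x is the covariate of interest.
* Within each risk set, sort on x and compare the smallest and largest
* values; varies is 1 if x takes more than one value in that risk set.
bysort set (x): generate byte varies = (x[1] != x[_N])

* If varies is 0 in every risk set, x does not vary within risk sets
tabulate set varies
```

If varies is 0 throughout, as the listing above suggests it would be for these four observations, the covariate is constant within every risk set and its coefficient cannot be estimated.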