| Title | Missing standard errors reported by stcox |
| Author | Mario Cleves, StataCorp |
There are two major reasons for missing standard errors in a Cox proportional hazards regression. The first is failure to converge. This is rare, but if the message “nonconcave function encountered” or “unproductive step attempted” appears at the last step of the iteration log, then the estimation procedure did not converge to the MLE and the results cannot be trusted.
Missing standard errors in a Cox proportional hazards regression, however, are more often due to one of four types of collinearity:
1) Covariate is collinear with the dead/censor variable.
If the collinearity is positive, this results in an infinite hazard ratio (printed as a very large number) and a missing standard error. If the collinearity is negative, it results in a hazard ratio of zero (printed as a very small number, corresponding to a large negative coefficient) and a missing standard error.
. webuse cancer
(Patient Survival in Drug Trial)
. stset studytime, f(died)
failure event: died != 0 & died < .
obs. time interval: (0, studytime]
exit on or before: failure
------------------------------------------------------------------------------
48 total obs.
0 exclusions
------------------------------------------------------------------------------
48 obs. remaining, representing
31 failures in single record/single failure data
744 total analysis time at risk, at risk from t = 0
earliest observed entry t = 0
last observed exit t = 39
. generate copy=_d
. stcox age drug copy, exactp nolog
failure _d: died
analysis time _t: studytime
Cox regression -- exact partial likelihood
No. of subjects = 48 Number of obs = 48
No. of failures = 31
Time at risk = 744
LR chi2(3) = 59.38
Log likelihood = -62.481243 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | 1.090842 .042937 2.21 0.027 1.009851 1.178328
drug | .2851362 .1067876 -3.35 0.001 .1368565 .5940725
copy | 5.28e+15 . . . . .
------------------------------------------------------------------------------
. generate negcopy=-_d
. stcox age drug negcopy, exactp nolog
failure _d: died
analysis time _t: studytime
Cox regression -- exact partial likelihood
No. of subjects = 48 Number of obs = 48
No. of failures = 31
Time at risk = 744
LR chi2(3) = 59.38
Log likelihood = -62.481243 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | 1.090842 .042937 2.21 0.027 1.009851 1.178328
drug | .2851362 .1067876 -3.35 0.001 .1368565 .5940725
negcopy | 1.89e-16 . . . . .
------------------------------------------------------------------------------
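One quick way to spot this form of collinearity before fitting the model is to compare the suspect covariate with the failure indicator directly. The lines below are only a sketch: they assume the cancer data are loaded and stset as above and that copy has been generated; any other covariate name can be substituted.

```stata
* Sketch: check whether a covariate is determined by the failure indicator _d
* (assumes the data are stset and the variable copy exists, as in the example)
tabulate copy _d if _st == 1
correlate copy _d if _st == 1
```

If every observation falls in a single cell per row of the table, or the correlation is 1 or -1, the covariate carries the same information as _d and its coefficient cannot be estimated.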
2) Covariate is collinear with the time variable.
This results in a hazard ratio close to one (a coefficient close to zero) and a missing standard error.
. clear
. set obs 1000
obs was 0, now 1000
. generate t=_n
. stset t
failure event: (assumed to fail at time=t)
obs. time interval: (0, t]
exit on or before: failure
------------------------------------------------------------------------------
1000 total obs.
0 exclusions
------------------------------------------------------------------------------
1000 obs. remaining, representing
1000 failures in single record/single failure data
500500 total analysis time at risk, at risk from t = 0
earliest observed entry t = 0
last observed exit t = 1000
. generate copy=_t
. stcox copy
failure _d: 1 (meaning all fail)
analysis time _t: t
Iteration 0: log likelihood = -5912.1282
Iteration 1: log likelihood = -4537.5754
Iteration 2: log likelihood = -3821.8484
Iteration 3: log likelihood = -3430.1547
Iteration 4: log likelihood = -3427.9073
Iteration 5: log likelihood = -3344.6335
Refining estimates:
Iteration 0: log likelihood = -3312.0701
Iteration 1: log likelihood = -2920.7381
Iteration 2: log likelihood = -2709.5843
Iteration 3: log likelihood = -2701.5327
Cox regression -- no ties
No. of subjects = 1000 Number of obs = 1000
No. of failures = 1000
Time at risk = 500500
LR chi2(1) = 6421.19
Log likelihood = -2701.5327 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
copy | .9343625 . . . . .
------------------------------------------------------------------------------
3) Covariate is collinear with the entry-time variable.
This results in a hazard ratio close to one (a coefficient close to zero) and a missing standard error.
. clear
. set obs 1000
obs was 0, now 1000
. generate t0=_n-5
. generate t=_n
. stset t, enter(t0)
failure event: (assumed to fail at time=t)
obs. time interval: (0, t]
enter on or after: time t0
exit on or before: failure
------------------------------------------------------------------------------
1000 total obs.
0 exclusions
------------------------------------------------------------------------------
1000 obs. remaining, representing
1000 failures in single record/single failure data
4990 total analysis time at risk, at risk from t = 0
earliest observed entry t = 0
last observed exit t = 1000
. generate copy=_t0
. stcox copy
failure _d: 1 (meaning all fail)
analysis time _t: t
enter on or after: time t0
Iteration 0: log likelihood = -1606.1782
Iteration 1: log likelihood = -1545.3983
Iteration 2: log likelihood = -1540.1655
Refining estimates:
Iteration 0: log likelihood = -1541.2987
Iteration 1: log likelihood = -1484.0017
Iteration 2: log likelihood = -1473.3656
Iteration 3: log likelihood = -1470.0384
Iteration 4: log likelihood = -1469.6364
Iteration 5: log likelihood = -1463.425
Cox regression -- no ties
No. of subjects = 1000 Number of obs = 1000
No. of failures = 1000
Time at risk = 4990
LR chi2(1) = 285.51
Log likelihood = -1463.425 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
copy | .9293137 . . . . .
------------------------------------------------------------------------------
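A rough check for collinearity with the exit-time or entry-time variable is to correlate the covariate with the analysis-time variables that stset creates. This is a sketch under the assumption that the data are stset and that copy is the covariate of interest, as in the example above.

```stata
* Sketch: correlate the suspect covariate with exit time (_t) and entry time (_t0)
* (assumes the data are stset; copy is the covariate from the example above)
correlate copy _t _t0 if _st == 1
```

A correlation of 1 or -1 with _t or _t0 points to the second or third form of collinearity, respectively.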
4) Covariate does not vary within death event risk sets.
This is a complicated form of collinearity wherein a covariate varies overall, but for each death event, it does not vary within the associated risk set.
This results in a hazard ratio of one (coefficient is zero) and a missing standard error.
. clear
. input id t0 t dead x
id t0 t dead x
1. 1 0 1 1 6.18
2. 2 0.5 1 1 6.18
3. 3 1 6 1 5.55
4. 4 3 7 0 5.55
5. end
. stset t, failure(dead) enter(t0)
failure event: dead != 0 & dead < .
obs. time interval: (0, t]
enter on or after: time t0
exit on or before: failure
------------------------------------------------------------------------------
4 total obs.
0 exclusions
------------------------------------------------------------------------------
4 obs. remaining, representing
3 failures in single record/single failure data
10.5 total analysis time at risk, at risk from t = 0
earliest observed entry t = 0
last observed exit t = 7
. list
+-------------------------------------------------+
| id t0 t dead x _st _d _t _t0 |
|-------------------------------------------------|
1. | 1 0 1 1 6.18 1 1 1 0 |
2. | 2 .5 1 1 6.18 1 1 1 .5 |
3. | 3 1 6 1 5.55 1 1 6 1 |
4. | 4 3 7 0 5.55 1 0 7 3 |
+-------------------------------------------------+
. stcox x
failure _d: dead
analysis time _t: t
enter on or after: time t0
Iteration 0: log likelihood = -2.0794415
Refining estimates:
Iteration 0: log likelihood = -2.0794415
Cox regression -- Breslow method for ties
No. of subjects = 4 Number of obs = 4
No. of failures = 3
Time at risk = 10.5
LR chi2(1) = 0.00
Log likelihood = -2.0794415 Prob > chi2 = 1.0000
------------------------------------------------------------------------------
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x | 1 . . . . .
------------------------------------------------------------------------------
Coefficients for the variables that have (any form of) collinearity cannot be estimated. Leaving them in or deleting them from the model results in the same likelihood value and does not alter the results for the noncollinear variables.
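For the four-observation example above, the equality of the likelihoods can be checked by fitting the model with and without x and comparing the stored log likelihoods. This is a minimal sketch that assumes that dataset is in memory and stset; the scalar names ll_with and ll_without are arbitrary.

```stata
* Sketch: compare log likelihoods with and without the collinear covariate x
* (assumes the four-observation dataset above is in memory and stset)
quietly stcox x
scalar ll_with = e(ll)
quietly stcox, estimate
scalar ll_without = e(ll)
display "difference in log likelihood = " ll_with - ll_without
```

The LR chi2(1) of 0.00 reported in the output for stcox x suggests that the displayed difference should be essentially zero for these data.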
Although the first three forms of collinearity can be easily assessed, the fourth requires that the appropriate risk sets be formed. This task is made easier by the program st_rpool, written by Bill Gould, which can be downloaded from Stata’s website.
To obtain st_rpool, type in Stata:
. net from http://www.stata.com
. net cd users/wgould
. net describe st_rpool
. net install st_rpool
Let’s use st_rpool to look at the values of the covariate x in the risk sets:
. clear
. input id t0 t dead x
id t0 t dead x
1. 1 0 1 1 6.18
2. 2 0.5 1 1 6.18
3. 3 1 6 1 5.55
4. 4 3 7 0 5.55
5. end
. stset t, failure(dead) enter(t0)
failure event: dead != 0 & dead < .
obs. time interval: (0, t]
enter on or after: time t0
exit on or before: failure
------------------------------------------------------------------------------
4 total obs.
0 exclusions
------------------------------------------------------------------------------
4 obs. remaining, representing
3 failures in single record/single failure data
10.5 total analysis time at risk, at risk from t = 0
earliest observed entry t = 0
last observed exit t = 7
. list
+-------------------------------------------------+
| id t0 t dead x _st _d _t _t0 |
|-------------------------------------------------|
1. | 1 0 1 1 6.18 1 1 1 0 |
2. | 2 .5 1 1 6.18 1 1 1 .5 |
3. | 3 1 6 1 5.55 1 1 6 1 |
4. | 4 3 7 0 5.55 1 0 7 3 |
+-------------------------------------------------+
. st_rpool set
. sort set id
. list, sepby(set)
+--------------------------------------+
| id t0 x _d _t _t0 set |
|--------------------------------------|
1. | 1 0 6.18 1 1 0 1 |
2. | 2 .5 6.18 1 1 .5 1 |
|--------------------------------------|
3. | 3 1 5.55 1 6 1 2 |
4. | 4 3 5.55 0 7 3 2 |
+--------------------------------------+
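If st_rpool is not available, a similar check can be sketched by hand. The loop below is an illustration rather than a replacement for st_rpool: it assumes the data are stset and have not yet been restructured by st_rpool, treats an observation as at risk at failure time t when _t0 < t <= _t, and uses the local macro names ftimes and ft, which are arbitrary.

```stata
* Sketch: for each failure time, summarize x over the observations at risk
* (assumes the data are stset and st_rpool has not been run on them)
* An observation is treated as at risk at time t when _t0 < t <= _t
quietly levelsof _t if _d == 1 & _st == 1, local(ftimes)
foreach ft of local ftimes {
    quietly summarize x if _t0 < `ft' & _t >= `ft' & _st == 1
    display "t = `ft': at risk = " r(N) ", min x = " r(min) ", max x = " r(max)
}
```

When the minimum and maximum of x coincide at every failure time, the covariate does not vary within any risk set, which is the fourth form of collinearity. With noninteger failure times, the comparison against the macro value may be affected by floating-point precision, so the results deserve a second look in that case.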