On Thursday, Giorgio wrote:
> I've been estimating a cox model with frailty. Unfortunately when I use
> the option efron, the cumulative hazard is not an increasing function. Enzo
> Coviello has told me that one year ago there was a similar comunication.
> Is there still a bug or it does depend on frailty option?
>
Unfortunately, this is a bug. It occurs when computing the cummulative
hazard using the option -efron- (with or without -frailty-) when there
are ties of more than two observations in the dataset. This is on our
list of things-to-do.
In the mean time, here is the way to calculate the cummulative hazard.
The formula for the cummulative hazard is on page 150 [ST] stcox
(version 8 manual).
Let's start with a dataset without ties, so that we are certain we are
calculating it correctly. This is the easiest possible case - no
covariates.
--------------------------------------
clear
input t
60
63
66
68
70
77
84
91
94
98
101
105
108
109
112
115
126
143
153
161
164
178
end
set seed 12345
gen fail=uniform()>0.5
stset t, fail(fail)
sts gen atrisk=n
sort _t _d
gen mych =sum(_d/atrisk)
by _t: replace mych=mych[_N]
qui stcox, estimate nohr basech(chbres)
qui stcox, estimate nohr basech(chefron) efron
list _t _d mych chbres chefron, sepby(_t)
--------------------------------------
Now let's add a covariate. This is a little more work. Here I am doing
it twice, once for the Breslow method of handling ties, and again for
the Efron method:
--------------------------------------
clear
input t
60
63
66
68
70
77
84
91
94
98
101
105
108
109
112
115
126
143
153
161
164
178
end
set seed 12345
gen fail=uniform()>0.5
gen drug=int(uniform()*3+1)
stset t, fail(fail)
sort _t _d
qui stcox drug, estimate nohr basech(chbres)
predict xbbres, xb
gen expxb=exp(xbbres)
gsort -_t -_d
gen bot=sum(expxb)
sort _t _d
by _t: replace bot=bot[1]
gen mychbres =sum(_d/bot)
by _t: replace mychbres=mychbres[_N]
list _t _d mychbres chbres, sepby(_t)
qui stcox drug, estimate nohr basech(chefron) efron
predict xbef, xb
gen expxbef=exp(xbef)
gsort -_t -_d
gen botef=sum(expxbef)
sort _t _d
by _t: replace botef=botef[1]
gen mychef =sum(_d/botef)
by _t: replace mychef=mychef[_N]
list _t _d mychef chef chbres, sepby(_t)
--------------------------------------
At this point, we should be convinced we are calculating it correctly
when there are no ties.
So, if you add one more observation, say another 63, you should see that
my method still agrees with Stata for both Breslow and Efron. If you
then add a third observation at 63, you should see that my method again
agrees with Stata for Breslow.
The above method for computing the cummulative hazard still works when
using the options -frailty() shared()- with -stcox-.
--may
[email protected]
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/