Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Baseline hazard in discrete time hazards model
From: Melaku Fekadu <[email protected]>
To: [email protected]
Subject: st: Baseline hazard in discrete time hazards model
Date: Sat, 8 Mar 2014 02:19:01 +0200
Dear Statalisters,
My question is about calculating the baseline hazard in a discrete-time hazard model. I have looked through previous posts on this list and through the Stata help and manuals before asking here.
Am I wrong to assume that I can use -predict- to calculate the baseline hazard after estimating a -cloglog- model? I ask because an earlier post (attached below) answered a similar question by saying that one has to exponentiate xb (the linear combination), because -cloglog- models the log hazard.
I thought that -cloglog- models the hazard as
h = 1 - exp(-exp(xb)),
whereas the previous post suggested using
h = exp(xb)
to calculate the baseline hazard.
I would be grateful if anyone could elaborate on this. Is it fine to use -predict-, which I assume gives h = 1 - exp(-exp(xb)), for the baseline hazard (setting the other covariates to zero or to their means, of course)?
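The two formulas can be related through the link function. The following is only a sketch, assuming the standard grouped-time proportional hazards setup; the symbols h_j, \theta(t), and the interval endpoints a_{j-1} and a_j are notation introduced here, not taken from the earlier post.

```latex
\begin{align*}
\log\bigl(-\log(1 - h_j)\bigr) &= x_j\beta
  && \text{(complementary log-log link)} \\
h_j &= 1 - \exp\bigl(-\exp(x_j\beta)\bigr)
  && \text{(discrete-time hazard for interval } j\text{)} \\
\exp(x_j\beta) &= -\log(1 - h_j) = \int_{a_{j-1}}^{a_j} \theta(t)\,dt
  && \text{(integrated continuous-time hazard over interval } j\text{)}
\end{align*}
```

Under these assumptions, exp(xb) is the continuous-time hazard integrated over the interval rather than a probability, and the two quantities are close only when exp(xb) is small, because 1 - exp(-z) is approximately z for small z.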
My other question is that -predict- gives very similar baseline hazards for -cloglog- and -xtcloglog-. I used -predict hxt, pu0- for -xtcloglog- (see the syntax below). The estimated coefficients are also similar. I am afraid that I am missing something, and I would appreciate any elaboration on this too. The estimated baseline hazards from the two models are:
halfyr    cloglog     xtcloglog
1         0.154997    0.154974
2         0.265045    0.264967
3         0.275298    0.275166
4         0.561485    0.561224
5         0.734756    0.734525
6         0.748665    0.748422
I used the following syntax to calculate the baseline hazards.
sysuse cancer, clear
gen id = _n
expand studytime
bysort id : gen month = _n
// the addition (month) makes sure that _n==_N means
// the last month for that individual
bysort id (month) : gen byte dead = died==1 & _n==_N
lab var dead "binary depvar for discrete hazard model"
// a further trick to save you some typing
gen halfyr = ceil(month/6)
ta halfyr ,ge(dur)
replace halfyr = 6 if halfyr == 7
replace dur6 = dur6 + dur7
drop dur7
// when looking at the baseline hazard you need to make
// sure it refers to a meaningful group by making sure
// that the value 0 for each of your explanatory variables
// refers to a meaningful value within the range of the
// data; here I centered age at 50 years.
gen c_age = age - 50
cloglog dead drug c_age dur1 dur2 dur3 dur4 dur5 dur6, ///
nocons nolog
preserve
replace drug = 0
replace c_age = 0
predict h
tab halfyr, summarize(h) means
restore
xtcloglog dead drug c_age dur1 dur2 dur3 dur4 dur5 dur6, nolog nocons i(id)
// for the baseline here I assume zero random effect
preserve
replace drug = 0
replace c_age = 0
predict hxt, pu0
tab halfyr, summarize(hxt) means
restore
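To make the comparison between the two candidate formulas concrete, both can be computed from the linear predictor. This is a minimal sketch, assuming the data have been prepared as above; it refits the -cloglog- model so that the stored estimates are not those of -xtcloglog-, and the variable names xb0, h_disc, and h_exp are hypothetical.

```stata
* Sketch only: xb0, h_disc, and h_exp are hypothetical variable names.
* Refit the -cloglog- model so that -predict- uses its estimates.
quietly cloglog dead drug c_age dur1 dur2 dur3 dur4 dur5 dur6, nocons
preserve
replace drug = 0
replace c_age = 0
predict xb0, xb                    // linear predictor at the baseline covariate values
gen h_disc = 1 - exp(-exp(xb0))    // inverse cloglog link; should agree with -predict h-
gen h_exp  = exp(xb0)              // exponentiated linear predictor
tabstat h_disc h_exp, by(halfyr) statistics(mean)
restore
```

The h_disc column should reproduce the values from -predict h-, while h_exp shows what exponentiating xb alone would give for each half-year interval.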
Thanks a lot,
Melaku
/////////////////////////////////////
The previous post about a similar question:
Re: st: Recovering the discrete time (interval) baseline hazard function
________________________________________