st: Baseline hazard in discrete time hazards model

From	Melaku Fekadu <[email protected]>
To	[email protected]
Subject	st: Baseline hazard in discrete time hazards model
Date	Sat, 8 Mar 2014 02:19:01 +0200

Dear Statalisters,

My question is about calculating the baseline hazard in a discrete-time hazard model. I did search the previous posts on this list, as well as the Stata help files and manuals, before asking here.

Am I wrong to assume that I can use -predict- to calculate the baseline hazard after estimating a -cloglog- model? I ask because an answer to a similar question in an earlier post (referenced below) said that one has to exponentiate xb (the linear predictor) because -cloglog- models the log hazard.

I thought that -cloglog- models the hazard as follows

h = 1-exp(-exp(xb)).

Whereas the previous post suggested using

h = exp(xb) to calculate the baseline hazard.

I would be grateful if anyone could elaborate on this. Is it fine to use -predict-, which I assume gives h = 1-exp(-exp(xb)), for the baseline hazard (setting the other covariates to zero or to their means, of course)?
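
To make the two candidate formulas concrete, here is a minimal sketch that evaluates both at a single value of the linear predictor. The value -1.7 is hypothetical and chosen only for illustration; it does not come from any fitted model. Rearranging the cloglog link, log(-log(1-h)) = xb, gives -log(1-h) = exp(xb), so exp(xb) is not the interval hazard itself but the quantity that h = 1-exp(-exp(xb)) is built from. The two are close only when exp(xb) is small.

```stata
// Hypothetical value of the linear predictor, for illustration only
scalar xb = -1.7

// Interval (discrete-time) hazard implied by the cloglog link
display "1 - exp(-exp(xb)) = " 1 - exp(-exp(scalar(xb)))

// The same quantity via Stata's built-in inverse cloglog function
display "invcloglog(xb)    = " invcloglog(scalar(xb))

// exp(xb): always larger than the interval hazard, and close to it
// only when xb is strongly negative
display "exp(xb)           = " exp(scalar(xb))
```

Since 1-exp(-x) < x for any x > 0, the exp(xb) version will always exceed the interval hazard, and the gap widens as the hazard grows.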

My other question is that -predict- gives very similar baseline hazards for -cloglog- and -xtcloglog-. I used -predict hxt, pu0- for -xtcloglog- (see the syntax below). The estimated coefficients are also similar. I am afraid that I am missing something, and I would appreciate any elaboration on this too. The estimated baseline hazards in the two models are:

halfyr   cloglog    xtcloglog
1        0.154997   0.154974
2        0.265045   0.264967
3        0.275298   0.275166
4        0.561485   0.561224
5        0.734756   0.734525
6        0.748665   0.748422

I used the following syntax to calculate the baseline hazards.

sysuse cancer, clear
gen id = _n

expand studytime
bysort id : gen month = _n

// the addition (month) makes sure that _n==_N means
// the last month for that individual
bysort id (month) : gen byte dead = died==1 & _n==_N
lab var dead "binary depvar for discrete hazard model"

// further trick to save you some typing
gen halfyr = ceil(month/6)
tab halfyr, gen(dur)
replace halfyr = 6 if halfyr == 7
replace dur6 = dur6 + dur7
drop dur7

// when looking at the baseline hazard you need to make
// sure it refers to a meaningful group by making sure
// that the value 0 for all your explanatory variables
// refers to a meaningful value within the range of the
// data; here I centered age at 50 years.
gen c_age = age - 50

cloglog dead drug c_age dur1 dur2 dur3 dur4 dur5 dur6, ///
        nocons nolog
preserve
replace drug = 0
replace c_age = 0
predict h
tab halfyr, summarize(h) means
restore

xtcloglog dead drug c_age dur1 dur2 dur3 dur4 dur5 dur6, nolog nocons i(id)
// for the baseline here I assume zero random effect

preserve
replace drug = 0
replace c_age = 0
predict hxt, pu0
tab halfyr, summarize(hxt) means
restore
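
Both questions can be checked directly on the data. The following is a sketch only, assuming the expanded person-month data built above are still in memory (variables dead, drug, c_age, dur1-dur6 and id). The names p_default, xb_hat, h_interval and h_expxb are illustrative, and -xtset id- is used here in place of the i(id) option. The first part compares the default -predict- after -cloglog- with the two formulas; the second part displays the estimated random-effect standard deviation and rho stored by -xtcloglog-.

```stata
// Assumes the expanded person-month data built above are in memory

// 1. Default -predict- after -cloglog- versus the hand-computed formulas
quietly cloglog dead drug c_age dur1 dur2 dur3 dur4 dur5 dur6, nocons
predict double p_default                    // default statistic: Pr(dead = 1)
predict double xb_hat, xb                   // linear predictor
gen double h_interval = invcloglog(xb_hat)  // 1 - exp(-exp(xb))
gen double h_expxb    = exp(xb_hat)         // formula from the earlier post

// The default prediction should reproduce 1 - exp(-exp(xb))
assert reldif(p_default, h_interval) < 1e-6
summarize h_interval h_expxb

// 2. Size of the random-effect variance after -xtcloglog-
xtset id
quietly xtcloglog dead drug c_age dur1 dur2 dur3 dur4 dur5 dur6, nocons
display "sigma_u = " e(sigma_u) "   rho = " e(rho)
```

If sigma_u and rho turn out to be close to zero, the random-effects model is essentially the pooled model, which would suggest that near-identical coefficients and pu0 predictions are to be expected rather than a sign of a mistake.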

Thanks a lot,
Melaku


/////////////////////////////////////
The previous post about a similar question:

Re: st: Recovering the discrete time (interval) baseline hazard function