Dear all,
I am estimating discrete time hazard models using cloglog. The data are
structured as organization-years (similar to person-months) and many of
the covariates are time-varying, i.e. change with each organization-year.
I know only the year in which the failure / event occurred, not the
specific moment or date. I also have many tied events / many failures in
the same year. These characteristics of the data have led me to
discrete-time models.
My first question is whether I'm specifying the time variables correctly
to compare different functional forms of the baseline hazard. I want to
compare a constant hazard, a hazard that changes monotonically over time,
and a piecewise-constant exponential model. I have been referring to
Stephen Jenkins' terrific lectures and lessons and found explicit
confirmation of how to write a piecewise exponential model for
discrete-time data, but I'd like to run the other specifications by you
all too.
Also, a reviewer saw these models and said "but you're not really doing
event-history analyses." Any suggestions for a quick explanation of why
discrete-time methods are legitimate, and actually more appropriate, for
these data? I'd be especially happy to cite recent sociology or political
science articles that use discrete-time analyses, so let me know if you
have an empirical example for me to review and possibly cite.
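For what it's worth, here is my understanding of the usual argument, written out so you can correct me. The notation is my own: theta_0 is the continuous-time baseline hazard, (a_{j-1}, a_j] is year j, and x_j is the covariate vector, assumed constant within the year. If the underlying process is a proportional hazards model but I observe only the year of failure, the yearly hazard h_j takes exactly the complementary log-log form:

```latex
\begin{align*}
\theta(\tau \mid x_j) &= \theta_0(\tau)\,\exp(x_j'\beta), \\
h_j(x_j) &= 1 - \frac{S(a_j \mid x_j)}{S(a_{j-1} \mid x_j)}
          = 1 - \exp\!\bigl[-\exp(x_j'\beta + \gamma_j)\bigr],
\qquad \gamma_j = \log \int_{a_{j-1}}^{a_j} \theta_0(\tau)\,d\tau, \\
\log\bigl[-\log\bigl(1 - h_j(x_j)\bigr)\bigr] &= x_j'\beta + \gamma_j .
\end{align*}
```

If that is right, the beta from cloglog are the same coefficients as in the continuous-time proportional hazards model, and the time terms I put in the model are just different restrictions on gamma_j.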
Back to the first question, here's what I've been running:
Exponential analogue:
cloglog depvar ind1 ind2 ind3...
* model has no covariate that explicitly measures time / year *
Piece-wise constant exponential analogue:
cloglog depvar years6_15 years16_30 ind3 ind4...
* model has dummy intervals marking certain years (with intervals
defined by theoretical / historical claims) and omits one dummy
period; allows me to investigate disjunctures in hazard associated with
changes in public policies (or other historical events) *
Gompertz analogue:
cloglog depvar year ind2 ind3...
* model has year in it; assumes the cloglog of the hazard is linear in
time (roughly, the log hazard rises or falls linearly over time) *
Weibull analogue:
cloglog depvar lnyear ind2 ind3...
* model has ln(year) in it; the coefficient is about 0.6, corresponding
to a hazard that increases over time but levels off slowly *
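In case it helps to see everything in one place, here is a rough sketch of how I would build the time variables and compare the four fits. The names orgid, t, lnt, and the m_* labels are placeholders of mine. It assumes one row per organization-year, that every organization is observed from its first year at risk with no gaps (so the row count within an organization equals elapsed duration), and that the period dummies are defined on duration rather than calendar year.

```stata
* Sketch only: orgid, year, depvar, ind1-ind3 are placeholder names.
* depvar is assumed to equal 1 in the failure year and 0 otherwise.
bysort orgid (year): generate t = _n    // years at risk so far
generate lnt = ln(t)

* Period dummies; years 1-5 (and any years beyond 30) are the omitted baseline
generate byte years6_15  = inrange(t, 6, 15)
generate byte years16_30 = inrange(t, 16, 30)

* Exponential analogue: no time term
cloglog depvar ind1 ind2 ind3
estimates store m_const

* Piecewise-constant analogue
cloglog depvar years6_15 years16_30 ind1 ind2 ind3
estimates store m_piece

* Gompertz analogue: linear in t on the cloglog scale
cloglog depvar t ind1 ind2 ind3
estimates store m_gomp

* Weibull analogue: linear in ln(t) on the cloglog scale
cloglog depvar lnt ind1 ind2 ind3
estimates store m_weib

* The specifications are not all nested, so compare AIC and BIC
estimates stats m_const m_piece m_gomp m_weib
```

I use AIC/BIC here because a likelihood-ratio test only applies to the nested pairs (for example, the constant model against each of the others).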
Look OK?
Many thanks.
Erin Kelly
Assistant Professor of Sociology
University of Minnesota