Not my finest moment there. Thank you so much for correcting my
misinterpretations, Maarten. I will use (1-surv_a) as the probability
of failure in each step.
On a side note, when working with these predictions, I noticed that I
sometimes get missing values from -predict csurv_a, csurv-. Does
anyone know what might cause this? I've included a listing below to
show the problem. It strikes me as odd that Stata is able to calculate
surv_a but not the cumulative csurv_a for the same observations.
I have also been experiencing problems doing out-of-sample predictions
of e.g. Cox-Snell residuals (I did stset the data to include the
validation sample). Maybe the two problems are related.
Cheers,
Steinar
time     haz_a    surv_a   csurv_a  _st  _t  _t0  _d
   1  .0006299  .9993703  .9993706    1   1    0   0
   2  .0006299  .9993703  .9987423    1   2    1   0
   3  .0006299  .9993703         .    1   3    2   0
   4  .0006549  .9993452         .    1   4    3   0
   5  .0006549  .9993452         .    1   5    4   0
   6  .0007081  .9992921  .9980397    1   6    5   0
   7  .0007081  .9992921  .9973384    1   7    6   0
   8  .0007657  .9992346  .9965819    1   8    7   0
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Maarten Buis
Sent: 29 May 2007 15:15
To: [email protected]
Subject: st: RE: Validity of hazard predictions in frailtymodels
---- Steinar Fossedal wrote:
> I have fitted an exponential survival model with gamma frailty, and
> now find myself in a pickle trying to interpret and apply the
> predictive results. Specifically I'm getting predictions of
> individual hazard that exceeds one. Such estimates are, of course,
> problematic to use in the next step of my analysis.
Actually you have no problem. Hazards are not the same as probabilities,
and can range between 0 and +infinity. The interpretation of a hazard is
the number of times in a unit time that you can on average be expected
to experience the event. Say the event is experiencing a cold. My hazard
for experiencing a cold when time is measured in months is probably less
than 1, but if time is measured in centuries it will clearly be larger
than one (until someone invents the cure for the common cold).
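To make the distinction concrete, here is a minimal sketch in Python
(not Stata output). It assumes a constant hazard over the interval, as
in an exponential model, so that the probability of at least one event
is 1 - exp(-hazard * interval). The function name and the cold hazard
of 0.2 per month are made up for illustration; only the .0006299 value
is taken from the listing in this thread.

```python
import math

def event_probability(hazard, interval=1.0):
    """Probability of at least one event within `interval`,
    assuming the hazard is constant over that interval."""
    return 1.0 - math.exp(-hazard * interval)

# Hypothetical hazard of catching a cold: 0.2 per month (illustrative only).
hazard_per_month = 0.2
# Same process, time measured in centuries instead of months.
hazard_per_century = hazard_per_month * 12 * 100

p_month = event_probability(hazard_per_month)
p_century = event_probability(hazard_per_century)

# Hazard from the first row of the listing, over one unit of time.
surv_row1 = math.exp(-0.0006299)

print(f"hazard per month   = {hazard_per_month}, P(event) = {p_month:.4f}")
print(f"hazard per century = {hazard_per_century:.0f}, P(event) = {p_century:.4f}")
print(f"exp(-.0006299)     = {surv_row1:.7f}")
```

The hazard per century is far above one, yet the implied probability
stays between 0 and 1. The last line shows that exp(-haz_a) for the
first row agrees with the listed surv_a to seven decimals, which is
what a constant hazard over a unit interval would give.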
> I reason that this is caused by the multiplicative effect of the
> frailty parameter on the hazard, and that the model only ensures the
> validity of the population hazard values - not the unobserved
> individual's. With validity I mean restrictions that ensure the
> hazard stays between zero and one. To me, this seems like an inherent
> weakness in frailty models. This may not matter much when
> investigating hazard ratios and differences between populations, but
> it does pose a problem when making individual predictions
Actually the model without a frailty component (possibly with robust
standard errors) is the model that correctly looks at population values
of the hazard, while the model with a frailty component captures the
hazards at the individual level (if you believe your model, i.e. the
unobserved component of your model is gamma distributed, uncorrelated
with your observed variables, the effects of the observed variables are
correctly specified, etc.). It is no coincidence that robust standard
errors are called robust, thus implying that other models are less
robust. (However, I still think that the name robust suggests more
robustness than it can deliver. But that is another issue.)
Hope this helps,
Maarten
-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434
+31 20 5986715
http://home.fsw.vu.nl/m.buis/
-----------------------------------------
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*