This is either a Stata question or a stats question;
apologies if it turns out to be the latter.
I am using -streg- to estimate a parametric survival
model with a lognormal distribution. The data represent
readmissions to hospital after a discharge, with
readmission treated as the failure event. All patients
(n=424,787) are followed for 30 days, with approximately
22% (95,529) readmitted within that time.
-> stset ftime, failure(radm)
failure event: radm != 0 & radm < .
obs. time interval: (0, ftime]
exit on or before: failure
------------------------------------------------------------------------------
424787 total obs.
0 exclusions
------------------------------------------------------------------------------
424787 obs. remaining, representing
95529 failures in single record/single failure data
1.07e+07 total analysis time at risk, at risk from t = 0
earliest observed entry t = 0
last observed exit t = 30
My model has 30+ covariates representing demographic and
comorbid characteristics of the patients. I would like
to obtain a predicted time to readmission for each patient
in the cohort. I will eventually need to specify a shared
frailty by hospital, but for now I am working with a
standard model (a rough sketch of the command follows).
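The covariate names below are placeholders for my actual
demographic and comorbidity variables, and hospid is a
hypothetical hospital identifier:

. streg age female comorb1-comorb30, distribution(lognormal)

and eventually, with the shared frailty:

. streg age female comorb1-comorb30, distribution(lognormal) ///
      frailty(gamma) shared(hospid)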
What is bothering me is that while roughly 95,000 patients
are actually readmitted within 30 days, when I calculate
the predicted time to failure:
. predict median, median time   // predicted median time to readmission
. gen prd30 = median <= 30      // 1 if predicted median is within 30 days
. count if prd30
30
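In case it is relevant, the one sanity check I can think
of is to look at the predicted 30-day survival
probabilities directly, temporarily resetting _t (a hack,
I realize, so treat this as a sketch rather than something
I am sure is the right approach):

. gen double t_orig = _t
. replace _t = 30
. predict s30, surv         // predicted S(30) for each patient
. replace _t = t_orig
. count if s30 < .5         // median < 30 iff S(30) < .5

If the lognormal model is behaving as I think, this count
should essentially match the 30 above, since a predicted
median below 30 days is the same thing as a predicted
survival probability at day 30 below one half.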
Why the huge difference? Only 30 of 424,787 patients have
a predicted median time to readmission of 30 days or less,
versus roughly 95,000 observed readmissions. Diagnostics
show the lognormal model fitting quite well, with the
residuals falling right where they should. Either I don't
understand what the model is doing, or I'm not calculating
the appropriate predicted value. Should I expect this?
Thanks in advance for any insights.
Jeph