st: A Question Regarding the Cox Regression and Projected Survival Rates
From: Yuval Arbel <[email protected]>
To: [email protected]
Subject: st: A Question Regarding the Cox Regression and Projected Survival Rates
Date: Thu, 20 Oct 2011 17:20:36 +0200
Dear Statalist participants,
My question relates to the coefficients of the variables
"mean_reduct" and "max_red" in the output appended below (immediately
after the question).
Note that, compared to the coefficient of "max_red" (19.26x(10^(-2))),
the coefficient of "mean_reduct" (3.53x(10^(-2))) is approx. 5 times
smaller. Both coefficients are highly significant.
According to my best understanding of the Stata manual, we should
therefore anticipate, everything else being equal, a much bigger
increase in the hazard for a 1-unit increase in "max_red" than for a
1-unit increase in "mean_reduct".
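As a quick check of the arithmetic behind this reading, the sketch
below recomputes the ratio of the two coefficients and the hazard
ratios they imply for a 1-unit increase. The coefficient values are
copied from the stcox output further down; the scalar names b_mean and
b_max are hypothetical and used only for this illustration.

```stata
* Coefficients copied from the two stcox tables in the log below
scalar b_mean = .0353358    // coefficient on mean_reduct
scalar b_max  = .1925865    // coefficient on max_red

* Ratio of the two log-hazard coefficients
display "ratio of coefficients    = " %6.3f b_max/b_mean

* Hazard ratios implied by a 1-unit increase in each covariate
display "HR per unit, mean_reduct = " %6.4f exp(b_mean)
display "HR per unit, max_red     = " %6.4f exp(b_max)
```

Running stcox without the nohr option reports these hazard ratios
directly instead of the coefficients.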
Yet, when I try to translate these outcomes into projected survival
rates, they seem to be inconsistent with the above interpretation of
the coefficients. As you can see from the output below, I used the
"basesurv" option to construct two vectors of projected survival rates
at the sample means (and at max_red=mean_reduct=10). The names of these
vectors are "max_omit" and "self_omit", which correspond to
"mean_reduct" and "max_red", respectively, in the regression models. I
made sure that these vectors are obtained from Cox regressions with
the same coefficients as the original ones. Finally, I collapsed the
survival rates to their mean in each of 103 sample periods (there are
in fact 114 sample periods, but failures start from period 12).
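The centering step can be illustrated on a small public dataset. The
sketch below is an example under stated assumptions, not the analysis
in this post: it uses Stata's example data drugtr (loaded with webuse,
which needs an internet connection) and a hypothetical centering value
of 50 for age. It shows that subtracting a constant from a covariate
leaves its coefficient unchanged, while "basesurv" then refers to the
point at which the centered covariates equal zero.

```stata
* Illustration on Stata's example data, NOT the public-housing sample
webuse drugtr, clear              // data are already stset

* Uncentered model: the baseline refers to drug==0 and age==0
stcox drug age, nohr
scalar b_age_raw = _b[age]

* Centered model: the baseline now refers to drug==0 and age==50
generate age50 = age - 50         // 50 is a hypothetical centering value
stcox drug age50, nohr
predict s0_age50, basesurv        // baseline survivor function at age 50

* Centering shifts the baseline point but not the slope
display "raw coefficient:      " b_age_raw
display "centered coefficient: " _b[age50]
```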
Note that the mean survival rate of "max_omit" across all sample
periods is 78.28x(10^(-2)) and that of "self_omit" is 99.51x(10^(-2)).
This is precisely the opposite of what I would anticipate from the Cox
regression: the average projected survival rate of "self_omit" across
all periods should be smaller than 78%. Moreover, in the last period
of the sample, the projected survival rate of "max_omit" is 0% and the
projected survival rate of "self_omit" is 49.99%! Again, this stands
in contrast to the Cox regression outcomes.
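For reference when comparing the two sets of numbers: in the Cox
model, survival at a covariate pattern x is linked to the baseline
survivor function by S(t|x) = S0(t)^exp(xb), so the values returned by
"basesurv" depend on where the zero point of each covariate lies as
well as on its coefficient. The sketch below uses a hypothetical
baseline value S0 = .90 together with the max_red coefficient from the
log, and moves max_red from the centering value of 10 to 95, one of
the values assigned in the do-file. It illustrates the formula only
and is not a diagnosis of the results above.

```stata
* S0 is a hypothetical baseline survivor value, not taken from the log
scalar S0    = .90
scalar b_max = .1925865           // max_red coefficient from the stcox output

* Survival when max_red is 1 unit above the baseline point
display "S at +1 unit:   " S0^exp(b_max*1)

* Survival when max_red is 85 units above it (10 -> 95)
display "S at +85 units: " S0^exp(b_max*85)
```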
My question is: what am I missing here? How can I explain this
apparent inconsistency?
Yours sincerely,
Yuval
. do "G:\public housing\increasing_experiment_average.do"
. clear
. clear matrix
. set memory 500m
(512000k)
. set matsize 800
. use "g:\public housing\test_sample_May_07_Bought.dta", clear
. stcox mean_reduct reductcurrent_mean_reduct rent_net8 diff_stdmadadarea permanentincomeestimate82 diff_mortgage appreciation,nohr
failure _d: fail == 1
analysis time _t: time_index
id: appt
Iteration 0: log likelihood = -78368.249
Iteration 1: log likelihood = -74721.874
Iteration 2: log likelihood = -74566.501
Iteration 3: log likelihood = -74561.567
Iteration 4: log likelihood = -74561.555
Refining estimates:
Iteration 0: log likelihood = -74561.555
Cox regression -- Breslow method for ties
No. of subjects = 9547 Number of obs = 499393
No. of failures = 9547
Time at risk = 547035
LR chi2(7) = 7613.39
Log likelihood = -74561.555 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 mean_reduct |   .0353358   .0005278    66.94   0.000     .0343012    .0363703
reductcurr~t |   .0221957   .0005134    43.23   0.000     .0211894    .0232019
   rent_net8 |   .0025506   .0001655    15.41   0.000     .0022263    .0028749
diff_stdma~a |  -.4642809   .0446886   -10.39   0.000    -.5518688   -.3766929
permanent~82 |  -.0004675   .0000689    -6.79   0.000    -.0006025   -.0003325
diff_mortg~e |  -6.430141   .8913818    -7.21   0.000    -8.177217   -4.683064
appreciation |   9.629971   3.161657     3.05   0.002     3.433237     15.8267
------------------------------------------------------------------------------
.
. gen max_red=0
. replace max_red=75 if time_index>=0 & time_index<=14
(127541 real changes made)
. replace max_red=95 if time_index>=15 & time_index<=93
(360663 real changes made)
. replace max_red=90 if time_index>=94 & time_index<=95
(3547 real changes made)
. replace max_red=92 if time_index>=96 & time_index<=114
(16047 real changes made)
.
. gen reductcurrent_max_reduct=reduct_per-max_red
. stcox max_red reductcurrent_max_reduct rent_net8 diff_stdmadadarea permanentincomeestimate82 diff_mortgage appreciation,nohr
failure _d: fail == 1
analysis time _t: time_index
id: appt
Iteration 0: log likelihood = -78368.249
Iteration 1: log likelihood = -74893.557
Iteration 2: log likelihood = -74745.467
Iteration 3: log likelihood = -74741.532
Iteration 4: log likelihood = -74741.525
Refining estimates:
Iteration 0: log likelihood = -74741.525
Cox regression -- Breslow method for ties
No. of subjects = 9547 Number of obs = 499393
No. of failures = 9547
Time at risk = 547035
LR chi2(7) = 7253.45
Log likelihood = -74741.525 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     max_red |   .1925865   .0203524     9.46   0.000     .1526964    .2324765
red~x_reduct |   .0288315   .0004167    69.19   0.000     .0280148    .0296482
   rent_net8 |   .0027994   .0001633    17.15   0.000     .0024794    .0031194
diff_stdma~a |   -.482652   .0435363   -11.09   0.000    -.5679816   -.3973225
permanent~82 |  -.0003909   .0000691    -5.65   0.000    -.0005264   -.0002554
diff_mortg~e |  -6.445779   .8805713    -7.32   0.000    -8.171667   -4.719891
appreciation |   5.424572   3.212262     1.69   0.091    -.8713461    11.72049
------------------------------------------------------------------------------
. drop diff_stdmadadarea11 rent_net11 diff_mortgage11 permanentincomeestimate211 appreciation10011
.
.
. gen rent_net11=rent_net8-60.45422
. gen diff_stdmadadarea11=diff_stdmadadarea-4.92*(10^(-6))
(1 missing value generated)
.
. gen permanentincomeestimate211=permanentincomeestimate82-1107.764
.
. gen diff_mortgage11=diff_mortgage+.000497
. gen appreciation10011=appreciation-.0016098
.
. gen max_reduct_actual1=max_red-10
. gen reductcurrent_max_actual1=reductcurrent_max_reduct
. stcox max_reduct_actual1 reductcurrent_max_actual1 diff_stdmadadarea11 rent_net11 diff_mortgage11 permanentincomeestimate211 appreciation10011,nohr
failure _d: fail == 1
analysis time _t: time_index
id: appt
Iteration 0: log likelihood = -78368.249
Iteration 1: log likelihood = -74893.557
Iteration 2: log likelihood = -74745.467
Iteration 3: log likelihood = -74741.532
Iteration 4: log likelihood = -74741.525
Refining estimates:
Iteration 0: log likelihood = -74741.525
Cox regression -- Breslow method for ties
No. of subjects = 9547 Number of obs = 499393
No. of failures = 9547
Time at risk = 547035
LR chi2(7) = 7253.45
Log likelihood = -74741.525 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
max_reduct~1 |   .1925865   .0203524     9.46   0.000     .1526964    .2324765
reductcur~l1 |   .0288315   .0004167    69.19   0.000     .0280148    .0296482
diff_stdm~11 |   -.482652   .0435363   -11.09   0.000    -.5679815   -.3973225
  rent_net11 |   .0027994   .0001633    17.15   0.000     .0024794    .0031194
diff_mort~11 |  -6.445778   .8805713    -7.32   0.000    -8.171666    -4.71989
permanen~211 |  -.0003909   .0000691    -5.65   0.000    -.0005264   -.0002554
apprec~10011 |   5.424572   3.212262     1.69   0.091    -.8713462    11.72049
------------------------------------------------------------------------------
. predict self_omit,basesurv
(8405 missing values generated)
.
. gen mean_reduct_actual1=mean_reduct-10
. drop reductcurrent_mean_reduct1
. gen reductcurrent_mean_reduct1=reductcurrent_mean_reduct
. stcox mean_reduct_actual1 reductcurrent_mean_reduct1 diff_stdmadadarea11 rent_net11 diff_mortgage11 permanentincomeestimate211 appreciation10011,nohr
failure _d: fail == 1
analysis time _t: time_index
id: appt
Iteration 0: log likelihood = -78368.249
Iteration 1: log likelihood = -74721.874
Iteration 2: log likelihood = -74566.501
Iteration 3: log likelihood = -74561.567
Iteration 4: log likelihood = -74561.555
Refining estimates:
Iteration 0: log likelihood = -74561.555
Cox regression -- Breslow method for ties
No. of subjects = 9547 Number of obs = 499393
No. of failures = 9547
Time at risk = 547035
LR chi2(7) = 7613.39
Log likelihood = -74561.555 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
mean_redu~l1 |   .0353358   .0005278    66.94   0.000     .0343012    .0363703
reductcur~t1 |   .0221957   .0005134    43.23   0.000     .0211894    .0232019
diff_stdm~11 |  -.4642809   .0446886   -10.39   0.000    -.5518688   -.3766929
  rent_net11 |   .0025506   .0001655    15.41   0.000     .0022263    .0028749
diff_mort~11 |   -6.43014   .8913818    -7.21   0.000    -8.177217   -4.683064
permanen~211 |  -.0004675   .0000689    -6.79   0.000    -.0006025   -.0003325
apprec~10011 |   9.629971   3.161657     3.05   0.002     3.433236     15.8267
------------------------------------------------------------------------------
. predict max_omit,basesurv
(8405 missing values generated)
.
. gen diff=reductcurrent_mean_reduct-max_reduct
. stcox mean_reduct_actual1 max_reduct_actual1 diff diff_stdmadadarea11 rent_net11 diff_mortgage11 permanentincomeestimate211 appreciation10011,nohr
failure _d: fail == 1
analysis time _t: time_index
id: appt
Iteration 0: log likelihood = -78368.249
Iteration 1: log likelihood = -74694.532
Iteration 2: log likelihood = -74538.881
Iteration 3: log likelihood = -74533.372
Iteration 4: log likelihood = -74533.352
Iteration 5: log likelihood = -74533.352
Refining estimates:
Iteration 0: log likelihood = -74533.352
Cox regression -- Breslow method for ties
No. of subjects = 9547 Number of obs = 499393
No. of failures = 9547
Time at risk = 547035
LR chi2(8) = 7669.79
Log likelihood = -74533.352 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
mean_redu~l1 |   .0352556    .000528    66.77   0.000     .0342207    .0362904
max_reduct~1 |   .1713769    .020039     8.55   0.000     .1321011    .2106527
        diff |   .0223149   .0005149    43.34   0.000     .0213057     .023324
diff_stdm~11 |  -.4692028   .0457971   -10.25   0.000    -.5589636   -.3794421
  rent_net11 |   .0025795   .0001659    15.55   0.000     .0022543    .0029047
diff_mort~11 |  -7.166604    .947463    -7.56   0.000    -9.023597    -5.30961
permanen~211 |  -.0004599   .0000689    -6.67   0.000    -.0005949   -.0003248
apprec~10011 |   9.514355   3.162538     3.01   0.003     3.315895    15.71281
------------------------------------------------------------------------------
. predict full,basesurv
(8405 missing values generated)
.
. collapse (mean) full self_omit max_omit if fail==1, by(time_index)
. gen t=_n
.
. ttest max_omit=0
One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
max_omit |     103    .7828852    .0170734    .1732759    .7490202    .8167502
------------------------------------------------------------------------------
    mean = mean(max_omit)                                         t =  45.8541
Ho: mean = 0                                     degrees of freedom =      102
    Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000
. ttest self_omit=0
One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
self_o~t |     103    .9951453    .0048544    .0492665    .9855167    1.004774
------------------------------------------------------------------------------
    mean = mean(self_omit)                                        t = 204.9998
Ho: mean = 0                                     degrees of freedom =      102
    Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000
. ttest max_omit=self_omit
Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
max_omit |     103    .7828852    .0170734    .1732759    .7490202    .8167502
self_o~t |     103    .9951453    .0048544    .0492665    .9855167    1.004774
---------+--------------------------------------------------------------------
    diff |     103   -.2122602    .0155096    .1574049   -.2430233    -.181497
------------------------------------------------------------------------------
     mean(diff) = mean(max_omit - self_omit)                      t = -13.6858
 Ho: mean(diff) = 0                              degrees of freedom =      102
 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
. ttest full=0
One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    full |     103    .9951447    .0048544    .0492666    .9855161    1.004773
------------------------------------------------------------------------------
    mean = mean(full)                                             t = 204.9992
Ho: mean = 0                                     degrees of freedom =      102
    Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000
. ttest full=self_omit
Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    full |     103    .9951447    .0048544    .0492666    .9855161    1.004773
self_o~t |     103    .9951453    .0048544    .0492665    .9855167    1.004774
---------+--------------------------------------------------------------------
    diff |     103   -6.44e-07    6.74e-08    6.84e-07   -7.78e-07   -5.10e-07
------------------------------------------------------------------------------
     mean(diff) = mean(full - self_omit)                          t =  -9.5524
 Ho: mean(diff) = 0                              degrees of freedom =      102
 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
.
end of do-file
--
Dr. Yuval Arbel
School of Business
Carmel Academic Center
4 Shaar Palmer Street, Haifa, Israel
e-mail: [email protected]