st: A Question Regarding the Cox Regression and Projected Survival Rates

From	Yuval Arbel <[email protected]>
To	[email protected]
Subject	st: A Question Regarding the Cox Regression and Projected Survival Rates
Date	Fri, 21 Oct 2011 11:27:03 +0200
Dear Statalist users,


My question concerns the coefficients of the variables "mean_reduct"
and "max_red" in the output appended below (immediately after the
question).


Note that the coefficient of "mean_reduct" (0.0353) is approximately
5 times smaller than the coefficient of "max_red" (0.1926). The two
coefficients come from separate regressions, and both are highly
significant.


According to my best understanding of the Stata manual, this means
that, with everything else equal, a 1-unit increase in "max_red"
should raise the hazard (and therefore lower survival) by much more
than a 1-unit increase in "mean_reduct".
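
For concreteness, the two coefficients can be turned into hazard
ratios with exp(). The lines below are only a sketch: they copy the
point estimates from the two stcox outputs further down, they are not
part of the do-file, and the scalar names b_mean and b_max are
hypothetical.

```stata
* Point estimates copied from the two stcox outputs below
* (nohr was specified, so these are log hazard ratios)
scalar b_mean = .0353358    // coefficient on mean_reduct
scalar b_max  = .1925865    // coefficient on max_red

* Hazard ratio implied by a 1-unit increase in each variable
display "HR, mean_reduct: " exp(b_mean)
display "HR, max_red:     " exp(b_max)

* Ratio of the two coefficients
display "b_max / b_mean:  " b_max/b_mean
```

These come to roughly 1.036 and 1.212, that is, about a 3.6% versus a
21% higher hazard per unit, with a coefficient ratio of about 5.45.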


Yet when I try to translate these outcomes into projected survival
rates, they seem to be inconsistent with the above interpretation of
the coefficients. As you can see from the output below, I used the
"basesurv" option of predict to construct two vectors of projected
survival rates, evaluated at the sample means of the other covariates
and at max_red=mean_reduct=10. These vectors are named "max_omit" and
"self_omit"; they come from the models with "mean_reduct" and
"max_red", respectively. I made sure that these vectors are obtained
from Cox regressions with the same coefficients as the original ones.
Finally, I collapsed the survival rates to their means by sample
period, which leaves 103 sample periods (there are in fact 114 sample
periods, but failures start from period 12).
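
As a reminder of what "basesurv" returns: under the Cox model,
S(t|x) = S0(t)^exp(xb), so the baseline survivor function S0(t) is the
projected survival at x = 0. Subtracting a constant from a covariate
changes which covariate pattern counts as zero without changing the
coefficients. A minimal sketch of that centering step follows. It
assumes Stata's shipped cancer.dta rather than the data used here,
and the reference age of 56 and the names age_c, s0, xbhat and s_x
are chosen purely for illustration.

```stata
* Illustration on Stata's example data, not the public housing sample
sysuse cancer, clear
stset studytime, failure(died)

* Center age at an arbitrary reference value (56 is an assumption)
generate age_c = age - 56

* Centering leaves the coefficient unchanged; only the baseline moves
stcox age_c, nohr
predict double s0, basesurv      // S0(t): survival at age_c = 0 (age 56)

* Projected survival at each observation's own age: S0(t)^exp(xb)
predict double xbhat, xb
generate double s_x = s0^exp(xbhat)
```

For an observation with age equal to 56 the linear prediction is zero,
so s_x and s0 coincide there.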


Note that the mean survival rate across all sample periods is 0.7828
for "max_omit" and 0.9951 for "self_omit". This is precisely the
opposite of what I would anticipate from the Cox regressions: the
average projected survival rate of "self_omit" across all periods
should be smaller than 78%. Moreover, in the last period of the
sample, the projected survival rate of "max_omit" is 0% and the
projected survival rate of "self_omit" is 49.99%! Again, this stands
in contrast to the Cox regression outcomes.


My question is: what am I missing here? How can I explain this
apparent inconsistency?



Yours sincerely,

Yuval



. do "G:\public housing\increasing_experiment_average.do"

. clear

. clear matrix

. set memory 500m
(512000k)

. set matsize 800

. use "g:\public housing\test_sample_May_07_Bought.dta", clear


. stcox mean_reduct reductcurrent_mean_reduct rent_net8 diff_stdmadadarea permanentincomeestimate82 diff_mortgage appreciation,nohr

         failure _d:  fail == 1
   analysis time _t:  time_index
                 id:  appt

Iteration 0:   log likelihood = -78368.249
Iteration 1:   log likelihood = -74721.874
Iteration 2:   log likelihood = -74566.501
Iteration 3:   log likelihood = -74561.567
Iteration 4:   log likelihood = -74561.555
Refining estimates:
Iteration 0:   log likelihood = -74561.555

Cox regression -- Breslow method for ties

No. of subjects =         9547                     Number of obs   =    499393
No. of failures =         9547
Time at risk    =       547035
                                                   LR chi2(7)      =   7613.39
Log likelihood  =   -74561.555                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 mean_reduct |   .0353358   .0005278    66.94   0.000     .0343012    .0363703
reductcurr~t |   .0221957   .0005134    43.23   0.000     .0211894    .0232019
   rent_net8 |   .0025506   .0001655    15.41   0.000     .0022263    .0028749
diff_stdma~a |  -.4642809   .0446886   -10.39   0.000    -.5518688   -.3766929
permanent~82 |  -.0004675   .0000689    -6.79   0.000    -.0006025   -.0003325
diff_mortg~e |  -6.430141   .8913818    -7.21   0.000    -8.177217   -4.683064
appreciation |   9.629971   3.161657     3.05   0.002     3.433237     15.8267
------------------------------------------------------------------------------

.
. gen max_red=0

. replace max_red=75 if time_index>=0 & time_index<=14
(127541 real changes made)

. replace max_red=95 if time_index>=15 & time_index<=93
(360663 real changes made)

. replace max_red=90 if time_index>=94 & time_index<=95
(3547 real changes made)

. replace max_red=92 if time_index>=96 & time_index<=114
(16047 real changes made)

.
. gen  reductcurrent_max_reduct=reduct_per-max_red

. stcox max_red reductcurrent_max_reduct rent_net8 diff_stdmadadarea permanentincomeestimate82 diff_mortgage appreciation,nohr

         failure _d:  fail == 1
   analysis time _t:  time_index
                 id:  appt

Iteration 0:   log likelihood = -78368.249
Iteration 1:   log likelihood = -74893.557
Iteration 2:   log likelihood = -74745.467
Iteration 3:   log likelihood = -74741.532
Iteration 4:   log likelihood = -74741.525
Refining estimates:
Iteration 0:   log likelihood = -74741.525

Cox regression -- Breslow method for ties

No. of subjects =         9547                     Number of obs   =    499393
No. of failures =         9547
Time at risk    =       547035
                                                   LR chi2(7)      =   7253.45
Log likelihood  =   -74741.525                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     max_red |   .1925865   .0203524     9.46   0.000     .1526964    .2324765
red~x_reduct |   .0288315   .0004167    69.19   0.000     .0280148    .0296482
   rent_net8 |   .0027994   .0001633    17.15   0.000     .0024794    .0031194
diff_stdma~a |   -.482652   .0435363   -11.09   0.000    -.5679816   -.3973225
permanent~82 |  -.0003909   .0000691    -5.65   0.000    -.0005264   -.0002554
diff_mortg~e |  -6.445779   .8805713    -7.32   0.000    -8.171667   -4.719891
appreciation |   5.424572   3.212262     1.69   0.091    -.8713461    11.72049
------------------------------------------------------------------------------

. drop diff_stdmadadarea11 rent_net11 diff_mortgage11 permanentincomeestimate211 appreciation10011

.
.
. gen rent_net11=rent_net8-60.45422

. gen diff_stdmadadarea11=diff_stdmadadarea-4.92*(10^(-6))
(1 missing value generated)

.
. gen permanentincomeestimate211=permanentincomeestimate82-1107.764

.
. gen diff_mortgage11=diff_mortgage+.000497


. gen appreciation10011=appreciation-.0016098

.
. gen max_reduct_actual1=max_red-10


. gen reductcurrent_max_actual1=reductcurrent_max_reduct

. stcox max_reduct_actual1 reductcurrent_max_actual1 diff_stdmadadarea11 rent_net11 diff_mortgage11 permanentincomeestimate211 appreciation10011,nohr

         failure _d:  fail == 1
   analysis time _t:  time_index
                 id:  appt

Iteration 0:   log likelihood = -78368.249
Iteration 1:   log likelihood = -74893.557
Iteration 2:   log likelihood = -74745.467
Iteration 3:   log likelihood = -74741.532
Iteration 4:   log likelihood = -74741.525
Refining estimates:
Iteration 0:   log likelihood = -74741.525

Cox regression -- Breslow method for ties

No. of subjects =         9547                     Number of obs   =    499393
No. of failures =         9547
Time at risk    =       547035
                                                   LR chi2(7)      =   7253.45
Log likelihood  =   -74741.525                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
max_reduct~1 |   .1925865   .0203524     9.46   0.000     .1526964    .2324765
reductcur~l1 |   .0288315   .0004167    69.19   0.000     .0280148    .0296482
diff_stdm~11 |   -.482652   .0435363   -11.09   0.000    -.5679815   -.3973225
  rent_net11 |   .0027994   .0001633    17.15   0.000     .0024794    .0031194
diff_mort~11 |  -6.445778   .8805713    -7.32   0.000    -8.171666    -4.71989
permanen~211 |  -.0003909   .0000691    -5.65   0.000    -.0005264   -.0002554
apprec~10011 |   5.424572   3.212262     1.69   0.091    -.8713462    11.72049
------------------------------------------------------------------------------

. predict self_omit,basesurv
(8405 missing values generated)

.
. gen  mean_reduct_actual1=mean_reduct-10

. drop   reductcurrent_mean_reduct1

. gen reductcurrent_mean_reduct1=reductcurrent_mean_reduct


. stcox mean_reduct_actual1 reductcurrent_mean_reduct1 diff_stdmadadarea11 rent_net11 diff_mortgage11 permanentincomeestimate211 appreciation10011,nohr

         failure _d:  fail == 1
   analysis time _t:  time_index
                 id:  appt

Iteration 0:   log likelihood = -78368.249
Iteration 1:   log likelihood = -74721.874
Iteration 2:   log likelihood = -74566.501
Iteration 3:   log likelihood = -74561.567
Iteration 4:   log likelihood = -74561.555
Refining estimates:
Iteration 0:   log likelihood = -74561.555

Cox regression -- Breslow method for ties

No. of subjects =         9547                     Number of obs   =    499393
No. of failures =         9547
Time at risk    =       547035
                                                   LR chi2(7)      =   7613.39
Log likelihood  =   -74561.555                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
mean_redu~l1 |   .0353358   .0005278    66.94   0.000     .0343012    .0363703
reductcur~t1 |   .0221957   .0005134    43.23   0.000     .0211894    .0232019
diff_stdm~11 |  -.4642809   .0446886   -10.39   0.000    -.5518688   -.3766929
  rent_net11 |   .0025506   .0001655    15.41   0.000     .0022263    .0028749
diff_mort~11 |   -6.43014   .8913818    -7.21   0.000    -8.177217   -4.683064
permanen~211 |  -.0004675   .0000689    -6.79   0.000    -.0006025   -.0003325
apprec~10011 |   9.629971   3.161657     3.05   0.002     3.433236     15.8267
------------------------------------------------------------------------------

. predict max_omit,basesurv
(8405 missing values generated)

.
. gen diff=reductcurrent_mean_reduct-max_reduct

. stcox mean_reduct_actual1 max_reduct_actual1 diff diff_stdmadadarea11 rent_net11 diff_mortgage11 permanentincomeestimate211 appreciation10011,nohr

         failure _d:  fail == 1
   analysis time _t:  time_index
                 id:  appt

Iteration 0:   log likelihood = -78368.249
Iteration 1:   log likelihood = -74694.532
Iteration 2:   log likelihood = -74538.881
Iteration 3:   log likelihood = -74533.372
Iteration 4:   log likelihood = -74533.352
Iteration 5:   log likelihood = -74533.352
Refining estimates:
Iteration 0:   log likelihood = -74533.352

Cox regression -- Breslow method for ties

No. of subjects =         9547                     Number of obs   =    499393
No. of failures =         9547
Time at risk    =       547035
                                                   LR chi2(8)      =   7669.79
Log likelihood  =   -74533.352                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
mean_redu~l1 |   .0352556    .000528    66.77   0.000     .0342207    .0362904
max_reduct~1 |   .1713769    .020039     8.55   0.000     .1321011    .2106527
        diff |   .0223149   .0005149    43.34   0.000     .0213057     .023324
diff_stdm~11 |  -.4692028   .0457971   -10.25   0.000    -.5589636   -.3794421
  rent_net11 |   .0025795   .0001659    15.55   0.000     .0022543    .0029047
diff_mort~11 |  -7.166604    .947463    -7.56   0.000    -9.023597    -5.30961
permanen~211 |  -.0004599   .0000689    -6.67   0.000    -.0005949   -.0003248
apprec~10011 |   9.514355   3.162538     3.01   0.003     3.315895    15.71281
------------------------------------------------------------------------------

. predict full,basesurv
(8405 missing values generated)

.
. collapse (mean) full self_omit max_omit if fail==1, by(time_index)

. gen t=_n

.
. ttest  max_omit=0

One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
max_omit |     103    .7828852    .0170734    .1732759    .7490202    .8167502
------------------------------------------------------------------------------
    mean = mean(max_omit)                                         t =  45.8541
Ho: mean = 0                                     degrees of freedom =      102

    Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000

. ttest  self_omit=0

One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
self_o~t |     103    .9951453    .0048544    .0492665    .9855167    1.004774
------------------------------------------------------------------------------
    mean = mean(self_omit)                                        t = 204.9998
Ho: mean = 0                                     degrees of freedom =      102

    Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000

. ttest max_omit=self_omit

Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
max_omit |     103    .7828852    .0170734    .1732759    .7490202    .8167502
self_o~t |     103    .9951453    .0048544    .0492665    .9855167    1.004774
---------+--------------------------------------------------------------------
    diff |     103   -.2122602    .0155096    .1574049   -.2430233    -.181497
------------------------------------------------------------------------------
     mean(diff) = mean(max_omit - self_omit)                      t = -13.6858
 Ho: mean(diff) = 0                              degrees of freedom =      102

 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000


. ttest full=0

One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    full |     103    .9951447    .0048544    .0492666    .9855161    1.004773
------------------------------------------------------------------------------
    mean = mean(full)                                             t = 204.9992
Ho: mean = 0                                     degrees of freedom =      102

    Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000

. ttest full=self_omit

Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    full |     103    .9951447    .0048544    .0492666    .9855161    1.004773
self_o~t |     103    .9951453    .0048544    .0492665    .9855167    1.004774
---------+--------------------------------------------------------------------
    diff |     103   -6.44e-07    6.74e-08    6.84e-07   -7.78e-07   -5.10e-07
------------------------------------------------------------------------------
     mean(diff) = mean(full - self_omit)                          t =  -9.5524
 Ho: mean(diff) = 0                              degrees of freedom =      102

 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

.
end of do-file

-- 
Dr. Yuval Arbel
School of Business
Carmel Academic Center
4 Shaar Palmer Street, Haifa, Israel
e-mail: [email protected]