Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: stcox and xi in Stata 12.1


From   Steve Samuels <[email protected]>
To   [email protected]
Subject   Re: st: stcox and xi in Stata 12.1
Date   Sun, 24 Feb 2013 14:35:44 -1000

This doesn't look like an error to me. When you omit a group, you remove its observations from the risk sets at the failure times of the remaining groups, and its own failure times drop out of the partial likelihood. The partial likelihood equations depend on the composition of these risk sets, so the estimated hazard ratio comparing the remaining groups can differ from the one computed when all groups are present.
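To make the point concrete, here is a minimal sketch in Python rather than Stata. It is not what -stcox- does internally: the toy data are hypothetical (not the leukemia dataset), there are no tied failure times, and the helper names `solve`, `fit_cox` and `hr_2_vs_1` are my own. It fits the Cox partial likelihood by Newton-Raphson, once with all three groups and once with group 3 dropped, and prints the group 2 vs. group 1 hazard ratio from each fit.

```python
import math

def solve(A, b):
    """Solve the small linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_cox(times, events, X, iters=25):
    """Maximize the Cox partial log-likelihood by Newton-Raphson (assumes no ties)."""
    n, p = len(times), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]  # negative Hessian
        for i in range(n):
            if not events[i]:
                continue
            # Risk set: everyone still under observation at this failure time
            risk = [j for j in range(n) if times[j] >= times[i]]
            w = [math.exp(sum(b * x for b, x in zip(beta, X[j]))) for j in risk]
            s0 = sum(w)
            s1 = [sum(wj * X[j][a] for wj, j in zip(w, risk)) for a in range(p)]
            for a in range(p):
                grad[a] += X[i][a] - s1[a] / s0
                for b in range(p):
                    s2 = sum(wj * X[j][a] * X[j][b] for wj, j in zip(w, risk))
                    info[a][b] += s2 / s0 - (s1[a] / s0) * (s1[b] / s0)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    return beta

# Hypothetical toy data: (time, failed, group); every subject fails, no ties
rows = [(5, 1, 1), (8, 1, 1), (12, 1, 1), (15, 1, 1), (20, 1, 1),
        (3, 1, 2), (6, 1, 2), (9, 1, 2), (11, 1, 2), (16, 1, 2),
        (1, 1, 3), (2, 1, 3), (4, 1, 3), (7, 1, 3), (10, 1, 3)]

def hr_2_vs_1(data, groups):
    """Hazard ratio, group 2 vs. group 1, with one indicator per group in `groups`."""
    t = [r[0] for r in data]
    d = [r[1] for r in data]
    X = [[1.0 if r[2] == g else 0.0 for g in groups] for r in data]
    return math.exp(fit_cox(t, d, X)[0])

hr_full = hr_2_vs_1(rows, groups=[2, 3])                      # all three groups
hr_sub = hr_2_vs_1([r for r in rows if r[2] < 3], groups=[2])  # group 3 dropped

print("HR 2 vs 1, all groups:     ", hr_full)
print("HR 2 vs 1, group 3 dropped:", hr_sub)
```

With group 3 present, its members sit in the denominators of the risk-set sums at early failure times, and its own failures add terms to the score equation for group 2. Dropping it changes both, so the two fits solve different equations even though the group 1 and group 2 records are identical.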


Steve


> Fri, 22 Feb 2013 17:16:41 +0000
> From: Benno Kreuels <[email protected]>
> Subject: st: stcox and xi in Stata 12.1 
> 
> Dear statalist,
> 
> I have encountered a potential problem with the -stcox- command in Stata 12.1.
> I am trying to fit a Cox regression to a single-failure dataset. The explanatory variable has three categories. The problem can be replicated using one of the example datasets provided online (webuse leukemia) and is as follows:
> 
> I stset the data by typing
> . stset weeks, failure(relapse)
> 
> This gives me the output:
> 
>   failure event:  relapse != 0 & relapse < .
> obs. time interval:  (0, weeks]
> exit on or before:  failure
> 
> ------------------------------------------------------------------------------
>       42  total obs.
>        0  exclusions
> ------------------------------------------------------------------------------
>       42  obs. remaining, representing
>       30  failures in single record/single failure data
>      541  total analysis time at risk, at risk from t =         0
>                             earliest observed entry t =         0
>                                  last observed exit t =        35
> 
> 
> I then fit a Cox model using i.wbc3cat as the explanatory variable and obtain the following output:
> 
> 
> . xi:stcox i.wbc3cat 
> i.wbc3cat         _Iwbc3cat_1-3       (naturally coded; _Iwbc3cat_1 omitted)
> 
>         failure _d:  relapse
>   analysis time _t:  weeks
> 
> Iteration 0:   log likelihood =  -93.98505
> Iteration 1:   log likelihood =  -82.79096
> Iteration 2:   log likelihood = -82.109332
> Iteration 3:   log likelihood = -82.100544
> Iteration 4:   log likelihood = -82.100543
> Refining estimates:
> Iteration 0:   log likelihood = -82.100543
> 
> Cox regression -- Breslow method for ties
> 
> No. of subjects =           42                     Number of obs   =        42
> No. of failures =           30
> Time at risk    =          541
>                                                   LR chi2(2)      =     23.77
> Log likelihood  =   -82.100543                     Prob > chi2     =    0.0000
> 
> ------------------------------------------------------------------------------
>          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
> _Iwbc3cat_2 |   3.499543   2.090597     2.10   0.036     1.085202    11.28527
> _Iwbc3cat_3 |   14.20711   8.940021     4.22   0.000     4.138811    48.76813
> ------------------------------------------------------------------------------
> 
> 
> However, if I go on to restrict the model to categories 1 and 2 of wbc3cat only, I get a different estimate of the HR, p-value and 95% CI:
> 
> . xi:stcox i.wbc3cat if wbc3cat<3
> i.wbc3cat         _Iwbc3cat_1-3       (naturally coded; _Iwbc3cat_1 omitted)
> 
>         failure _d:  relapse
>   analysis time _t:  weeks
> 
> note: _Iwbc3cat_3 omitted because of collinearity
> Iteration 0:   log likelihood = -37.480485
> Iteration 1:   log likelihood = -35.003619
> Iteration 2:   log likelihood =  -35.00193
> Iteration 3:   log likelihood =  -35.00193
> Refining estimates:
> Iteration 0:   log likelihood =  -35.00193
> 
> Cox regression -- Breslow method for ties
>
> No. of subjects =           25                     Number of obs   =        25
> No. of failures =           14
> Time at risk    =          431
>                                                   LR chi2(1)      =      4.96
> Log likelihood  =    -35.00193                     Prob > chi2     =    0.0260
> 
> ------------------------------------------------------------------------------
>          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
> _Iwbc3cat_2 |   3.515159   2.105749     2.10   0.036     1.086512    11.37249
> _Iwbc3cat_3 |          1  (omitted)
> ------------------------------------------------------------------------------
> 
> The difference in this dataset is not very large. However, in the data I am using, the HR changes from 2.09 to 1.89, and the p-value and the confidence interval also change considerably. I do not understand why this happens. Using -strate- to calculate the rates gives me exactly the same results from the following two commands (for the categories included in both):
> 
> . strate wbc3cat, per(365.25) 
> and 
> . strate wbc3cat if wbc3cat<3, per(365.25) 
> 
> and there is hardly any difference between the results if I use a Poisson regression by typing:
> 
> streg i.wbc3cat, dist(exp)
> or 
> streg i.wbc3cat if wbc3cat<3, dist(exp)
> 
> I have tried finding a solution in the archives and in the Stata manual. I am afraid that I might have some misconception about the way a Cox regression model is fitted, as I am not a statistician. If that is the case, I would be grateful if someone could tell me where to find a good (and simple) explanation that would help me with this problem.
> 
> Thanks in advance!
> 
> Benno Kreuels
