Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: QUERY: unrealistic results with stcox (Multiple-record-per-subject survival data)
From: Alessandro Marcon <[email protected]>
To: [email protected]
Subject: st: QUERY: unrealistic results with stcox (Multiple-record-per-subject survival data)
Date: Wed, 17 Jul 2013 11:31:36 +0200
Dear All,
I have panel data (repeated yearly records per patient) in which "id_pz"
is the patient's unique id, "year" ranges from 2009 to 2012, and the
event of interest is "decesso_" (death).
The times of entering and exiting the study are "t_enter" and "t_exit",
respectively.
      +--------------------------------------------+
      | id_pz   year   decesso_   t_enter   t_exit |
      |--------------------------------------------|
1249. |   388   2009          0     17898    18262 |
1250. |   388   2010          0     18263    18627 |
1251. |   388   2011          0     18628    18992 |
1252. |   388   2012          1     18993    19152 |
      |--------------------------------------------|
1253. |   389   2009          0     17898    18262 |
1254. |   389   2010          1     18263    18546 |
      |--------------------------------------------|
1255. |   390   2012          0     18993    19358 |
      |--------------------------------------------|
1256. |   391   2009          0     17898    18262 |
1257. |   391   2010          0     18263    18627 |
1258. |   391   2011          0     18628    18992 |
1259. |   391   2012          0     18993    19358 |
      |--------------------------------------------|
1260. |   392   2009          0     17898    18262 |
1261. |   392   2010          0     18263    18627 |
1262. |   392   2011          0     18628    18992 |
1263. |   392   2012          0     18993    19358 |
      |--------------------------------------------|
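The values of t_enter and t_exit look like Stata daily dates (days since
01jan1960). Assuming that is what they are, a quick way to read them as
calendar dates is sketched below. The display format is applied only for
viewing and does not change the stored values.

```stata
* Assumption: t_enter and t_exit are Stata daily dates (days since 01jan1960)
display %td 17898        // prints 01jan2009 under that assumption
format t_enter t_exit %td
* Patient 388 is used only as an example from the listing above
list id_pz year decesso_ t_enter t_exit if id_pz == 388, sepby(id_pz)
```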
Patients can contribute records in one or more of the 4 observation
years (2009-2012), as follows:
. xtdescribe
   id_pz:  1, 2, ..., 10998                                  n =      10998
    year:  2009, 2010, ..., 2012                             T =          4
           Delta(year) = 1 unit
           Span(year)  = 4 periods
           (id_pz*year uniquely identifies each observation)
Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                         1       1       2         4         4       4       4
     Freq.  Percent    Cum. |  Pattern
 ---------------------------+---------
     6809     61.91   61.91 |  1111
     1017      9.25   71.16 |  ...1
      810      7.36   78.52 |  ..11
      520      4.73   83.25 |  .111
      432      3.93   87.18 |  111.
      428      3.89   91.07 |  1...
      361      3.28   94.35 |  11..
      296      2.69   97.04 |  ..1.
      183      1.66   98.71 |  .1..
      142      1.29  100.00 | (other patterns)
 ---------------------------+---------
    10998    100.00         |  XXXX
I want to analyse survival of this dynamic cohort. Since I have
"Multiple-record-per-subject survival data", I stset my data like this:
stset t_exit, id(id_pz) failure(decesso_==1) origin(time t_enter) scale(365)
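stset stores its results in the system variables _t0, _t, _d and _st. A
minimal sketch for checking how the records of one patient were translated
into analysis time (patient 388 is taken from the listing above purely as
an example):

```stata
* Run after the stset command above
* _t0/_t: start and end of each record in analysis time
* _d: failure indicator; _st: 1 if the record is used in the analysis
list id_pz year t_enter t_exit _t0 _t _d _st if id_pz == 388, sepby(id_pz)
* Summary of subjects, records, entry/exit times and gaps
stdescribe
```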
This is what I get when computing annual rates and fitting a Cox model:
. strate year, per(1000)
failure _d: decesso_ == 1
analysis time _t: (t_exit-origin)/365
origin: time t_enter
id: id_pz
Estimated rates (per 1000) and lower/upper bounds of 95% confidence intervals
(34673 records included in the analysis)
+------------------------------------------------+
| year D Y Rate Lower Upper |
|------------------------------------------------|
| 2009 219 7.9698 27.479 24.070 31.370 |
| 2010 240 8.2815 28.980 25.536 32.889 |
| 2011 281 8.8487 31.756 28.252 35.695 |
| 2012 275 9.1627 30.013 26.667 33.778 |
+------------------------------------------------+
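In the strate output, Rate is D/Y, with Y expressed in units of 1000
person-years because of per(1000). The figures can be checked by hand as a
sanity test. The numbers below are copied from the table, so small rounding
differences are to be expected.

```stata
* Rate per 1000 person-years for 2009: deaths / person-time (in thousands)
display %6.3f 219/7.9698
* Total deaths across the four years
display 219 + 240 + 281 + 275
* Total person-years across the four years
display 1000*(7.9698 + 8.2815 + 8.8487 + 9.1627)
```

The two totals should agree with the number of failures (1015) and the time
at risk (about 34262.7) reported by stcox for the same stset.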
. stcox i.year
failure _d: decesso_ == 1
analysis time _t: (t_exit-origin)/365
origin: time t_enter
id: id_pz
Iteration 0: log likelihood = -9195.495
Iteration 1: log likelihood = -9194.8578
Iteration 2: log likelihood = -9194.8575
Iteration 3: log likelihood = -9194.8575
Refining estimates:
Iteration 0: log likelihood = -9194.8575
Cox regression -- Breslow method for ties
No. of subjects =        10998                     Number of obs   =     34673
No. of failures =         1015
Time at risk    =  34262.67945
                                                   LR chi2(3)      =      1.28
Log likelihood  =   -9194.8575                     Prob > chi2     =    0.7350
------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        year |
       2010  |   1.160537   .1907743     0.91   0.365     .8408812    1.601708
       2011  |   .9716829    .153747    -0.18   0.856     .7125922    1.324976
       2012  |   1.028494   .1607485     0.18   0.857     .7571171    1.397141
------------------------------------------------------------------------------
HOWEVER, when I repeat this analysis including only patients who entered
the study in 2009 (baseline==1), this is what I get:
. stset t_exit if baseline==1, id(id_pz) failure(decesso_==1)
> origin(time t_enter) scale(365)
id: id_pz
failure event: decesso_ == 1
obs. time interval: (t_exit[_n-1], t_exit]
exit on or before: failure
t for analysis: (time-origin)/365
origin: time t_enter
if exp: baseline==1
------------------------------------------------------------------------------
34673 total obs.
4848 ignored at outset because of -if <exp>-
------------------------------------------------------------------------------
29825 obs. remaining, representing
8086 subjects
879 failures in single failure-per-subject data
29477.81 total analysis time at risk, at risk from t = 0
earliest observed entry t = 0
last observed exit t = 4
. strate year , per(1000)
failure _d: decesso_ == 1
analysis time _t: (t_exit-origin)/365
origin: time t_enter
id: id_pz
Estimated rates (per 1000) and lower/upper bounds of 95% confidence intervals
(29825 records included in the analysis)
+------------------------------------------------+
| year D Y Rate Lower Upper |
|------------------------------------------------|
| 2009 219 7.9698 27.479 24.070 31.370 |
| 2010 212 7.5069 28.241 24.684 32.310 |
| 2011 242 7.1824 33.694 29.705 38.218 |
| 2012 206 6.8188 30.211 26.355 34.631 |
+------------------------------------------------+
. stcox i.year
failure _d: decesso_ == 1
analysis time _t: (t_exit-origin)/365
origin: time t_enter
id: id_pz
Iteration 0: log likelihood = -7824.2973
Iteration 1: log likelihood = -7822.9333
Iteration 2: log likelihood = -7822.3438
Iteration 3: log likelihood = -7822.0976
Iteration 4: log likelihood = -7822
Iteration 5: log likelihood = -7821.9629
Iteration 6: log likelihood = -7821.949
[omitted]
Iteration 26: log likelihood = -7821.7751 (backed up)
Refining estimates:
Iteration 0: log likelihood = -7821.9409
Iteration 1: log likelihood = -7821.9409
Iteration 2: log likelihood = -7821.9408
[omitted]
Iteration 19: log likelihood = -7821.3159 (backed up)
Cox regression -- Breslow method for ties
No. of subjects =         8086                     Number of obs   =     29825
No. of failures =          879
Time at risk    =   29477.8137
                                                   LR chi2(2)      =      5.96
Log likelihood  =   -7821.3159                     Prob > chi2     =    0.0507
------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        year |
       2010  |   1.81e+24   1.01e+30     0.00   1.000            0           .
       2011  |   1.26e+13   4.58e+18     0.00   1.000            0           .
       2012  |   73.05721          .        .       .            .           .
------------------------------------------------------------------------------
The number of events and the person-time in 2009 (the reference year)
seem large enough, and the rates are still reasonable, but the Cox model
seems to fail: the iterations back up and the hazard ratios are
implausible.
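One check that may help here, offered as a sketch rather than a diagnosis:
if all baseline==1 patients share the same t_enter (an assumption, suggested
by the 2009 rows in the listing), then each calendar year would cover its
own separate stretch of analysis time, and no risk set would contain records
from two different years. The range of _t0 and _t by year shows whether that
is the case.

```stata
* Run after: stset t_exit if baseline==1, ...
* Assumption to verify: t_enter is identical for all baseline==1 patients in 2009
summarize t_enter if baseline==1 & year==2009
* Range of analysis time covered by each calendar year (st sample only)
tabstat _t0 _t if _st==1, by(year) statistics(min max)
```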
Any help will be very much welcome!
Best regards,
Alessandro Marcon
--
Alessandro Marcon, PhD
Unit of Epidemiology & Medical Statistics
Department of Public Health and Community Medicine
University of Verona
Strada Le Grazie 8, 37134 Verona, Italy
tel. +39 045 8027668 fax +39 045 8027154