Re: st: QUERY: unrealistic results with stcox (Multiple-record-per-subject survival data)
From: Steve Samuels <[email protected]>
To: [email protected]
Subject: Re: st: QUERY: unrealistic results with stcox (Multiple-record-per-subject survival data)
Date: Wed, 17 Jul 2013 19:19:30 -0500
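See the FAQ at
http://www.stata.com/support/faqs/statistics/stcox-producing-missing-standard-errors/
in particular reason 4:
"4) Covariate does not vary within death event risk sets."
When you subset to those entering in 2009, every subject starts the
clock at the same calendar date, so analysis time and calendar year
move together and year does not vary within risk sets.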
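The FAQ point can be made explicit with the Cox partial likelihood. The
notation below is not from the thread and is only a sketch: x_i is the
vector of year indicators for subject i, x_(j) is its value for the
subject failing at the j-th failure time, R_j is the risk set at that
time and n_j its size. It assumes no tied failure times and that year is
exactly constant, equal to c_j, within each risk set.

```latex
L(\beta) \;=\; \prod_{j} \frac{\exp\!\left(x_{(j)}'\beta\right)}
                              {\sum_{i \in R_j} \exp\!\left(x_i'\beta\right)},
\qquad
x_i = c_j \ \ \forall\, i \in R_j
\;\Longrightarrow\;
\frac{\exp\!\left(c_j'\beta\right)}{n_j \exp\!\left(c_j'\beta\right)}
\;=\; \frac{1}{n_j}.
```

Under those assumptions every factor is free of beta, so the likelihood
is flat in the year coefficients and the optimizer has nothing to pin
them down. That is consistent with the huge hazard ratios and missing
standard errors in the second stcox run quoted below.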
Steve
[email protected]
On Jul 17, 2013, at 4:31 AM, Alessandro Marcon wrote:
Dear All,
I have repeated cross-sectional (panel) data in which "id_pz" is the
patient's unique id, "year" ranges from 2009 to 2012, and the event of
interest is "decesso_" (death).
Entry into and exit from the study are recorded in "t_enter" and
"t_exit", respectively.
>       +--------------------------------------------+
>       | id_pz   year   decesso_   t_enter   t_exit |
>       |--------------------------------------------|
> 1249. |   388   2009          0     17898    18262 |
> 1250. |   388   2010          0     18263    18627 |
> 1251. |   388   2011          0     18628    18992 |
> 1252. |   388   2012          1     18993    19152 |
>       |--------------------------------------------|
> 1253. |   389   2009          0     17898    18262 |
> 1254. |   389   2010          1     18263    18546 |
>       |--------------------------------------------|
> 1255. |   390   2012          0     18993    19358 |
>       |--------------------------------------------|
> 1256. |   391   2009          0     17898    18262 |
> 1257. |   391   2010          0     18263    18627 |
> 1258. |   391   2011          0     18628    18992 |
> 1259. |   391   2012          0     18993    19358 |
>       |--------------------------------------------|
> 1260. |   392   2009          0     17898    18262 |
> 1261. |   392   2010          0     18263    18627 |
> 1262. |   392   2011          0     18628    18992 |
> 1263. |   392   2012          0     18993    19358 |
>       |--------------------------------------------|
Patients can enter one or more of 4 years of observation (2009-2012)
like this:
> . xtdescribe
>
>    id_pz:  1, 2, ..., 10998                              n =      10998
>     year:  2009, 2010, ..., 2012                         T =          4
>            Delta(year) = 1 unit
>            Span(year)  = 4 periods
>            (id_pz*year uniquely identifies each observation)
>
> Distribution of T_i:   min      5%     25%     50%     75%     95%     max
>                          1       1       2       4       4       4       4
>
>      Freq.  Percent    Cum. |  Pattern
>  ---------------------------+---------
>      6809     61.91   61.91 |  1111
>      1017      9.25   71.16 |  ...1
>       810      7.36   78.52 |  ..11
>       520      4.73   83.25 |  .111
>       432      3.93   87.18 |  111.
>       428      3.89   91.07 |  1...
>       361      3.28   94.35 |  11..
>       296      2.69   97.04 |  ..1.
>       183      1.66   98.71 |  .1..
>       142      1.29  100.00 | (other patterns)
>  ---------------------------+---------
>     10998    100.00         |  XXXX
I want to analyse survival of this dynamic cohort. Since I have
"Multiple-record-per-subject survival data", I stset my data like this:
> stset t_exit, id(id_pz) failure(decesso_==1) origin(time t_enter) scale(365)
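To see how this stset maps the yearly records onto analysis time, one
can list the st variables for a single patient. This is only a sketch:
it reuses the variable names from the post and picks patient 388 from
the listing above purely as an illustration.

```stata
* Sketch only: declare the data as in the post, then inspect what stset created.
stset t_exit, id(id_pz) failure(decesso_==1) origin(time t_enter) scale(365)

* _t0 and _t are the start and end of each record in analysis time (years),
* _d flags the failure, and _st marks the records used in the analysis.
* Patient 388 is just an example id taken from the listing in the post.
list id_pz year t_enter t_exit _t0 _t _d _st if id_pz == 388, noobs sepby(id_pz)

* Summarize the record structure: records per subject, entry and exit times, gaps.
stdescribe
```

Comparing _t0 and _t with year for a few patients who enter in different
years shows whether calendar year and analysis time are tied together.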
This is what I get when computing annual rates and testing by Cox model:
> . strate year, per(1000)
>
> failure _d: decesso_ == 1
> analysis time _t: (t_exit-origin)/365
> origin: time t_enter
> id: id_pz
>
> Estimated rates (per 1000) and lower/upper bounds of 95% confidence intervals
> (34673 records included in the analysis)
>
>   +------------------------------------------------+
>   | year     D        Y     Rate    Lower    Upper |
>   |------------------------------------------------|
>   | 2009   219   7.9698   27.479   24.070   31.370 |
>   | 2010   240   8.2815   28.980   25.536   32.889 |
>   | 2011   281   8.8487   31.756   28.252   35.695 |
>   | 2012   275   9.1627   30.013   26.667   33.778 |
>   +------------------------------------------------+
>
>
> . stcox i.year
>
> failure _d: decesso_ == 1
> analysis time _t: (t_exit-origin)/365
> origin: time t_enter
> id: id_pz
>
> Iteration 0: log likelihood = -9195.495
> Iteration 1: log likelihood = -9194.8578
> Iteration 2: log likelihood = -9194.8575
> Iteration 3: log likelihood = -9194.8575
> Refining estimates:
> Iteration 0: log likelihood = -9194.8575
>
> Cox regression -- Breslow method for ties
>
> No. of subjects =        10998                     Number of obs   =     34673
> No. of failures =         1015
> Time at risk    =  34262.67945
>                                                    LR chi2(3)      =      1.28
> Log likelihood  =   -9194.8575                     Prob > chi2     =    0.7350
>
> ------------------------------------------------------------------------------
>           _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         year |
>        2010  |   1.160537   .1907743     0.91   0.365     .8408812    1.601708
>        2011  |   .9716829    .153747    -0.18   0.856     .7125922    1.324976
>        2012  |   1.028494   .1607485     0.18   0.857     .7571171    1.397141
> ------------------------------------------------------------------------------
HOWEVER, when I repeat this analysis including only patients who enter
the study in 2009 (baseline==1), this is what I get:
>
> . stset t_exit if baseline==1, id(id_pz) failure(decesso_==1) origin(time t_enter) scale(365)
>
> id: id_pz
> failure event: decesso_ == 1
> obs. time interval: (t_exit[_n-1], t_exit]
> exit on or before: failure
> t for analysis: (time-origin)/365
> origin: time t_enter
> if exp: baseline==1
>
> ------------------------------------------------------------------------------
> 34673 total obs.
> 4848 ignored at outset because of -if <exp>-
> ------------------------------------------------------------------------------
> 29825 obs. remaining, representing
> 8086 subjects
> 879 failures in single failure-per-subject data
> 29477.81 total analysis time at risk, at risk from t = 0
> earliest observed entry t = 0
> last observed exit t = 4
>
> . strate year , per(1000)
>
> failure _d: decesso_ == 1
> analysis time _t: (t_exit-origin)/365
> origin: time t_enter
> id: id_pz
>
> Estimated rates (per 1000) and lower/upper bounds of 95% confidence intervals
> (29825 records included in the analysis)
>
>   +------------------------------------------------+
>   | year     D        Y     Rate    Lower    Upper |
>   |------------------------------------------------|
>   | 2009   219   7.9698   27.479   24.070   31.370 |
>   | 2010   212   7.5069   28.241   24.684   32.310 |
>   | 2011   242   7.1824   33.694   29.705   38.218 |
>   | 2012   206   6.8188   30.211   26.355   34.631 |
>   +------------------------------------------------+
>
>
> . stcox i.year
>
> failure _d: decesso_ == 1
> analysis time _t: (t_exit-origin)/365
> origin: time t_enter
> id: id_pz
>
> Iteration 0: log likelihood = -7824.2973
> Iteration 1: log likelihood = -7822.9333
> Iteration 2: log likelihood = -7822.3438
> Iteration 3: log likelihood = -7822.0976
> Iteration 4: log likelihood = -7822
> Iteration 5: log likelihood = -7821.9629
> Iteration 6: log likelihood = -7821.949
> [omitted]
> Iteration 26: log likelihood = -7821.7751 (backed up)
> Refining estimates:
> Iteration 0: log likelihood = -7821.9409
> Iteration 1: log likelihood = -7821.9409
> Iteration 2: log likelihood = -7821.9408
> [omitted]
> Iteration 19: log likelihood = -7821.3159 (backed up)
>
> Cox regression -- Breslow method for ties
>
> No. of subjects =         8086                     Number of obs   =     29825
> No. of failures =          879
> Time at risk    =   29477.8137
>                                                    LR chi2(2)      =      5.96
> Log likelihood  =   -7821.3159                     Prob > chi2     =    0.0507
>
> ------------------------------------------------------------------------------
>           _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         year |
>        2010  |   1.81e+24   1.01e+30     0.00   1.000            0           .
>        2011  |   1.26e+13   4.58e+18     0.00   1.000            0           .
>        2012  |   73.05721          .        .       .            .           .
> ------------------------------------------------------------------------------
>
The number of events and the person-time in 2009 seem large enough, and
the rates are still reasonable, but the Cox model seems to fail.
Any help would be very welcome!
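One rough way to check whether the failure comes from year being tied to
analysis time in the baseline==1 subset is to cross-tabulate year against
bands of analysis time. This is a sketch, not a formal risk-set check:
t_band is a made-up helper variable, and the one-year bands assume the
yearly record layout shown in the listing.

```stata
* Sketch only: redeclare the data on the 2009-entry subset, as in the post.
stset t_exit if baseline==1, id(id_pz) failure(decesso_==1) origin(time t_enter) scale(365)

* t_band (hypothetical name): the one-year band of analysis time in which
* each record ends, i.e. (0,1] -> 1, (1,2] -> 2, and so on.
generate byte t_band = ceil(_t) if _st == 1

* If each row has a single nonzero cell, year is determined by analysis time,
* so it cannot vary among subjects at risk at the same failure time.
tabulate t_band year if _st == 1
```

Running the same tabulation after the original stset on the full sample
gives a point of comparison, since patients entering after 2009 reach a
given analysis time in a later calendar year.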
Best regards,
Alessandro Marcon
--
Alessandro Marcon, PhD
Unit of Epidemiology & Medical Statistics
Department of Public Health and Community Medicine
University of Verona
Strada Le Grazie 8, 37134 Verona, Italy
tel. +39 045 8027668 fax +39 045 8027154
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/