Title | Failure time, censoring time, and entry time in the Cox model |
Author | William Gould, StataCorp |
Consider a subject who enters at t0 and dies at t1.
Stata interprets the interval as (t0,t1], open on the left and closed on the right, or equivalently, as t such that t0 < t <= t1. By that logic, t0=t1=0 makes no sense since it results in the interval (0,0]; that is, the interval would be 0 < t <= 0, which contains no time at all.
Mechanically, when events happen at the same time, Stata interprets them as really occurring in the order of failures, censorings, and finally entries. Thus, a subject who entered and died at the same time would first die and then enter the sample.
What you probably mean by this is that subjects entered at time 0 and then, almost instantly, died. If this is the case, you need to change your death times to 0+epsilon, where epsilon is some small number.
Choose epsilon so that 0+epsilon is less than the time of the first death after time 0. When fitting a Cox model, any value of epsilon that meets that constraint will lead to the same results.
Say the earliest failure among those failing after time 0 is time 1. You could type
. replace time = .1 if time==0
. stset ...
. stcox ...
stcox would report the same results if we changed the time 0 deaths to occur at time .2:
. replace time = .2 if time==0
. stset ...
. stcox ...
The Cox proportional hazards model is sensitive only to the ordering of the failure events, so as long as we keep the earliest failure events occurring first, the results will remain unchanged.
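To make the ordering argument concrete, here is a minimal sketch in Python (not Stata, and not Stata's implementation) that lists who fails and who is at risk at each failure time. The five-subject dataset, the function names, and the rule "at risk when t0 < t <= t1" are assumptions made for illustration. With either choice of epsilon, the sequence of failures and risk sets is identical, and that sequence is all the ordering-based argument relies on.

```python
# Illustrative sketch only; this is not Stata's implementation.
def failure_risk_sets(records):
    """records: list of (subject, t0, t1, failed).

    A subject is treated as at risk at time t when t0 < t <= t1.
    Returns, in time order, (failing subjects, risk set) per failure time.
    """
    times = sorted({t1 for _, _, t1, failed in records if failed})
    out = []
    for t in times:
        failing = sorted(s for s, _, t1, failed in records if failed and t1 == t)
        at_risk = sorted(s for s, t0, t1, _ in records if t0 < t <= t1)
        out.append((failing, at_risk))
    return out


def with_epsilon(eps):
    # Hypothetical data: subjects 1 and 2 were recorded as dying at time 0;
    # the earliest later failure is at time 1.
    raw = [(1, 0, 0, True), (2, 0, 0, True), (3, 0, 1, True),
           (4, 0, 2, False), (5, 0, 3, True)]
    return [(s, t0, eps if t1 == 0 else t1, d) for s, t0, t1, d in raw]


same = failure_risk_sets(with_epsilon(0.1)) == failure_risk_sets(with_epsilon(0.2))
print(same)  # True
```

An epsilon larger than 1 would move the time-0 deaths after the failure at time 1, change the sequence, and so break the constraint described above.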
A related problem arises when you explicitly specify both the entry and the exit times and some subjects have an entry time equal to their exit time:
. stset t1, failure(outcome) enter(t0) ...
The solution is
. replace t0 = t0 - .1 if t0==t1
. stset t1, failure(outcome) enter(t0) ...
where .1 is like epsilon in the previous case; it is a small number that does not change the ordering of events.
We shift the entry time back, not the failure time forward. To understand why, let’s say
Subject A enters and dies at 5.
Subject B enters at 0 and dies at 5.
This means that subjects A and B died at the same time. Thus we must keep them dying at the same time and so shift the entry time of subject A to be just a little before time 5. If we instead shifted subject A’s death time forward a little bit, we would be saying that subject A died after B.
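The difference between the two shifts can be checked with a small Python sketch (again an illustration, not Stata's implementation). It assumes a subject is at risk at time t when t0 < t <= t1; the helper name and the shift size of .1 are hypothetical.

```python
# Illustrative sketch only; this is not Stata's implementation.
def risk_set_at(records, t):
    """Subjects at risk at time t, using the rule t0 < t <= t1."""
    return sorted(name for name, t0, t1 in records if t0 < t <= t1)


original      = [("A", 5.0, 5.0), ("B", 0.0, 5.0)]  # A has an empty interval
entry_back    = [("A", 4.9, 5.0), ("B", 0.0, 5.0)]  # shift A's entry back
death_forward = [("A", 5.0, 5.1), ("B", 0.0, 5.0)]  # shift A's death forward

print(risk_set_at(original, 5.0))       # ['B']
print(risk_set_at(entry_back, 5.0))     # ['A', 'B']
print(risk_set_at(death_forward, 5.0))  # ['B']
print(risk_set_at(death_forward, 5.1))  # ['A']
```

Only the entry shift puts A and B in the same risk set at time 5. Shifting the death forward turns one tied failure time into two separate ones.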
Stata orders the events occurring at the same time as failures, then censorings, then entries. Think about the following:
Subject C: enters at 0, censored at 5
Subject D: enters at 0, fails at 5
Could subject C have died at time 5? That is, was subject C in the risk pool when D died?
Answer: yes. Here is how it happened:
At time 0:
First, deaths (remove from risk pool): none.
Then, censorings (remove from risk pool): none.
Finally, entries (add to risk pool): C and D.
At time 5:
First, deaths (remove from risk pool): D
Then, censorings (remove from risk pool): C
Finally, entries (add to risk pool): none
Therefore, C was in the risk pool when D died.
Is that what you meant when you wrote that Subject C was censored at 5 and D died at 5? If what you mean is that Subject C could not have died at time 5, you need to change Subject C's censoring time to be 5 minus a little, to make it, say, 4.9.
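The walk-through above can be sketched as a short Python simulation. This illustrates the stated ordering (failures, then censorings, then entries) and is not Stata's implementation; the function name and data layout are assumptions.

```python
# Illustrative sketch only; this is not Stata's implementation.
def risk_pool_at_each_death(subjects):
    """subjects: list of (name, entry, exit, died).

    Processes tied events as failures, then censorings, then entries.
    Returns {death_time: risk pool at the moment the deaths occur}.
    """
    times = sorted({t for _, t0, t1, _ in subjects for t in (t0, t1)})
    pool, seen = set(), {}
    for t in times:
        deaths = {n for n, _, t1, died in subjects if t1 == t and died}
        censored = {n for n, _, t1, died in subjects if t1 == t and not died}
        entries = {n for n, t0, _, _ in subjects if t0 == t}
        if deaths:
            seen[t] = sorted(pool)  # who was at risk when the deaths occurred
        pool -= deaths              # first, deaths leave the pool
        pool -= censored            # then, censorings leave the pool
        pool |= entries             # finally, entries join the pool
    return seen


print(risk_pool_at_each_death([("C", 0, 5, False), ("D", 0, 5, True)]))
# {5: ['C', 'D']}
print(risk_pool_at_each_death([("C", 0, 4.9, False), ("D", 0, 5, True)]))
# {5: ['D']}
```

With C censored at 5, C is still in the pool when D dies. Moving the censoring to 4.9 removes C before the death is processed.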