I welcome comments from anyone, but this is probably one for the Stata
users.

I came across this issue in a real dataset I am analyzing, but I have
simplified it here to clarify the point.

Consider the following (contrived) failure-time dataset with 12
observations (6 in each of two experimental groups). All 12 observations
are events (i.e., there are no censored observations):

     failure event:  fail == 1
obs. time interval:  (0, t]
 exit on or before:  failure

------------------------------------------------------------------------------
       12  total obs.
        0  exclusions
------------------------------------------------------------------------------
       12  obs. remaining, representing
       12  failures in single record/single failure data
       44  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =         8

I then use -stsum- and -stci- to look at the median survival times for the
two groups. For group 2, however, -stsum- gives me a median of 4 while
-stci- gives me a median of 2. My understanding is that both should report
the Kaplan-Meier (KM) estimate of the median (which, from the data, should
be 2). Is there a discrepancy in the way the two commands compute the
survival-time percentiles?
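
To make the question concrete, here is a minimal sketch (in Python, not Stata) of one way two KM median calculations can disagree: the percentile is taken as the smallest t with S(t) <= 0.5 in one case and S(t) < 0.5 in the other. I have not confirmed that this is what -stsum- and -stci- actually do. The function names are my own, and the times are hypothetical values chosen only so that S(t) equals exactly 0.5 at one failure time; they are not my real data.

```python
from fractions import Fraction


def km_curve(times):
    """Kaplan-Meier survivor function for uncensored failure times.

    Returns a list of (t, S(t)) pairs, one per distinct failure time.
    Exact fractions are used so that S(t) == 1/2 is detected reliably.
    """
    n_at_risk = len(times)
    surv = Fraction(1)
    curve = []
    for t in sorted(set(times)):
        d = times.count(t)  # number of failures at time t
        surv *= Fraction(n_at_risk - d, n_at_risk)
        curve.append((t, surv))
        n_at_risk -= d
    return curve


def km_percentile(times, p, strict=False):
    """Smallest t with S(t) <= 1 - p/100 (or S(t) < 1 - p/100 if strict)."""
    target = 1 - Fraction(p, 100)
    for t, s in km_curve(times):
        reached = (s < target) if strict else (s <= target)
        if reached:
            return t
    return None  # percentile not reached


# Hypothetical times (NOT the poster's data): S(2) is exactly 1/2.
hypothetical_times = [1, 2, 2, 4, 6, 8]

print(km_percentile(hypothetical_times, 50))               # 2
print(km_percentile(hypothetical_times, 50, strict=True))  # 4
```

With a tie at S(t) = 0.5, the two conventions give different medians, so a difference of this kind would be enough to produce two different answers from the same KM curve.
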

. stsum, by(group)

         failure _d:  fail == 1
   analysis time _t:  t

         |               incidence       no. of    |------ Survival time -----|
group    | time at risk     rate        subjects        25%       50%       75%
---------+---------------------------------------------------------------------
       1 |           21   .2857143             6          2         3         5
       2 |           23   .2608696             6          2         4         6
---------+---------------------------------------------------------------------
   total |           44   .2727273            12          2         3         6