Shaun Fultz <[email protected]> asks:
> I have a dataset where subjects (id) are followed in one of two time periods
> (or both time periods) and at some point die or are censored. The data is
> arranged as multiple records per subject. For example
>   id  group  StartTime  EndTime  outcome
>    1      0          0      350        0
>    1      1        350      500        1
>    2      0          0      200        0
>    2      1        200      600        1
>    3      1          0      300        0
> When I use stset such as
> stset EndTime, id(id) time0(StartTime) failure(outcome)
> I get data like:
>   id  group  StartTime  EndTime  outcome   _t  _t0
>    1      0          0      350        0  350    0
>    1      1        350      500        1  500  350
>    2      0          0      200        0  200    0
>    2      1        200      600        1  600  200
>    3      1          0      300        0  300    0
> which looks appropriate, but when I try to graph a KM curve of the second
> time period (group==1) with
> sts graph if group==1
> I get 600 days worth of time, even though the maximum time anyone spends in
> the group 1 time period is 400 days.
> A colleague's suggestion was to treat each record like an individual
> subject, dropping the 'id(id)' parameter from the STSET command, but my
> understanding of Stata's survival analysis was that it should be able to
> handle multiple records per subject. I've reviewed the manuals and the books
> "Introduction to Survival Analysis using Stata" and have been unable to
> figure this out.
> If this is a problem with the KM curves, did it also occur with my 'stcox if
> group==1' commands?
> Sorry for the long explanation, but any help would be greatly appreciated.
>
Given the way you have -stset- your data, the KM curve given is the correct
one.
>   id  group  StartTime  EndTime  outcome   _t  _t0
>    1      0          0      350        0  350    0
>    1      1        350      500        1  500  350
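By examining _t0 and _t, we see that the subject with id == 1 became at risk
at analysis time _t0 = 0, changed from group 0 to group 1 at analysis time
_t = 350, and remained at risk in group 1 until failure at analysis time
_t = 500. The assumption here is that when the subject changed groups, his risk
clock did NOT reset to zero. Restricting to group==1 therefore keeps that
record on the interval (350, 500] and subject 2's group 1 record on (200, 600],
which is why the time axis runs out to 600 days. The same _t0 and _t are used
by -stcox if group==1-. This seems reasonable but may not be what you wanted.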
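To see this on the example itself, here is a minimal sketch that re-enters the
five records from the question and lists the at-risk intervals under the
original -stset-. Only the variable names from the question are used; the
comments describe what I would expect to see, not output I have pasted in.

```stata
* Toy data copied from the question
clear
input id group StartTime EndTime outcome
1 0   0 350 0
1 1 350 500 1
2 0   0 200 0
2 1 200 600 1
3 1   0 300 0
end

* Original setup: one risk clock per subject, never reset
stset EndTime, id(id) time0(StartTime) failure(outcome)

* Each record is at risk over (_t0, _t] on that single clock
list id group _t0 _t _d, sepby(id)

* Restricting to group 1 leaves those intervals where they are,
* so the two failures are reported at _t = 500 and _t = 600
sts list if group==1
```

The listing from -sts list- uses the same analysis times that -sts graph-
plots, so it is a quick way to check which clock a curve is drawn on.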
It seems to me that you want the risk clock reset to zero each time a subject
switches groups. For that, try
. gen TimeAtRisk = EndTime - StartTime
. stset TimeAtRisk, failure(outcome)
Note that you do not include -id(id)- in this alternate call to -stset-, but
you would want to cluster on this id variable if you fit a regression model
(such as Cox) and want "correct" standard errors.
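Put together, the clock-reset approach might look like the sketch below. It
assumes the five records from the question and nothing else. The -stcox- line
is left commented out because x is a hypothetical covariate that is not in the
posted data, and five records are too few to fit anything; -vce(cluster id)-
is the current way to request clustered standard errors, and I believe older
releases spell it -robust cluster(id)-.

```stata
* Toy data copied from the question
clear
input id group StartTime EndTime outcome
1 0   0 350 0
1 1 350 500 1
2 0   0 200 0
2 1 200 600 1
3 1   0 300 0
end

* Time spent in each period, measured from entry into that period
gen TimeAtRisk = EndTime - StartTime

* One clock per record, restarted at zero; note there is no id() here
stset TimeAtRisk, failure(outcome)

* Group 1 now runs over time spent in group 1 (150, 300 and 400 days)
sts list if group==1
sts graph if group==1

* Template only: x is hypothetical and must exist in your real data.
* Clustering on id allows for correlation between a subject's records.
* stcox x if group==1, vce(cluster id)
```

With the clock reset, the longest group 1 spell in the example is subject 2's
400 days, which matches the maximum the original poster expected to see.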
--Bobby
[email protected]