st: Set up multiple failure data with interval censoring
From: "Benigno Rodriguez G., MD" <[email protected]>
To: [email protected]
Subject: st: Set up multiple failure data with interval censoring
Date: Wed, 07 Feb 2007 21:56:57 -0500

Hi, all:
I have a dataset where subjects were seen at the time of an 
intervention and 9 times thereafter. The failure event is a cell 
count greater than 200 after the intervention. All subjects had a 
cell count below 200 at baseline (i.e., no left censoring). Some 
covariates include baseline cell count (w0 below) and a dichotomous 
"region" variable. Failure can occur one time, multiple times, or 
not at all for each individual during follow up. The question is 
whether region is associated with time to failure, and secondarily, 
estimating overall time spent with a cell count over 200. Time is 
measured in weeks.
I found the article by Mario Cleves (STB-49, ssa13) incredibly 
useful, and to my mind, the visits in these subjects are closely 
spaced enough that I would feel comfortable treating time as 
continuous. But one feature of the data that I think makes it 
necessary to treat it as interval censored is the fact that an 
individual is at risk only while having a cell count below 200, and 
this can happen intermittently during follow up.
The data look like this:
id  region  w0   w2   w4   w8   w12  w16  w24  w32  w40  w48
1   2       96   213  211  275  207  295  275  388  452  349
2   1       113  355  302  251  254  230  167  162  150  108
3   2       125  138  146  166  113  131  134  146  146  249
4   1       126  291  282  339  409  330  198  341  260  201
5   1       88   197  229  186  163  .    257  204  245  308
Replacing the above counts with just a status indicator:
id  region  w0   w2   w4   w8   w12  w16  w24  w32  w40  w48
1   2       96   1    1    1    1    1    1    1    1    1
2   1       113  1    1    1    1    1    0    0    0    0
3   2       125  0    0    0    0    0    0    0    0    1
4   1       126  1    1    1    1    1    0    1    1    1
5   1       88   0    1    0    0    .    1    1    1    1
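
As a minimal sketch of the recoding above (written in Python purely for illustration; the names WEEKS, THRESHOLD, counts and status are made up for this example), each count is mapped to 1 if it is greater than 200 and to 0 otherwise, and a missing visit is kept as missing rather than being coded 0 or 1:

```python
# Visit weeks after baseline, as in the listing above.
WEEKS = [2, 4, 8, 12, 16, 24, 32, 40, 48]
THRESHOLD = 200  # the failure event is a cell count greater than 200

# Post-baseline cell counts for the five example subjects.
# None marks the missing week-16 visit of id 5.
counts = {
    1: [213, 211, 275, 207, 295, 275, 388, 452, 349],
    2: [355, 302, 251, 254, 230, 167, 162, 150, 108],
    3: [138, 146, 166, 113, 131, 134, 146, 146, 249],
    4: [291, 282, 339, 409, 330, 198, 341, 260, 201],
    5: [197, 229, 186, 163, None, 257, 204, 245, 308],
}


def status(count):
    """Return 1 if count > THRESHOLD, 0 if not, None if the visit is missing."""
    if count is None:
        return None
    return 1 if count > THRESHOLD else 0


indicators = {sid: [status(c) for c in row] for sid, row in counts.items()}

for sid, row in indicators.items():
    print(sid, row)
```

The explicit missing-value check matters: a naive "count > 200" comparison can silently turn a missing visit into a 0 or a 1, depending on how the software orders missing values.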
Several features of the data make it unclear how to set up the 
dataset: (a) id=1 has the event at each time point and is therefore 
not at risk after week 2; (b) id=2 only becomes at risk at week 24; 
(c) id=3 only fails on the day of the last observation; (d) id=4 
becomes at risk at week 24, but then again is no longer at risk at 
week 32 AND fails at the end of follow up; (e) id=5 has a missing 
observation at week 16.
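
To make features (a) through (e) concrete, here is one possible long-format layout, sketched in Python for illustration only. It is not one of the approaches reviewed by Cleves and not a recommendation. It rests on three assumptions: a subject is at risk over a gap between visits only if the count at the start of the gap was not above 200; everyone starts at risk at week 0; and a missing visit is bridged by running the interval from the last observed visit to the next observed one. The function name risk_intervals is hypothetical.

```python
WEEKS = [2, 4, 8, 12, 16, 24, 32, 40, 48]

# Status indicators from the second listing (1 = count above 200,
# 0 = not above 200, None = missing visit).
indicators = {
    1: [1, 1, 1, 1, 1, 1, 1, 1, 1],
    2: [1, 1, 1, 1, 1, 0, 0, 0, 0],
    3: [0, 0, 0, 0, 0, 0, 0, 0, 1],
    4: [1, 1, 1, 1, 1, 0, 1, 1, 1],
    5: [0, 1, 0, 0, None, 1, 1, 1, 1],
}


def risk_intervals(weeks, statuses):
    """Return (start, stop, event) tuples for the gaps in which the subject
    is at risk; event is 1 if the count is above 200 at the stop visit."""
    intervals = []
    prev_week, prev_status = 0, 0  # assumption: at risk from baseline (week 0)
    for week, st in zip(weeks, statuses):
        if st is None:
            continue  # assumption: bridge over a missing visit
        if prev_status == 0:
            intervals.append((prev_week, week, st))
        prev_week, prev_status = week, st
    return intervals


for sid, statuses in indicators.items():
    for start, stop, event in risk_intervals(WEEKS, statuses):
        print(sid, start, stop, event)
```

Under these assumptions id=1 contributes the single interval (0, 2] with a failure, id=4 contributes (0, 2] and (24, 32], both ending in failure, and id=5's missing week-16 visit yields one wider interval (12, 24] ending in failure. Each failure is only known to lie somewhere inside its interval, which is where the interval-censoring question arises.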
My questions are: 1) Which of the approaches nicely reviewed by 
Cleves would be recommended here, if any? (and if none, can you 
suggest an alternative approach); and 2) Could anybody suggest how to 
set up the data to account for the above peculiarities of these records?
Thanks,
BENIGNO RODRIGUEZ G., MD
Assistant Professor of Medicine
Case Western Reserve University 