st: Re: Suggestion on Cox and left truncation
Copy again to Statalist
I'm afraid there is not much you can do, since your design does not
record censored durations. Consequently, only the shortest spells are
observed (I abstract from the left truncation problem). Running -stcox-
on your data will give biased estimates, the size of the bias depending
on the relative length of a typical wait versus the length of the
observation window (i.e. on the number of individuals excluded from the
dataset by your design).
In the following example, I draw from a Weibull distribution with a
coefficient of 1 for the explanatory variable. The beginning date is
uniform between zero and a scalar named "span" (span=2 in the example
below). The end date is begin+duration. If the end date is greater than
"span", I censor the duration accordingly. I then estimate a Cox model
with and without the censored observations.
As you can see, the second estimation gives biased estimates, while the
first does not (remember that the true parameter is 1). You can modify the
value of "span" to see how the bias varies as the observation window
gets bigger while the data generating process is kept constant.
/*----------------------------------------------------------*/
clear
* program to draw from a Weibull distribution
* with shape parameter alpha
* and scale parameter lambda,
* i.e. survivor function S(t) = exp(-lambda*t^alpha)
cap prog drop draweib
program define draweib
    syntax newvarlist [if] [in] , LAmbda(string) ALpha(string) [double]
    tokenize "`varlist'"
    while "`1'"!="" {
        * temporary variable holding lambda^(1/alpha)
        tempvar vlambda
        gen double `vlambda' = exp(ln(`lambda')/`alpha')
        * inverse-transform draw: T = (-ln(U))^(1/alpha) / lambda^(1/alpha)
        g `double' `1' =((log(1/runiform()))^(1/`alpha'))/`vlambda' `if' `in'
        drop `vlambda'
        mac shift
    }
end
set obs 10000
g x=runiform()
draweib dur, alpha(1) lambda(exp(x))
scalar span=2
g beg=runiform()*span
g end=beg+dur
g fail=end<=span
replace end=span if end>span
replace dur=end-beg
stset dur, f(fail)
stcox x, nohr
keep if fail
stcox x, nohr
/*----------------------------------------------------------*/
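To see how the bias changes with the window length, the example above can be wrapped in a loop over several values of "span". This is only a sketch under the same assumptions as the example (alpha=1, true coefficient 1); the seed, the list of spans and the local names b_all and b_fail are arbitrary choices of mine, and the durations are drawn inline so that the block runs without -draweib-.

```stata
* Sketch: repeat the simulation for several window lengths
* (alpha = 1, lambda = exp(x), so the true coefficient on x is 1)
set seed 12345
foreach span in 0.5 1 2 5 10 {
    quietly {
        clear
        set obs 10000
        gen x    = runiform()
        gen dur  = -ln(runiform())/exp(x)      // exponential duration with hazard exp(x)
        gen beg  = runiform()*`span'           // start date, uniform on (0, span)
        gen fail = (beg + dur <= `span')       // 1 if the spell ends inside the window
        replace dur = `span' - beg if !fail    // censor spells running past the window
        stset dur, failure(fail)
        stcox x, nohr
        local b_all = _b[x]                    // estimate using censored and completed spells
        keep if fail                           // mimic a design that records completed spells only
        stcox x, nohr
        local b_fail = _b[x]
    }
    display "span = `span'" _col(14) "all spells: " %6.3f `b_all' ///
        _col(40) "completed spells only: " %6.3f `b_fail'
}
```

Comparing the two columns across rows shows how far the estimate based on completed spells alone sits from the one that keeps the censored spells, for each window length.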
[email protected] wrote:
Quoting Antoine Terracol <[email protected]>:
Dear Antoine,
thanks a lot for your reply.
Our study records only the surgeries performed in the observation
window, i.e. between 2006 and 2008. This is because hospital
statistics first record the date of surgery and then add the
date of registration. We do not have a registration date without the
date of surgery. So, by study design, all individuals in the data have
already been through surgery by the end of the observation window.
Accordingly, the following types of individuals are included in our
sample: 0,1,3 while
(2) diagnosed<2006, surgery>2008 => NOT observed
(4) diagnosed>2006, surgery>2008 => NOT observed
So, what do you think?
Many thanks.
Giuliana
Dear Giuliana,
I'm copying this reply to Statalist for anyone to comment on.
Let me rephrase your setup to see if I got it right.
All observed exits take place between 2006 and 2008. Some individuals
are diagnosed after 2006, some before. I assume that some individuals do
not exit (i.e. have not yet been through surgery at the end of the
observation window). The following types of individuals can be defined:
(0) diagnosed<2006, surgery<2006 => not observed
(1) diagnosed<2006, surgery in [2006,2008] => observed, left-truncated
with exit
(2) diagnosed<2006, surgery>2008 => observed, left-truncated and
right-censored
(3) diagnosed>2006, surgery in [2006,2008] => observed, no left truncation,
exit
(4) diagnosed>2006, surgery>2008 => observed, no left truncation but
right-censored
If your design allows types (1) to (4) to be included in your dataset,
then your -stset- looks ok, although I think there is no need for the
-time0()- option.
If your design is such that type-(2) individuals cannot be included in
your dataset (for example because you record only the registrations or
surgeries performed in the observation window), then individuals
diagnosed before 2006 will be observed because their spells are long
enough to end after 2006, but short enough to end before 2008. In this
case your sample will be biased, and I see no easy way to correct the
likelihood within the -st- suite. In this case I would drop the
individuals diagnosed<2006, and -stset- the data without the -enter()-
option.
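A minimal sketch of that last suggestion follows. The variable names are taken from the -stset- command quoted below; the simulated dates are purely illustrative stand-ins for the hospital records (they are not real data), and the seed, sample size and waiting-time distribution are arbitrary assumptions.

```stata
* Illustrative only: simulated dates stand in for the real hospital records
clear
set seed 2006
set obs 500
gen date_of_registration = mdy(1,1,2005) + floor(runiform()*4*365)
gen date_of_surgery = date_of_registration + 1 + floor(-ln(runiform())*200)
format date_of_registration date_of_surgery %td

* mimic the design: only surgeries performed in 2006-2008 are recorded
keep if inrange(date_of_surgery, mdy(1,1,2006), mdy(12,31,2008))
gen byte surgery = 1

* drop the spells that started before the observation window ...
drop if date_of_registration < mdy(1,1,2006)

* ... and -stset- with registration as the origin, without -enter()-
stset date_of_surgery, origin(time date_of_registration) failure(surgery)
stdescribe
```

With this setup every remaining spell starts inside the window, so analysis time runs from registration and no delayed entry needs to be declared.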
Best,
Antoine
[email protected] wrote:
Dear Dr Terracol,
I would like to ask something about my work after having seen some of
your comments on statalist forum.
I am studying the effect of education on WAITED times for
elective surgery using individual-level hospital data and applying Cox
estimation. Dates of surgery are observed between 2006 and 2008. I have the
following key variables:
date of registration (onset of the risk)
date of surgery.
So timeatrisk = date of surgery - date of registration (i.e. WAITED time).
However, some individuals became at risk before 2006 (the start of
our OBSERVATIONAL WINDOW), i.e. the date of registration is before
2006. This is because our study design is retrospective. How can I treat
such individuals when I -stset- the data to perform Cox regression? Is this
a case of left truncation?
I thought of the following:
stset date_of_surgery, origin(date_of_registration) enter(time
mdy(1,1,2006)) failure(surgery) time0(date_of_registration)
Thank you very much for your help.
Kind Regards.
Giuliana De Luca
----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.
----------------------------------------------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/