Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Main effect for time-varying covariate
From: Steve Samuels <[email protected]>
To: [email protected]
Subject: Re: st: Main effect for time-varying covariate
Date: Sun, 15 Sep 2013 12:35:52 -0400
Nicole:
> On Sep 15, 2013, at 1:31 AM, Nicole Boyle wrote:
>
> Steve, thanks for your suggestions!
>
>> But as your primary interest is a factor related to death from infectious disease, I wouldn't give "all other causes" much time or space.
>
> To briefly clarify, my primary outcome of interest is onset of
> infection, and the competing risk is all-cause mortality.
>
I didn't see that nuance. It changes the picture, because it means that you do not have a competing risks problem. For risks to be competing, only one can be observed (Kalbfleisch & Prentice, 2002, p. 248). In your case, infection and death can both be observed. The proper analysis, therefore, is a standard, single-response, survival model.
Steve
Kalbfleisch, J. D., and R. L. Prentice. 2002. The Statistical Analysis of Failure Time Data. Hoboken, NJ: Wiley.
>
>> To sum up my suggestions:
>> 1. Model causal associations (HRs) with -stcox- for infectious disease death
>> 2. Graph hazard functions from -stcox- for infectious disease death.
>
> Agreed on both 1 & 2, but wouldn't it be best to present results from
> both causes (infection and death)?
>
>> 3. Show betas, se's, and p-values for -stcrreg- results. The betas themselves (or the subHRs) are not important, except as they allow you to compare effects of different factors. The tests are important as they help draw conclusions about impact.
>
> I'm comparing the -stcox- results (cause=infection) with the -stcrreg-
> results, and they look about the same. If Z wasn't involved, these
> similarities could indicate that death plays a minimal role as a
> competing event, in that death doesn't seem to be differentially
> associated with the covariate--outcome relationship. But since Z is
> involved, interpreting this is a bit more convoluted.
>
>> 4. Show model-based -stcrreg- "future" CIFs for infectious disease deaths. For your binary Z, -stcurve- will easily show you the CIF for Z = 0 vs Z = 1 for all times. This is essentially the future CIF where the switch takes place at the start.
>
> I've never considered interpreting CIFs in this way, but this makes
> sense, since -stcrreg- seems more geared toward predicting, whereas
> cause-specific -stcox- seems better at describing relationships
> (within a pseudoreality).
>
>
>
> Generally speaking, I've become less enthusiastic with the
> cause-specific Cox model, at least for the purposes of this study.
> Although the Cox model avoids the Fine-Gray model's multiple records
> per subject assumption (where the value of Z is fixed
> post-failure-by-competing event), I'm not sure if this quality alone
> is redeeming enough:
>
I'm not sure what you mean. -stcox- can also fit a time-varying Z with multiple records. In this case, the hazards themselves, and their patterns, have substantive meaning.
> (1) Interpreting Cox model coefficients when cause=infection
> (censored=death) means imagining a world where infection and death are
> independent of one another. This feels like a MUCH greater leap than
> the Fine-Gray model's assumption about Z.
>
> (2) Interpreting cause-specific Cox coefficients seems more
> appropriate when studying the direct etiology of diseases, which isn't
> really my goal. The Fine-Gray model seems more appropriate for
> predicting individual risk factors which are translatable to clinical
> practice.
>
> I know that no regression method is perfect, but I would like to
> choose the least imperfect one. Although it would be nice to run and
> present all possible models, and generate and present all possible CIF
> plots, who can actually fit all of these results and their
> interpretations into a single publication, especially if you have
> multiple models at the onset (I have 4)?
>
> The following paper advocates this, and although I agree in principle,
> I don't see how this can be practically achieved given space and
> interpretability considerations, perhaps aside from situations with
> merely one model and just a few variables:
>
> Latouche, A., Allignol, A., Beyersmann, J., Labopin, M., & Fine, J. P.
> (2013). A competing risks analysis should report results on all
> cause-specific hazards and cumulative incidence functions. Journal of
> clinical epidemiology, 66(6), 648–53.
> doi:10.1016/j.jclinepi.2012.09.017
>
> I've recently stumbled upon a fantastic paper that elucidates much of
> Cox vs. Fine-Gray. Also, as Adam Olszewski previously discussed in
> this thread, this paper also discusses the flexible parametric
> survival model as a possible unified solution to that "run everything
> and report everything" approach we've been discussing.
>
> Lau, B., Cole, S. R., & Gange, S. J. (2009). Competing risk regression
> models for epidemiologic data. American journal of epidemiology,
> 170(2), 244–56. doi:10.1093/aje/kwp107
>
> Nicole
On Fri, Sep 13, 2013 at 6:33 AM, Steve Samuels <[email protected]> wrote:
>>
>> On Sep 10, 2013, at 5:39 PM, Nicole Boyle wrote:
>>
>> Steve, your thoughts mirror much of my inner dialogue recently! Thanks
>> for your input. It's got me thinking quite a bit (as you can see by
>> the lengthiness of this post).
>>
>>> My first question is one I asked earlier: On what grounds do you
>>> exclude the possibility of non-proportional hazards for Z?
>>
>> I treated Z the same way that I would treat a typical fixed variable
>> in a Cox model, since each individual's covariate value input into the
>> model, and not the coefficient value output from the model, depends on
>> time.
>>
>> After I split per each failure time and assigned Z's values per each
>> failure time per each subject, I fit the model for the outcome of
>> infection with Z as a fixed covariate:
>> . quietly stcox var1 var2 Z, schoenfeld(sch*) scaledsch(sca*)
>> The Therneau and Grambsch tests for non-zero slopes of the Schoenfeld
>> residuals were conducted:
>> . stphtest, detail
>> The scaled Schoenfeld residuals and their lowess smooths were also
>> plotted over time for visual assessment of PH:
>> . stphtest, plot(Z) msym(oh)
>>
>> The results of these analyses supported the notion of Z's independence
>> from time.
>
> I agree.
>
>>> Second, how do you plan to show the impact of that covariate on
>>> cumulative incidence?
>>
>> I haven't been entirely clear in my previous posts. My study concerns
>> describing incidence per hospital, and there are two hospitals. My
>> previous mention of plotting CIFs concerned plotting overall
>> cause-specific CIFs per each hospital (for both causes: infection
>> [outcome of interest] and death [competing]) before any covariates are
>> involved.
>>
>> But to answer your question, thus far, I have not considered plotting
>> anything (CIFs or hazard functions) for Cox cause-specific covariate
>> effects, let alone Z. With regards to the inherent underlying question
>> of "What effects, if any, should I present?" I see these as my
>> options:
>>
>> Option 1: Plot functions for EVERY covariate for BOTH cause-specific
>> versions (infection and death) for BOTH hospitals. That's a lot of
>> clutter.
>
>>
>> Option 2: Plot functions for only those covariates found to be
>> significant. I'm not comfortable with data-driven decisions, and this
>> feels data-driven.
>>
>> Option 3: Plot functions for only those covariates considered (a
>> priori) to be the most important in this analysis. Z might be
>> considered the "most important" covariate, as it was one of the
>> primary scientific questions. However, I'm not sure if this is the
>> best approach, since this study is largely exploratory, and only
>> plotting Z might take the limelight away from other [possibly more]
>> important covariates. Z's importance doesn't mean that other
>> covariates are necessarily UNimportant.
>>
>> Option 4: Plot no covariate effect functions. This avoids the
>> downfalls of options 1-3, but also doesn't add any information.
>>
>> Any suggestions?
>>
>>
> Option 3: I'd publish plots for the important covariates Z and hospital.
> But satisfy yourself about the correctness of the model for other
> covariates, and report that you checked, e.g., the proportionality
> assumptions. I'd also check
> interactions of Z with predictors and of hospital with other predictors.
> These are exploratory analyses, to be sure, but you'd want to report any striking
> findings.
>
>>> 3. Select several time points t* and estimate future CIFs for those who
>>> switch at each t* ("switchers") and for those who don't switch
>>> ("stayers"), but who are at risk at t*.
>>
>> This "switchers" and "stayers" method is a very interesting idea!
>> Thanks for explaining this; I'd never heard of it before.
>>
>> Wouldn't you have to make sure that those "stayers" remain "stayers"
>> throughout observation, or would it be sufficient enough to make sure
>> that the following two CIF functions look about the same?
>>
>> CIF for Z(t)=0, for all t (including switchers and never-switchers)
>> CIF for Z(t)=0, for all t (ONLY including never-switchers)
>>
>> Or is this^ what you meant by, "You can check this by estimating
>> separate F0s for the subgroup and its complement"?
> That's what I meant. I would use the version that included everyone who
> had Z = 0.
>> Do you know of a reference where I could learn more?
> No I don't.
>
> There's a "mover-stayer" literature to model the Z event process
> as a function of time (that's where I got the "stayer"),
> but that is irrelevant to your problem.
>
>
>>
>> Now, one more question of my own:
>> I've fit Cox models to estimate cause-specific covariate effects where
>> cause=infection [outcome of interest] and censored=death [our
>> "competing risk"]. However, I haven't yet considered fitting models
>> where cause=DEATH and censored=infection.
>>
>> With cause-specific Cox models, is it necessary to fit separate
>> models for the outcome of interest (cause=infection) AND the competing
>> cause (cause=death)? Or is it considered good practice to only fit
>> models for the outcome of interest (cause=infection)?
>
> Phil Clayton quoted advice from Jason Fine to model and present results for each outcome. That can be worthwhile. But as your primary interest is a factor related to death from infectious disease, I wouldn't give "all other causes" much time or space.
> I would present unadjusted CIFs for deaths from infectious disease, other deaths, and all deaths, overall and, possibly, by hospital.
>
>
> To sum up my suggestions:
>
> 1. Model causal associations (HRs) with -stcox- for infectious disease death
> 2. Graph hazard functions from -stcox- for infectious disease death.
> 3. Show betas, se's, and p-values for -stcrreg- results. The betas themselves (or the subHRs) are not important, except as they allow you to compare effects of different factors. The tests are important as they help draw conclusions about impact.
> 4. Show model-based -stcrreg- "future" CIFs for infectious disease deaths. For your binary Z, -stcurve- will easily show you the CIF for Z = 0 vs Z = 1 for all times. This is essentially the future CIF where the switch takes place at the start.
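
The four suggestions above might look roughly like this in Stata. This is only a sketch under stated assumptions: it uses Stata's shipped -hypoxia- example data (fetched by -webuse-, so it needs an internet connection), with the binary covariate pelnode standing in for a Z that is fixed from the start, failtype == 1 standing in for the event of interest, and failtype == 2 for the competing event. None of these names come from the data discussed in this thread.

```stata
* Stata's example data; pelnode (binary) is a stand-in for a fixed-at-baseline Z
webuse hypoxia, clear

* Event of interest: failtype == 1; competing event: failtype == 2
stset dftime, failure(failtype == 1)

* Suggestions 1 and 2: cause-specific Cox model, then smoothed hazard curves
stcox ifp tumsize pelnode
stcurve, hazard at1(pelnode=0) at2(pelnode=1)

* Suggestion 3: Fine-Gray model; replay with noshr to show coefficients (betas)
stcrreg ifp tumsize pelnode, compete(failtype == 2)
stcrreg, noshr

* Suggestion 4: model-based CIFs for Z = 0 versus Z = 1 at all times
stcurve, cif at1(pelnode=0) at2(pelnode=1)
```

Because pelnode never changes within a subject, the two -stcurve, cif- curves correspond to the case where the switch takes place at the start.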
>
>
> Steve
>
>
>
>>
>> Nicole
>>
>> On Mon, Sep 9, 2013 at 3:53 PM, Steve Samuels <[email protected]> wrote:
>>>
>>> Nicole, I like your plan in general, but I do have two questions:
>>>
>>> To recapitulate: you have a binary Z(t) with values 0 or 1; people start
>>> at Z = 0; some switch during followup and remain with Z = 1 thereafter.
>>>
>>> My first question is one I asked earlier: On what grounds do you
>>> exclude the possibility of non-proportional hazards for Z?
>>>
>>> Second, how do you plan to show the impact of that covariate on
>>> cumulative incidence?
>>>
>>> Below are a few recommendations. As the question applies to -stcox-, as
>>> well as to -stcrreg-, I omit the sub-distribution terminology. I've left
>>> out consideration of other covariates, but these obviously can be
>>> accounted for.
>>>
>>> 1. Plot smoothed hazard functions for Z(t) = 0, all t, and Z(t) = 1, all
>>> t. The shapes of the curves are of interest in themselves.
>>>
>>> 2. Plot and outfile CIFs that assume constant Z, as above:
>>>
>>> F0(t) = CIF when Z(t) = 0 for all t
>>>
>>> F1(t) = CIF when Z(t) = 1 for all t
>>>
>>> As F1 and F0 are discontinuous, you may find it useful to interpolate to
>>> values between event times.
>>>
>>> In your data, some people never switch, and F0 could be a good
>>> description for this subgroup. You can check this by estimating separate
>>> F0s for the subgroup and its complement.
>>>
>>> Denote the maxima of F0 & F1 as Fmax0 and Fmax1, respectively.
>>>
>>> 3. Select several time points t* and estimate future CIFs for those who
>>> switch at each t* ("switchers") and for those who don't switch
>>> ("stayers"), but who are at risk at t*. Here, I denote these CIFs
>>> FF1(t|t*) and FF0(t|t*).
>>>
>>> For t ≥ t*, FF1 and FF0 are estimated as follows:
>>>
>>> switchers: FF1(t|t*) = (F1(t)-F1(t*))/(Fmax1-F1(t*))
>>>
>>> stayers: FF0(t|t*) = (F0(t)-F0(t*))/(Fmax0-F0(t*))
>>>
>>> FF0 and FF1 are zero at t*. As they start out with the same value,
>>> plotting both on one graph allows one to visually assess the impact of
>>> switching.
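
The future-CIF formulas in step 3 are simple arithmetic on saved CIF values, so they can be sketched on a small made-up table. Everything below is hypothetical: the values of F0 and F1, the variable names, and the switch time t* = 2 are invented for illustration and are not estimates from any model.

```stata
* Made-up step-function CIF values at five event times (not real estimates)
clear
input t F0 F1
1 .02 .05
2 .05 .12
3 .08 .20
4 .10 .26
5 .12 .30
end

* Hypothetical switch time t*
local tstar = 2

* Fmax0 and Fmax1: the maxima of F0 and F1
summarize F0, meanonly
scalar Fmax0 = r(max)
summarize F1, meanonly
scalar Fmax1 = r(max)

* F0(t*) and F1(t*)
summarize F0 if t == `tstar', meanonly
scalar F0star = r(max)
summarize F1 if t == `tstar', meanonly
scalar F1star = r(max)

* Future CIFs for t >= t*, using the formulas quoted in the thread
generate FF0 = (F0 - F0star) / (Fmax0 - F0star) if t >= `tstar'
generate FF1 = (F1 - F1star) / (Fmax1 - F1star) if t >= `tstar'

list t F0 F1 FF0 FF1, noobs

* Overlay both curves as step functions to compare switchers with stayers
twoway (line FF0 t, connect(stairstep)) (line FF1 t, connect(stairstep)) if t >= `tstar'
```

With real output, F0 and F1 would come from the saved CIF estimates rather than from -input-, and the calculation would be repeated for each chosen t*.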
>>>
>>> I end by noting that the future CIF approach works only because you have
>>> the simplest type of time-varying covariate. Perhaps you can think of
>>> CIF comparisons for other types.
>>>
>>> Steve
>>>
>>>
>>>
>>>> On Sep 5, 2013, at 2:48 PM, Nicole Boyle wrote:
>>>>
>>>> Wonderful, Phil, thanks for the explanation! I'm going to go ahead and
>>>> plot both outcomes.
>>>> Thanks so much to Phil, Steve, and Adam... this has been a
>>>> tremennnndously helpful and thought-provoking conversation. I have
>>>> learned so much. I very much appreciate all the time each of you have
>>>> taken to help me with this.
>>>>
>>>> To sum up, here are the following analysis choices I've made per our
>>>> discussion. Feel free to chime in if anything rubs you the wrong way:
>>>>
>>>> -Modeling of hazard ratios will no longer be through the Fine-Gray
>>>> model. Instead, covariate effects on the cause-specific hazard will be
>>>> estimated through the Cox model, where the competing risk is censored.
>>>> The only cause-specific event to be modeled will be the primary
>>>> outcome of interest.
>>>>
>>>> -The CIFs will be plotted in both forms:
>>>> * Cause-specific CIFs for both the primary outcome and competing
>>>> outcome (-stcompet-)
>>>> * Subdistribution CIF for just the primary outcome (-stcrreg-).
>>>> Simply for comparison's sake.
>>>>
>>>> -I'm going to use -stsplit- instead of the -tvc- option to capture the
>>>> time-varying nature of the time-varying risk factor, and then throw
>>>> this risk factor into the model as a simple ["time-invariant"]
>>>> covariate. I've decided to split at failure times, and expand the
>>>> coding of the TVC risk factor to be "on" or "off" for each created
>>>> time slot. Doing so will exploit the Cox model's maximum partial
>>>> likelihood estimator property (briefly explained on page 13:
>>>> http://www.stata.com/manuals13/ststsplit.pdf ).
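
The split-at-failure-times coding described in the last item can be sketched as follows. This is a minimal illustration on made-up data, not the actual analysis code: the variables id, time, infect, and t_switch (the time at which Z turns on, missing for subjects who never switch) are hypothetical. In current Stata, -estat phtest- does the job of the older -stphtest- command.

```stata
* Toy data; every variable name and value here is hypothetical
clear
input id time infect t_switch
1  3 1  .
2  5 1  2
3  6 0  4
4  8 1  .
5  9 1  7
6 11 0  .
7 12 1  3
8 14 0 10
end

* One record per subject; infection is the event, all other exits are censored
stset time, failure(infect == 1) id(id)

* Split every subject's follow-up at each observed failure time
stsplit, at(failures)

* Z for each episode (_t0, _t]: 1 if the switch happened before the episode ends.
* A missing t_switch compares as larger than any number, so never-switchers get 0.
generate byte Z = t_switch < _t

* Z now enters the model like an ordinary covariate
stcox Z

* Schoenfeld-residual test of proportional hazards
estat phtest, detail
```

Defining Z by comparing t_switch with _t evaluates the covariate at the end of each episode, which is where the partial likelihood uses it. Whether a switch that falls exactly on a failure time counts as "on" is a convention to settle for the real data.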
>>>>
>>>> Nicole
>>>>
>>>> On Wed, Sep 4, 2013 at 4:17 PM, Phil Clayton
>>>> <[email protected]> wrote:
>>>>> From memory he used an example of breast cancer.
>>>>>
>>>>> If you graph the CIF of cancer recurrence by age, older patients have a lower incidence of recurrence.
>>>>>
>>>>> That looks good for older people until you graph the CIF of death - older patients have a higher incidence of death. Since death competes with recurrence, this makes the older patients look better on the recurrence CIF, but it's because they're dying before they get a chance to have recurrence. Doesn't look so good for older people any more.
>>>>>
>>>>> You need to look at both outcomes in order to disentangle the competing events and understand what's actually going on. By selectively presenting one outcome you're not telling the whole story.
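
A small numerical sketch of the point above, under strong simplifying assumptions that are not part of the original example: hazards are constant over time, and older and younger patients share the same cause-specific hazard of recurrence and differ only in their hazard of death. All numbers are invented.

```stata
* Hypothetical constant hazards per year; illustrative numbers only
scalar h_rec   = 0.10
scalar h_d_yng = 0.02
scalar h_d_old = 0.20
scalar horizon = 10

* With constant cause-specific hazards h1 and h2,
* CIF_1(t) = h1/(h1 + h2) * (1 - exp(-(h1 + h2)*t))
scalar cif_rec_yng = h_rec / (h_rec + h_d_yng) * (1 - exp(-(h_rec + h_d_yng) * horizon))
scalar cif_rec_old = h_rec / (h_rec + h_d_old) * (1 - exp(-(h_rec + h_d_old) * horizon))
scalar cif_dth_yng = h_d_yng / (h_rec + h_d_yng) * (1 - exp(-(h_rec + h_d_yng) * horizon))
scalar cif_dth_old = h_d_old / (h_rec + h_d_old) * (1 - exp(-(h_rec + h_d_old) * horizon))

display "Recurrence CIF at 10 years, younger: " %6.3f cif_rec_yng
display "Recurrence CIF at 10 years, older:   " %6.3f cif_rec_old
display "Death CIF at 10 years, younger:      " %6.3f cif_dth_yng
display "Death CIF at 10 years, older:        " %6.3f cif_dth_old
```

With these numbers the older group has the lower recurrence CIF (about 0.32 versus 0.58) even though its recurrence hazard is identical, purely because its death CIF is much higher (about 0.63 versus 0.12).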
>>>>>
>>>>> Phil
>>>>>
>>>>> On 05/09/2013, at 6:37 AM, Nicole Boyle <[email protected]> wrote:
>>>>>
>>>>>>> I went to a talk by Jason Fine last year and he gave the following general advice:
>>>>>>> - use a Cox model for each of the competing outcomes (in your case infection & death)
>>>>>>> - use a Fine-Gray model for each of the competing outcomes
>>>>>>> - present all of those results
>>>>>>
>>>>>> Thanks for the advice! What's the utility of presenting model results
>>>>>> for the outcome of death if death is not an outcome of interest in my
>>>>>> study? Feel free to direct me to a paper if you'd like.
>>>>>
>>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/