Thank you Joseph and Kieran. Obviously this was not
the easy question I thought it was. I have spent
several days contemplating the answers and playing
around with my data. Although I find Kieran's
conditional logistic approach appealing, I understand
and agree with Joseph's concerns and objections. Faced
with the need to analyze these data and eventually
submit them for publication, I fear that reviewers may
disagree with whichever method I select. The issue
becomes more complicated when one considers the effect
of additional covariates such as sex on the
intervention.
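As a check on my own reading of Joseph's fictional example, the
figures in his message quoted below can be reproduced in a short
do-file. This is only a sketch: the cell counts are taken from his
description, and the order of the four numbers given to -mcci-
(ban at both times, ban after only, ban before only, ban at neither)
is my assumption about how his -mcc ban1 ban0- call lays out the
table.

```stata
* Switching rates in the fictional example (counts from Joseph's text)
display 2 / 50   // nonintervention: banners who now permit
display 1 / 50   // nonintervention: permitters who now ban
display 4 / 75   // intervention: banners who now permit
display 2 / 25   // intervention: permitters who now ban

* Matched-pair tables; the odds ratio uses only the discordant cells
mcci 48 1 2 49   // nonintervention: discordant cells 1 and 2
mcci 71 2 4 23   // intervention: discordant cells 2 and 4
```

Both matched-pair odds ratios rest on the discordant cells alone
(1 vs. 2 and 2 vs. 4), which is why their ratio comes out the same
in the two groups even though the baseline split differs.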
Regardless of all this, I tremendously appreciate
Joseph's and Kieran's comments and the time they spent
thinking about this problem.
Ricardo.
--- Joseph Coveney <[email protected]> wrote:
>
> Kieran McCaul posted results from a randomized
> parallel-group design study to
> illustrate the use of conditional logistic
> regression. The study randomized
> households to an intervention designed to promote
> banning of smoking in the
> home. Policy in the home was measured before and
> after intervention. Kieran
> invited Ricardo and me to respond with what we think
> of advocating conditional
> logistic regression to assess the efficacy of the
> intervention for before-and-
> after studies based upon the results posted for that
> study.
>
> I don't claim to speak for Ricardo, but his original
> question related to
> imbalances in the baseline rates of the outcome
> between the two parallel
> intervention groups. It appears that Kieran's study
> was successful in its
> randomization (or used stratified randomization and
> didn't lose too many
> households to dropout), because the proportions of
> households banning smoking
> at baseline were nearly identical between the
> intervention groups. With
> essentially identical rates of baseline, there would
> be little or no cause for
> concern about confounding due to it and little
> statistical difference in
> including baseline as a covariate. And, in fact,
> both the conditional logistic
> regression approach and the so-called ANCOVA-like
> multiple logistic regression
> approach give essentially similar results in this
> balanced study. (I think the
> same would have obtained for Ricardo's study had the
> baseline rates of seatbelt
> use been similar between the two intervention
> groups.)
>
> But, let's look at the issue of which approach is
> more suitable when the
> concern is, as it was for Ricardo, to analyze an
> intervention effect _in the
> face of an imbalance in the baseline rates of an
> outcome_.
>
> If Kieran will indulge me one more time to use a
> fictional dataset to
> illustrate a point, let's say that Kieran's
> randomization method did not
> stratify on baseline household smoking policy, and
> suffered an unfortunate
> imbalance due to chance, for instance a 50 : 50
> ratio of households banning
> smoking at baseline in the nonintervention group,
> but a 75 : 25 ratio in the
> intervention group. Let's say that 2 of the 50
> households that previously
> banned smoking in the nonintervention group now
> permit it, a worsening of 4%
> (if your health policy is to ban smoking), and that
> only 1 of the 50 households
> that didn't ban smoking now do so in the
> nonintervention group, a meager
> improvement of 2%. Let's say that 4 of the 75
> households that banned smoking
> at baseline switched and permitted smoking in the
> home after the intervention,
> and 2 of the 25 households that didn't ban smoking
> switched as a result of the
> intervention. The results of the intervention are a
> slightly greater 5.3%
> worsening (compared with 4%) in the formerly banning
> household population, but a
> much greater 8% (compared with 2%) improvement among
> the formerly permissive
> households.
>
> Now, the effects of intervention are no great
> shakes, but I think that it would
> be safe to say that it's not *nothing*, especially
> if you somehow take into
> account the possible confounding effect of the
> chance unfortunate imbalance in
> baseline policy between treatment groups.
>
> But, by the conditional logistic regression
> approach, it *is* nothing--the odds
> ratio for both nonintervention and intervention
> groups is 0.5 (McNemar's test
> uses only the off-diagonal values and ignores the
> diagonal values) so the ratio
> of the two odds ratios is 1.0, and this is what the
> conditional logistic
> regression dutifully reports: the period term is
> 0.5 and the interaction
> term's odds ratio is 1.0 with a Z-statistic of 0.00
> and a p-value of 1.00.
> Granted, the confidence interval encompasses a lot,
> but the point estimate and
> hypothesis test for the interaction term (which is
> ostensibly the effect of
> intervention) just don't give the same take-home
> message as inspection of the
> data. So, my conclusion differs from Kieran's on
> this; I don't think that
> conditional logistic regression is valid to test for
> differences between
> treatment effects (differences between treatment
> differences, which are between-
> subject effects) in parallel-group designs with a
> repeated binary outcome
> measure, especially in the presence of baseline
> differences in the outcome
> measure, which are ignored in the conditional
> logistic model.
>
> In contrast, the ANCOVA-like, baseline-as-covariate
> multiple regression
> approach does provide a separate, and I think
> competent, handling of baseline
> differences and their potential for confounding. In
> the fictitious example,
> this approach shows the pronounced effect of
> baseline smoking policy as
> expected, and it shows that the odds ratio for
> intervention isn't 1.0 given
> baseline differences between intervention groups.
> The saturated model (with
> the interaction term) also helps to put the
> potential for confounding into
> perspective. (The do-file for all of this is below
> for anyone interested.)
>
> It seems that at least some of the discrepancy
> between the two approaches
> reflects Simpson's paradox. This is the same
> underlying phenomenon that
> results in bias in logistic regression coefficients
> (and in nonlinear
> regression, in general) when important covariates
> are left out of the model.
> This is what Frank E. Harrell Jr.'s lecture dealt
> with in the URL given in my
> last posting. And it relates to the
> "noncollapsibility of odds ratios" that
> epidemiologists sometimes refer to.
>
> In fairness to us all (Kieran, Ricardo and me), it
> seems that the matter of
> which approach is better isn't completely settled
> even for *linear* models,
> where this noncollapsibility-of-odds-ratios
> phenomenon and the incidental
> parameters problem don't apply: there is a thread
> ("Repeated measures and
> including time zero response as baseline covariate")
> on sci.stat.consult that
> was started on May 7 of last year by Frank Harrell.
> Professor Harrell wrote a
> well-received book on regression modeling and is now
> chairman of a department
> of biostatistics, yet even he asks, "Has anyone come
> across some practical
> guidance for when to include the first measured
> response (at time zero) as a
> baseline covariate as opposed to the first repeated
> measurement in a
> longitudinal data analysis?"
>
> Joseph Coveney
>
>
> -----------------------------------------------------------------------------
>
> clear
> tempfile tmp
> set obs 100
> generate byte ban0 = _n > _N / 4
> generate byte ban1 = ban0
> replace ban1 = !ban1 in 50/53
> replace ban1 = !ban1 in 1/2
> *
> * Intervention group
> *
> display 4 / 75 // switching by banners
> display 2 / 25 // switching by permitters
> mcc ban1 ban0
> generate byte intervention = 1
> save `tmp'
> clear
> set obs 100
> generate byte ban0 = _n > _N / 2
> generate byte ban1 = ban0
> replace ban1 = !ban1 in 50/52
> *
> * Nonintervention group
> *
>
=== message truncated ===
=====
Ricardo Ovaldia, MS
Statistician
Oklahoma City, OK
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/