Here are my 2 cents:
It is my understanding that p-values are now often
discouraged in "Table 1", at least in part for the
reason you describe.
I think this is partly a confusion of validity with
precision.
What do you mean by your randomization being "solid"?
Randomization *does not* (and can never) guarantee a
completely balanced sample. So if there is convincing
evidence, qualitative or quantitative, that the two
arms are not the same, then it does call into question
the equivalence of the two groups. A perfectly
randomized study could, by chance, produce two very
different groups (although it is unlikely to do so).
That would call the comparability of the groups into
question, but it would not necessarily invalidate the
results. Especially if numerous factors were compared
in Table 1, a difference may have arisen simply by
chance between two groups that are actually equivalent.
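The "difference simply by chance" point is easy to see with a quick simulation. Here is a minimal sketch of my own (in Python rather than Stata; the 100 subjects per arm and 20 independent standard-normal baseline variables are just assumptions for illustration, not anything from the study design being discussed):

```python
import math
import random

random.seed(1)

def two_sample_p(n):
    """p-value of a z-test comparing means of two samples drawn from
    the SAME standard-normal population (variance known to be 1)."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(a) / n - sum(b) / n) / math.sqrt(2.0 / n)
    # two-sided p from the normal CDF, Phi(x) = 0.5*(1 + erf(x/sqrt(2)))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def trial_flags_imbalance(n_per_arm=100, n_vars=20, alpha=0.05):
    # 20 independent baseline variables, both arms from one population:
    # any p < alpha here is a pure chance "imbalance".
    return any(two_sample_p(n_per_arm) < alpha for _ in range(n_vars))

n_sims = 1000
frac = sum(trial_flags_imbalance() for _ in range(n_sims)) / n_sims
analytic = 1 - 0.95 ** 20  # P(at least one of 20 tests "significant")
print(f"simulated: {frac:.2f}  analytic: {analytic:.2f}")  # both ~0.64
```

So under perfect randomization of truly equivalent groups, roughly two Table 1's out of three will show at least one "significant" baseline difference when 20 variables are compared.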
Tim
--- "Christopher W. Ryan" <[email protected]>
wrote:
> Having read the Statalist FAQ, and previous
> correspondence about general
> statistical questions, I hope no one minds . . . .
>
> Among my teaching duties in my medical school and
> family practice
> residency is "critical appraisal of the medical
> literature." I try to
> go over principles of good design and valid
> analysis. A question
> frequently comes up when we discuss randomized
> controlled trials. In
> these articles, there is almost always a "Table 1,"
> that describes the
> baseline demographic and clinical variables of the
> two arms (say,
> placebo and active drug, for example). There are
> usually *a lot* of
> baseline measurements. Each one is usually listed
> with a "P value,"
> indicating whether the placebo and active drug
> subjects differed on that
> measurement.
>
> Then the manuscript goes on to describe the rest of
> the study, and the
> results . . .
>
> If the results show an advantage for the active
> drug, readers (including
> my students and residents) will often go back to
> "Table 1" and say, "Oh
> but look, the samples were not identical. Blah-blah
> was significantly
> higher in the placebo arm to begin with. Therefore
> I can't accept these
> results as valid."
>
> I've never agreed with that. So I want to outline
> my chain of reasoning
> here and see if I've got it straight.
>
> There are two premises in a randomized controlled
> trial with two arms:
>
> 1. The two samples are drawn randomly from the same
> population
> 2. The active drug actually has no effect (the null
> hypothesis)
>
> And then there are the results (R).
>
> If 1 and 2 are both true, we can look at R and
> calculate how likely we
> were to see results that "extreme" or more so.
> That's the P value. If
> P < the conventional 0.05, we say, "Gee, if 1 and 2
> are both true, we
> *might* have seen results R, but only 5% of the time
> or less, and that's
> pretty unlikely. But we *did* see R. Therefore
> either 1 or 2 must be
> untrue. And I'm confident my randomization was
> solid. Therefore 2 must
> be untrue, and the drug really does have an effect."
>
> There is nothing in this chain of reasoning that
> requires the samples to be
> identical/indistinguishable. And for every 20
> baseline variables
> compared, you'd *expect* about 1 of those baseline
> variables to have a P
> of < 0.05. The statistical techniques have
> "built-in" accommodation for
> this. This does not invalidate the conclusions.
>
> It is a difficult concept for my learners to grasp.
> Or maybe I've got
> it wrong?
>
> Thanks.
>
> --Chris Ryan
>
> *
> * For searches and help try:
> *
> http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
=====
[email protected]