As a co-author of the -metan- program, I'd just like to confirm Roger
Harbord's explanation of the Mantel-Haenszel (MH) weights. They are
unintuitive and don't quite reflect a study's "weight" in the usual sense
of the word, but despite this I would use the MH method anyway.
To illustrate the quirk further, consider this fictitious example with
two studies:
  study 1:
    group 1   10/100 events
    group 2   20/100 events   => RR = 0.5
  study 2:
    group 1   20/100 events
    group 2   10/100 events   => RR = 2
Here is what -metan- reports for these data:
         Study       |    RR     [95% Conf. Interval]    % Weight
---------------------+---------------------------------------------------
 1                   |  0.500      0.247      1.014        66.67
 2                   |  2.000      0.987      4.054        33.33
---------------------+---------------------------------------------------
 M-H pooled RR       |  1.000      0.624      1.602       100.00
---------------------+---------------------------------------------------
Heterogeneity chi-squared = 7.39 (d.f. = 1) p = 0.007
I-squared (variation in RR attributable to heterogeneity) = 86.5%
Test of RR=1 : z= 0.00 p = 1.000
The MH method gives the second study half the weight of the first, for the
reasons that Roger stated. The MH pooled RR is a weighted average of the
study RRs on the natural scale, not the log scale, so to arrive at the
correct pooled relative risk of 1 the first RR needs twice as much weight
as the second: (2/3 x 0.5) + (1/3 x 2) = 1. The MH weights should therefore
be interpreted with caution, which is unfortunate, since a major reason for
presenting a forest graph is to show how much each study contributes.
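For anyone who wants to check the arithmetic by hand, here is a minimal
Python sketch of the standard Mantel-Haenszel formulas for the relative
risk, applied to the two fictitious studies above. It is not -metan- code,
and the function and variable names are purely illustrative.

```python
# Each study: (events_1, n_1, events_2, n_2); group 1 is the numerator of the RR.
studies = [(10, 100, 20, 100), (20, 100, 10, 100)]

def mh_weights(studies):
    """Mantel-Haenszel weight for the RR: (group 2 events) * n_1 / N."""
    return [c * n1 / (n1 + n2) for a, n1, c, n2 in studies]

def mh_pooled_rr(studies):
    """MH pooled RR = sum(a * n_2 / N) / sum(c * n_1 / N)."""
    num = sum(a * n2 / (n1 + n2) for a, n1, c, n2 in studies)
    den = sum(c * n1 / (n1 + n2) for a, n1, c, n2 in studies)
    return num / den

weights = mh_weights(studies)
pct = [100 * w / sum(weights) for w in weights]

# The same pooled value, written as a weighted average of the study RRs
# on the natural (not log) scale.
rrs = [(a / n1) / (c / n2) for a, n1, c, n2 in studies]
weighted_mean = sum(w * rr for w, rr in zip(weights, rrs)) / sum(weights)

print([round(p, 2) for p in pct])      # [66.67, 33.33]
print(round(mh_pooled_rr(studies), 3)) # 1.0
print(round(weighted_mean, 3))         # 1.0
```

The weight depends on the number of events in group 2, which is why the
study with RR=0.5 (20 events in group 2) gets twice the weight of the study
with RR=2 (10 events in group 2), even though the two are mirror images.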
Inverse-variance weights are far more intuitive. Both of the above
studies would get identical weights, and "weight" here does mean the same
as "contribution". But the estimated variances rely on large-sample
theory, and consequently the weights (and therefore the pooled estimate)
are inaccurate for rare events (this is based on as yet unpublished
simulation work which we really should get finished). Looking at Roger's
data, I suspect he is better off avoiding the inverse-variance method and
sticking with the MH weights, despite the above issue over interpretation.
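The equal weighting under the inverse-variance method can be checked the
same way. The sketch below assumes the usual large-sample variance of the
log RR, 1/a - 1/n_1 + 1/c - 1/n_2, with no continuity correction, so it
would fail on a study with zero events. Again it is only an illustration
with made-up names, not what -metan- runs internally.

```python
import math

# Each study: (events_1, n_1, events_2, n_2); group 1 is the numerator of the RR.
studies = [(10, 100, 20, 100), (20, 100, 10, 100)]

def log_rr_and_var(a, n1, c, n2):
    """Log relative risk and its large-sample variance (assumes a, c > 0)."""
    log_rr = math.log((a / n1) / (c / n2))
    var = 1 / a - 1 / n1 + 1 / c - 1 / n2
    return log_rr, var

stats = [log_rr_and_var(*s) for s in studies]

# Inverse-variance weights, pooling on the log scale.
iv_weights = [1 / var for _, var in stats]
iv_pct = [100 * w / sum(iv_weights) for w in iv_weights]
pooled_log_rr = sum(w * lr for w, (lr, _) in zip(iv_weights, stats)) / sum(iv_weights)
iv_pooled_rr = math.exp(pooled_log_rr)

print([round(p, 2) for p in iv_pct])  # [50.0, 50.0]
print(round(iv_pooled_rr, 3))         # 1.0
```

Both studies have a variance of 0.13 for the log RR, so they get 50% each,
and the pooled RR is still 1.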
As a final note, if the study effect sizes are relatively homogeneous then
the discrepancy between "contribution" and "MH weight" should be fairly
small. If on the other hand the effect sizes vary widely, the bigger
question is perhaps "should I be pooling these studies using a fixed effect
model at all?". But neither point seems to matter here, as imprecision of
the individual studies appears to be the reason for the variation in the
estimated RRs.
And on an unrelated note, we are working on an updated version of -metan-,
still for the moment in version 7 graphics but with a few new features.
I'll post a note with more details to the list when it's ready later this
week.