Gary Anderson posted about some difficulty he is having with
parameter estimates using population-averaged generalized
estimating equations (PA-GEE).
-------------------begin excerpted post------------------------------------
Why is it that a coefficient can differ by up to 4 standard errors
when an xtgee with corr(ind) is fitted compared to an xtgee with
corr(exch) ?
I have about 20 clinics and 40 patients per clinic, with the one
covariate varying at the patient level, and the id variable is the clinic.
The intraclass correlation coefficient is about 0.4. My understanding
is that the expectation is that the coefficient should not differ when
a different correlation structure is fitted. The coefficient goes toward
the null when the exchangeable structure is fitted.
--------------------end excerpted post------------------------------------
My understanding is similar to Gary's in that, when using marginal
models such as PA-GEE, the parameter estimates for the means
(i.e., regression coefficients) ought to be resilient to the working
correlation structure; however, they do require that the model is
otherwise properly specified, viz., distribution family, link function
and inclusion of all important covariates. Whenever you do
encounter dramatic changes in such parameter estimates with
different working correlation structures, I'm told that it results from
improperly specifying the link function or distribution family, or from
omitting important covariates from the model. In Gary's
case, he has a single covariate. It is possible that the sensitivity
that he observes results from omitted covariates. Unfortunately, he
has only twenty clusters and, with PA-GEE, this would argue
against adding any more covariates.
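
As a rough way to see how an omitted clinic-level covariate could
produce this kind of sensitivity, here is a minimal sketch on simulated
data. Everything in it is an assumption for illustration, not Gary's
actual setup: the variable names (clinic, u, x, y), the Gaussian family
with identity link, and the effect sizes. The unobserved clinic-level
term u induces the within-clinic correlation (an ICC of about 0.4
given x) and is also correlated with the covariate x.

```stata
* Hypothetical simulated data: 20 clinics, 40 patients per clinic
clear
set seed 12345
set obs 20
generate clinic = _n
generate u = rnormal()                  // omitted clinic-level term, variance 1
expand 40                               // 40 patients per clinic
generate x = 0.8*u + rnormal()          // patient-level covariate correlated with u
generate y = 1 + 0.5*x + u + rnormal(0, sqrt(1.5))  // residual variance 1.5

xtset clinic

* Same mean model, two working correlation structures
xtgee y x, family(gaussian) link(identity) corr(independent) vce(robust)
estimates store ind
xtgee y x, family(gaussian) link(identity) corr(exchangeable) vce(robust)
estimates store exch

* Coefficients and standard errors side by side
estimates table ind exch, b se
```

In a setup like this, the two coefficients for x would be expected to
differ, because the independence fit pools between-clinic and
within-clinic information, whereas the exchangeable fit gives more
weight to the within-clinic information. For real data, family() and
link() would be replaced with whatever suits the outcome.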
At the risk of sounding like a broken record on this: Gary might
want to look into -gllamm- as an alternative to -xtgee-. The
exchangeable correlation structure that is implied by the study
design (patients within clinic) is compatible with generalized linear
mixed-effects models. The relatively small number of clusters would
not be such an issue with -gllamm- as it is with PA-GEE. Convergence
ought to be reasonably rapid with the limited number of clusters (20)
and the few parameters that Gary has to estimate in the model. And
-gllamm- has a wealth of distribution families and link functions, as
does -xtgee-.
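
A minimal sketch of what such a -gllamm- fit might look like, on
simulated data. The variable names (clinic, u, x, y), the Gaussian
family and the identity link are assumptions for illustration only.
-gllamm- is user-written and is assumed to be installed already (for
example, with -ssc install gllamm-).

```stata
* Hypothetical simulated data: 20 clinics, 40 patients per clinic
clear
set seed 12345
set obs 20
generate clinic = _n
generate u = rnormal()                  // clinic-level random intercept
expand 40                               // 40 patients per clinic
generate x = rnormal()                  // patient-level covariate
generate y = 1 + 0.5*x + u + rnormal(0, sqrt(1.5))

* Random intercept for clinic, fitted with adaptive quadrature
gllamm y x, i(clinic) family(gaussian) link(identity) adapt
```

The family() and link() options play the same role here as in -xtgee-.
With a link such as the logit, the -gllamm- coefficients are
conditional on clinic rather than population-averaged, so they would
not be expected to match the -xtgee- estimates exactly.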
Joseph Coveney