Alex Gamma wrote:
I have longitudinal data from an age cohort of 591 people at 6 time-points
over 20 years. I have two psychiatric diagnoses A and B, and I want to look
at the question of whether prior occurrence of A predicts current A or B and
vice versa (i.e. whether prior B predicts current B or A). So I constructed
additional variables A_prior and B_prior coding for any prior occurrence of
diagnosis A or B, and I ran the models
xtgee A A_prior B B_prior some_covariates, i(id) fam(bin) link(logit) corr(exch) robust
xtgee B B_prior A A_prior some_covariates, i(id) fam(bin) link(logit) corr(exch) robust
Two questions:
1) A sociologist colleague doubted that it is valid to include this kind of
"any prior occurrence" variable, or indeed any lagged variable, in a GEE
model. I don't see any reason why not, but to be sure I want to check
with the experts before publishing these models: is he right?
2) If the A_prior and B_prior variables are admissible, is it correct to use
an exchangeable correlation structure or should I use an independent
structure? (this is what J.W.R. Twisk seems to recommend in "Applied
Longitudinal Data Analysis for Epidemiology", 2003, Cambridge University
Press).
--------------------------------------------------------------------------------
Take a look at Chapter 12 (Time-dependent covariates) in P. J. Diggle, P. J.
Heagerty, K-Y. Liang and S. L. Zeger, _Analysis of Longitudinal Data_ Second
Edition. (Oxford: Oxford Univ. Press, 2002), pp. 245-81.
It describes the use of lagged variables in models fit by GEE, for example
to deal with covariate endogeneity.
According to the same source, at least for cross-sectional analysis, you
should use an independence working correlation, unless you can satisfy the
"full covariate conditional mean assumption," which the authors describe.
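
As a rough sketch only, the "any prior occurrence" variables and the
independence fit might look like the following. The toy panel, the seed and the
covariate x (standing in for some_covariates) are made up for illustration, and
a missing diagnosis at an earlier wave is assumed to count as "not recorded";
none of this comes from the original data.

```stata
* Toy panel standing in for the real data (all names and values hypothetical)
clear
set seed 12345
set obs 591
generate id = _n
expand 6
bysort id: generate wave = _n
generate byte A = runiform() < 0.2
generate byte B = runiform() < 0.2
generate x = rnormal()          // stand-in for some_covariates

xtset id wave

* 1 if the diagnosis was recorded at any earlier wave, 0 otherwise
* (always 0 at the first wave; earlier missing values count as 0)
bysort id (wave): generate byte A_prior = sum(A[_n-1] == 1) > 0
bysort id (wave): generate byte B_prior = sum(B[_n-1] == 1) > 0

* Same models as in the question, with an independence working correlation
xtgee A A_prior B B_prior x, family(binomial) link(logit) corr(independent) vce(robust)
xtgee B B_prior A A_prior x, family(binomial) link(logit) corr(independent) vce(robust)
```

With corr(independent) the point estimates match those of a pooled logit, and
the robust variance estimator still allows for the within-person dependence.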
Joseph Coveney