Marie Olson posted a general question about the use of -xtgee- and
-xtlogit-.
---------begin excerpted post---------
Can anyone tell me why I might choose xtgee over xtlogit when
analyzing a dataset of 30 countries for a period of 41 years? My
variables are the following:
dep v: war/no war
indeps: policy/no policy, regime score, gdp change, peace years
The hypothesis is that certain types of policies are associated with
the occurrence of violence, controlling for regime score, gdp
fluctuations, and the number of peace years prior to the current
year.
---------end excerpted post---------
Gary King (http://gking.harvard.edu/stats.shtml) has looked into
statistical models for relatively rare events in similar kinds of
surveys, and has made ReLogit available for Stata.
As to Marie's question, I'm no expert, but -xtgee- (or -xtlogit, pa-)
would seem to have certain advantages, such as the ability to use
autoregressive working correlation structures. With 41 years of
data, such correlations might be apparent. Parameter estimates
(regression coefficients) and their standard errors from the
population-averaged generalized estimating equation (PA-GEE)
approach, as implemented in -xtgee, robust-, are relatively resilient
to misspecification of the working correlation structure.
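
As a rough sketch of how one might check that resilience in practice,
the do-file fragment below fits the same population-averaged logit
under two working correlation structures and lists the results side
by side. The variable names (country, year, war, policy, regime,
gdpchg, peaceyrs) are hypothetical and do not come from Marie's post.
It assumes the country-year data are already in memory, with no gaps
in the years, and a reasonably recent Stata release (older releases
use -iis-/-tis- and the -robust- option instead of -xtset- and
-vce(robust)-).

```stata
* Hypothetical variable names; substitute the actual ones.
* Assumes one row per country-year, war coded 0/1.
xtset country year

* Population-averaged logit, AR(1) working correlation, robust SEs
xtgee war policy regime gdpchg peaceyrs, ///
    family(binomial) link(logit) corr(ar 1) vce(robust)
estimates store gee_ar1

* Same model, exchangeable working correlation
xtgee war policy regime gdpchg peaceyrs, ///
    family(binomial) link(logit) corr(exchangeable) vce(robust)
estimates store gee_exc

* Coefficients and standard errors side by side
estimates table gee_ar1 gee_exc, b(%9.4f) se(%9.4f)
```

If the two columns differ materially, that would suggest the choice
of working correlation matters more than hoped with only 30 panels.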
On the other hand, PA-GEE works best with lots of panels; with
only 30 nations, the number of panels might not be sufficient to give
much confidence in hypothesis-testing results from -xtgee-. In
addition, I was led to believe that PA-GEE isn't so great with long
panels--somewhere I had read that panel lengths of around six or
fewer are ideal. A good source for advice is the user's manual.
If she believes that there is important autocorrelation, it might be
worth grouping the 41-year span into epochs of similar length and
using -xtgls- on the proportion (or arcsine-transformed proportion)
of time at war in each successive epoch. With the time span blocked
into epochs, -xtgee- and -xtlogit- would have fewer intervals to cope
with, and might be better behaved.
For these epochs, alternative approaches could be considered,
such as Poisson regression (or zero-inflated Poisson regression,
hopefully) for the number of wars or years at war in successive
epochs.
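
The epoch idea can be sketched as the do-file fragment below. It is
only an illustration under stated assumptions: the variable names
(country, year, war, policy, regime, gdpchg) are hypothetical, the
five-year block length is arbitrary, and the predictors are simply
averaged within each epoch, so a 0/1 policy indicator becomes the
proportion of years with the policy in force. Since 41 years do not
divide evenly into five-year blocks, the last epoch here covers a
single year.

```stata
* Hypothetical variable names; assumes one row per country-year,
* war coded 0/1. Note that -collapse- replaces the data in memory.
summarize year, meanonly
generate epoch = floor((year - r(min)) / 5) + 1   // 5-year blocks

* One row per country-epoch: proportion of years at war, epoch means
* of the predictors, count of war years, and years in the epoch
collapse (mean) pwar=war policy regime gdpchg ///
         (sum) waryrs=war (count) nyrs=war, by(country epoch)

* Arcsine square-root transform of the proportion of time at war
generate aswar = asin(sqrt(pwar))

xtset country epoch

* GLS on the transformed proportion with AR(1) errors within country
xtgls aswar policy regime gdpchg, panels(heteroskedastic) corr(ar1)

* Alternative: population-averaged Poisson for years at war per
* epoch, with epoch length as exposure
xtpoisson waryrs policy regime gdpchg, pa exposure(nyrs) vce(robust)
```

The -exposure()- option is one way to account for the short final
epoch in the Poisson model; for -xtgls- one might prefer to drop it.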
I have a concern about using number of years at peace as a
predictor in Marie's statistical model of her data. It would seem that
such a predictor and the response variable would be confounded --
years at peace is implied in a war/no-war dependent variable in a
longitudinal survey. Last month, Wiji Arulampalam posted to the
list about having difficulty getting proper convergence with
-xtprobit-, and she wondered whether it might have to do with the
presence of a time-lagged variable in the list of predictors in her
case. It seems that an analogous situation arises in Marie's case.
Perhaps the use of an autocorrelation structure will obviate the
need for years at peace as a predictor.
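
To make the concern concrete, here is a minimal sketch of how a
peace-years counter is typically built from the war indicator itself.
The variable names (country, year, war, peace) are hypothetical, and
I am assuming that "peace years" means the number of consecutive
years without war immediately before the current year, starting from
zero at the beginning of each country's series. Marie's variable may
be defined differently.

```stata
* Hypothetical variable names; assumes war is coded 0/1 with no
* missing values and no gaps in year within country.
sort country year

* First year of each panel: prior history unknown, so start at zero
by country: generate peace = 0 if _n == 1

* Later years: reset to zero after a war year, else add one.
* -replace- works through observations in order, so peace[_n-1]
* already holds the updated value for the previous year.
by country: replace peace = cond(war[_n-1] == 1, 0, peace[_n-1] + 1) if _n > 1

* The counter is always zero in the year following a war year
xtset country year
assert peace == 0 if L.war == 1
```

Built this way, the counter is a deterministic function of the lagged
values of the dependent variable, which is the sense in which the two
would seem to be tied together.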
Joseph Coveney