Generalized Linear Latent and Mixed Models
Speakers:
Sophia Rabe-Hesketh (Institute of Psychiatry), Andrew Pickles
(University of Manchester), and Colin Taylor (Addiction Research Unit)
We describe a Stata program called gllamm that can fit a large number
of generalised linear latent and mixed
models. These models are extensions of the random intercept models
that may be estimated in Stata 6 using xtreg, xtlogit,
xtpois etc. for cross-sectional time-series or other clustered data.
All these models include a random intercept for clusters in the linear
predictor. If there is a single explanatory variable x, the linear
predictor is given by
$$\eta_{ij} = \beta_0 + \beta_1 x_{ij} + u_i$$
where the index $ij$ refers to the $j$th “level 1”
unit clustered within the $i$th “level 2” unit, e.g.
observation times within subjects or pupils within schools, and the random
effect $u_i$ is usually assumed to have a normal distribution
with mean zero.
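As a small illustration of the linear predictor above, the following Python sketch computes it for every observation, with one shared intercept shift per cluster. The function name and all numerical values are hypothetical and chosen only for illustration; nothing here is part of gllamm.

```python
def linear_predictor(beta0, beta1, x, u):
    """Return eta[i][j] = beta0 + beta1 * x[i][j] + u[i].

    x is a list of clusters, each a list of covariate values;
    u holds one random intercept per cluster.
    """
    return [[beta0 + beta1 * x_ij + u_i for x_ij in x_i]
            for x_i, u_i in zip(x, u)]


# Hypothetical values: two clusters with three observations each.
x = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
u = [0.5, -0.5]  # one value of the random intercept per cluster
eta = linear_predictor(beta0=1.0, beta1=2.0, x=x, u=u)
```

All observations in a cluster are shifted by the same amount, which is what induces dependence between responses from the same cluster.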
The program gllamm allows any combination of five basic extensions to
the random intercept model: (1) discrete random effects distributions, (2)
multi-level models, (3) random coefficients, (4) factor loadings, and (5)
mixed responses.
Like the xt programs, gllamm requires the data to be
in “long” form, with all responses stacked into a single variable
and the cluster index (or indices) stored in separate variable(s).
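The “long” layout can be sketched in Python as follows. The variable names (id, y1 to y3, occasion, y) and the data values are hypothetical; in Stata itself this kind of restructuring would normally be done with the reshape command rather than by hand.

```python
def wide_to_long(wide, response_names):
    """Stack the responses into one column, keeping the cluster index alongside."""
    long_rows = []
    for row in wide:
        for occasion, name in enumerate(response_names, start=1):
            long_rows.append(
                {"id": row["id"], "occasion": occasion, "y": row[name]}
            )
    return long_rows


# Hypothetical "wide" data: one row per subject, one column per response.
wide = [
    {"id": 1, "y1": 0, "y2": 1, "y3": 1},
    {"id": 2, "y1": 0, "y2": 0, "y3": 1},
]
long_data = wide_to_long(wide, ["y1", "y2", "y3"])
```

Each subject now contributes three rows, and the id column plays the role of the cluster index.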
The program uses Stata’s ml commands with method
deriv0 to maximise the likelihood, which is evaluated by numerical
integration. The five extensions to the random intercept model are described
below, using the examples that are also used in the talk.
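To make “evaluated by numerical integration” concrete, here is a minimal Python sketch of the marginal likelihood of one cluster under a random intercept model. It rests on several assumptions that are not taken from the abstract: a dichotomous response with a logit link, a normal random intercept with standard deviation sigma, and a plain midpoint rule standing in for whatever integration rule gllamm actually uses. All function and argument names are hypothetical.

```python
import math


def cluster_likelihood(y, x, beta0, beta1, sigma, n_points=2001, width=8.0):
    """Marginal likelihood of one cluster under a random-intercept logit model.

    Integrates prod_j P(y_j | u) * Normal(u; 0, sigma^2) over u using a
    midpoint rule on [-width * sigma, width * sigma] (an assumed stand-in
    for the program's own integration rule).
    """
    lo, hi = -width * sigma, width * sigma
    step = (hi - lo) / n_points
    total = 0.0
    for k in range(n_points):
        u = lo + (k + 0.5) * step
        density = math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        conditional = 1.0
        for y_j, x_j in zip(y, x):
            p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x_j + u)))
            conditional *= p if y_j == 1 else 1.0 - p
        total += conditional * density * step
    return total


# Hypothetical data: (responses, covariate values) for two clusters.
clusters = [([1, 1, 0], [0.0, 1.0, 2.0]), ([0, 0, 1], [0.0, 1.0, 2.0])]
log_lik = sum(
    math.log(cluster_likelihood(y, x, 0.0, 0.0, 1.0)) for y, x in clusters
)
```

The log-likelihood is the sum of the log cluster contributions. An optimiser that needs only function values could then be applied to it, with derivatives obtained numerically.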
- Discrete random effects distributions: We may wish to assume that
there are several latent classes or groups of subjects where each group is
homogeneous in its random effect.
- Multi-level models: The level 2 units may be clustered within level
3 units. For example, there may be multiple observations (k) per person
(j) clustered in families (i). The linear predictor now includes
two random effects, one for families (level 3) and one for subjects (level
2):
$$\eta_{ijk} = \beta_0 + \beta_1 x_{ijk} + u_i^{(3)} + u_j^{(2)}.$$
- Random coefficients: The coefficient of an explanatory variable may
differ between level 2 units. For example, the effect of a pupil's (j)
maths results in year 3 on maths results in year 5 may differ between schools
(i). The linear predictor now has a random intercept
$u_i^{(0)}$ and a random slope
$u_i^{(1)}$, where the two random effects may be
correlated,
$$\eta_{ij} = \beta_0 + u_i^{(0)} + (\beta_1 + u_i^{(1)})\,\mathrm{math3}_{ij}.$$
- Factor loadings: Several variables may load on a latent variable.
For example, on a psychometric test, the results on the items (j)
for each subject (i) could be modelled as a two-parameter item response
model (a Rasch model extended with item-specific loadings)
$$\eta_{ij} = \beta_j + u_i \lambda_j.$$
Here the random effect $u_i$ is a measure of the subject’s
ability, $-\beta_j$ is a measure of the
difficulty of item $j$ and $\lambda_j$ is a “factor loading” representing
the effect of the subject’s ability on their performance on item $j$. A
separate loading is estimated for each item.
- Mixed responses: If the items or variables are of different types,
e.g. continuous and dichotomous, then different generalised linear models
(families and links) need to be specified for different “observations”. An
example of this is logistic regression with a continuous explanatory or
“exposure” variable that is subject to measurement error. We need to model
the measured exposure and dichotomous outcome simultaneously using
$$\mu_{ij} = \beta_0 + u_i$$
and
respectively.
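To illustrate the first extension in the list above, the following Python sketch evaluates the likelihood contribution of one cluster when the random effect takes only a finite number of values, one per latent class. The integral over a continuous distribution is then replaced by a weighted sum over classes. The logit link for a dichotomous response is an assumption made for this sketch, and all names and numerical values are hypothetical.

```python
import math


def latent_class_likelihood(y, x, beta0, beta1, locations, probabilities):
    """Marginal likelihood of one cluster when u takes finitely many values.

    locations[c] is the random-effect value in class c and probabilities[c]
    is the share of clusters in that class (the shares must sum to 1).
    """
    total = 0.0
    for u_c, pi_c in zip(locations, probabilities):
        conditional = 1.0
        for y_j, x_j in zip(y, x):
            p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x_j + u_c)))
            conditional *= p if y_j == 1 else 1.0 - p
        total += pi_c * conditional
    return total


# Hypothetical two-class example: a low-propensity and a high-propensity class.
lik = latent_class_likelihood(
    y=[1, 1, 0], x=[0.0, 0.0, 0.0], beta0=0.0, beta1=0.0,
    locations=[-1.0, 1.0], probabilities=[0.5, 0.5],
)
```

Within a class every cluster has the same random-effect value, which is the sense in which each group is homogeneous in its random effect.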