Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: RE: multilevel logistic and reml


From   <[email protected]>
To   <[email protected]>
Subject   st: RE: multilevel logistic and reml
Date   Thu, 14 Nov 2013 10:06:31 +0000

Garry Anderson <[email protected]>:

Thanks for the additional information. (I don't recall seeing this material in your previous message, though I've no idea how it got clipped.)

I'm getting out of my knowledge zone on this topic. But, ploughing on ...

Like you, I'd be worried that different estimators are giving very different results. Unlike you, I'd not necessarily be wanting to opt for one (REML) over the others at this stage. 

To me, the 'adaptive quadrature failed to converge' messages are warning signs of some problem with your model. I don't know exactly what it is, but I suspect that there are some very sparse cells in a cross-tabulation of the categorical predictor "var3levels" against your outcome variable, leading to (almost) "perfect prediction" problems. Have you checked this?
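A minimal sketch of that check, using the variable names from the thread (y01, var3levels, id). The helper variable n_obs is hypothetical and introduced here only for illustration:

```stata
* Cross-tabulate the lagged-outcome category against the outcome.
* Cells with very few observations would point to (near) perfect prediction.
tabulate var3levels y01, row

* Repeat for ids with more than one observation, since single-observation
* ids carry no within-id information (n_obs is a hypothetical helper).
bysort id: generate n_obs = _N
tabulate var3levels y01 if n_obs > 1, row
```

If any cell in either table is empty or nearly so, that would be one candidate explanation for the convergence messages, though not the only possible one.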

To me, an odds ratio of near 8 is very large. Is it actually credible?   It may be that the "workarounds" in the other estimators are simply allowing you to derive an estimate.

Moreover, now you mention that your model is a sort of transition model, with "var3levels" summarising the value of the lagged outcome, I'd raise even more questions:

* It's unclear to me what "previous outcome doesn't exist" means. Is the observation missing at the previous time point? Is this missingness at random? Can you simply pool people who have two observations on the outcome with those who have only one (the second period)?
* If you have a model with a binary depvar, a lagged depvar, and also want to introduce an individual-specific random effect, then the "initial conditions" problem immediately comes to mind, i.e. there is a correlation between the lagged depvar and that random effect ==> biased estimators unless corrected. James Heckman wrote about this in the early 1980s; Jeff Wooldridge (J Applied Econometrics 2005) is a recent much-cited contribution providing a now widely-used method for addressing the initial conditions problem, which can be implemented using standard random effects software. Search for "dynamic random effects probit models". (The same lessons apply to analogous binary logit models.)
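As a rough illustration of the Wooldridge-style approach mentioned above, the following sketch adds each id's first observed outcome as an extra regressor in a random-effects probit. It assumes a time index t (consecutive integers within id), which is not named in the thread; y_lag and y_init are hypothetical names. This is the bare-bones version with no other covariates, not a full implementation of the published method:

```stata
* Assumes a hypothetical time index t, consecutive integers within id.
xtset id t

* Lagged outcome; missing for each id's first observation,
* so those rows drop out of the estimation sample.
generate byte y_lag = L.y01

* Initial condition: each id's outcome at its first observed period.
bysort id (t): generate byte y_init = y01[1]

* Random-effects probit with the lagged and the initial outcome.
* With exogenous covariates, their within-id means would be added too.
xtprobit y01 y_lag y_init, re intpoints(25)
```

Note that ids with a single observation contribute nothing here, which matters for a dataset where most groups have only one observation.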


Stephen

------------------------------

Date: Wed, 13 Nov 2013 09:24:56 +0000
From: Garry Anderson <[email protected]>
Subject: st: RE: multilevel logistic and reml

Thank you Stephen for your suggestions and presentation.


I was wishing to explore REML because it is used by other software (Genstat) and because it seems to give more reasonable parameter estimates than variations of adaptive quadrature.

Noh M and Lee Y (2007) REML estimation for binary data in GLMMs. Journal of Multivariate Analysis 98: 896-915

Lee W and Lee Y (2012) Modifications of REML algorithm for HGLMs. Stat Comput 22: 959-966

My following example shows how the odds ratio of 0.85 in Stata is very different to other software, with an odds ratio of about 7. Can adding a random effect cause such a change to a fixed parameter?

The following seems to have been clipped from the previous reply.

My dataset of 4020 observations provides the following odds ratios (OR) for a parameter

Stata (unilevel)  -logit-                OR = 7.96
Stata             -melogit, intp(25)-    OR = 0.85
Genstat           -glmm-                 OR = 6.48
SPSS              -genlinmixed-          OR = 6.48

-melogit y01 i.var3levels || id:, intp(25)-


where
var3level = 0 if the previous observation within id did not have the outcome
var3level = 1 if the previous observation within id did not exist
var3level = 2 if the previous observation within id had the outcome
This is like a transitional model.
The odds ratio estimate of 0.85 when using -melogit- is very low and is my main cause for concern.

The intmethod option of mcaghermite, in combination with intp(25), also gives an odds ratio of 0.85.
Using intp(7) reports 'adaptive quadrature failed to converge' after each iteration from the 7th to the 44th, but then provides an OR = 0.00023.
Using intp(9), the message 'adaptive quadrature failed to converge' continues for at least 195 iterations without producing parameter estimates.

The proportions of the outcome, y01, for the 3 levels of the categorical variable are
Category 0   129 / 899 (14%)
Category 1   657 / 2813 (23%)
Category 2   176 / 308 (57%)

Category 0 is the proportion with y01 = 1 at the current observation, given the previous observation within id was 0 for y01.
Category 1 is the proportion with y01 = 1 at the current observation, given there was no previous observation within id.
Category 2 is the proportion with y01 = 1 at the current observation, given the previous observation within id was 1 for y01.

The above four odds ratios are for category level 2 compared with level 0.
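As a quick arithmetic check, the crude odds ratio for category 2 versus category 0 can be recomputed directly from the counts in the table above. The scalar names are chosen here for illustration only:

```stata
* Counts from the table above: events / total by category.
scalar odds2 = 176 / (308 - 176)    // category 2: 176 events, 132 non-events
scalar odds0 = 129 / (899 - 129)    // category 0: 129 events, 770 non-events

* Crude (unadjusted) odds ratio, category 2 versus category 0.
scalar or_2v0 = odds2 / odds0
display "Crude OR, category 2 vs 0: " %5.2f or_2v0
```

This reproduces the 7.96 reported by the single-level -logit-, as expected for a model whose only predictor is this categorical variable.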

There are 2813 groups (clusters) and there are 1959 groups (70%) with a single observation. The number of observations per group varies from 1 to 8, with a mean of 1.4.

-xtlogit y01 i.var3level, i(id) intp(25)- reports an OR of 0.85 and rho = 0.65.
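One way to probe whether the quadrature approximation itself is unstable is Stata's -quadchk-, which refits a random-effects xt model with fewer and more integration points and reports how much the estimates move. A minimal sketch, assuming it is run on the same data and model as above:

```stata
* Refit the random-effects logit, reporting odds ratios.
xtlogit y01 i.var3level, i(id) intpoints(25) or

* Refit with fewer and more quadrature points and compare the estimates;
* nooutput suppresses the iteration logs of the refitted models.
quadchk, nooutput
```

Large relative differences across the refits (the usual rule of thumb is anything around 1% or more) would suggest that the quadrature-based estimates should not be trusted for this model.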

Schall R (1991) Estimation in generalized linear models with random effects. Biometrika 78: 719 - 727

Kind regards,
Garry Anderson
[email protected]



