From: Daniel Waxman <dan@amplecat.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: -gllamm- vs -meglm-
Date: Wed, 3 Jul 2013 12:03:08 -0700
Joseph Coveney wrote:

Just for clarification, is PROC GLIMMIX fast and light on gigabyte-sized datasets even when it's using seven-abscissa adaptive Gauss-Hermite quadrature as its estimation method? According to its documentation, "The default estimation technique in generalized linear mixed models is residual pseudo-likelihood with a subject-specific expansion (METHOD=RSPL)."

------------------------------------------------------------------

Joseph and Tim, thanks for your replies.

I can't speak to GLIMMIX's performance using that particular estimation method; the method that I've been using is called NRRIDGE (Newton-Raphson with ridging). To give an example, I just ran a model with 186 variables, a random intercept with 5,269 groups, and 270,684 observations (a 1% sample), using 1.3 seconds of CPU time! So far I haven't been able to get this model to run at all in Stata, even using the numerical-integration options.

For me, it's all about the destination, not the journey: I couldn't care less what sort of estimation technique is used as long as the results are correct. If two methods produce correct results and one takes minutes while the other takes hours or fails to converge at all, then I'll take the first one. Of course, the validity of the results may be the rub. Does anybody know of a good reason to be wary of the NRRIDGE algorithm?

I've been a long-time Stata fan; believe me, I'd love never to have to use anything else. But data seem to be getting bigger faster than memory is getting cheaper, so the jury still seems to be out on whether that will be possible.

Dan
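P.S. For concreteness, here is a minimal sketch of the kind of Stata calls under discussion, assuming a binary outcome y, covariates x1-x186, and a grouping variable id. The variable names and the logit link are illustrative placeholders only; the original model's specifics aren't given above.

    * -meglm- (Stata 13+): random-intercept logistic model,
    * fit with seven-point adaptive Gauss-Hermite quadrature
    meglm y x1-x186 || id:, family(binomial) link(logit) intpoints(7)

    * the older user-written -gllamm- equivalent, also with
    * adaptive quadrature and seven integration points
    gllamm y x1-x186, i(id) family(binomial) link(logit) adapt nip(7)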