From:    Stas Kolenikov <skolenik@gmail.com>
To:      statalist@hsphsun2.harvard.edu
Subject: Re: st: problem with GLLAMM and a bernoulli mixed model
Date:    Wed, 9 Mar 2011 17:47:10 -0500
-gllamm- IS a maximum likelihood estimator, so I am not sure why you are making the distinction. As I said, you may have problems with empirical identification in your particular data set, but other than that, I cannot think of any other obvious explanation. How many data points do you have per id?

On Wed, Mar 9, 2011 at 5:24 PM, David Pacheco <pacheco.david@gmail.com> wrote:
> Stas,
>
> I have tried to fit another version of the model, with no factor loading and no constraint on the variance of the normal latent factor, which I understand is equivalent to the model I described previously (the factor loading equals the std of the latent factor). In this case something similar happens: the variance of the latent factor is reasonable, but it depends a lot on the number of integration points used. The same happens when I change the link function from probit to logit.
>
> On the other hand, a friend has fitted the same model and data successfully using Stata's traditional maximum likelihood method, and his solutions do not show this high sensitivity to the quadrature setting and always give a much smaller variance of the latent factor than the GLLAMM solution.
>
> My model is a longitudinal model in which a single latent factor (random effect) changes over time and produces the variability, so time ("fecha_n") is the "id", while the cross-section is collapsed because all the items are assumed homogeneous, with the same success probability; the variables "n_cuotas" and "df_cuotas" contain the number of items and the number of successes over time, respectively. The model is in the spirit of Vasicek's model of credit risk, in which all the loans in a portfolio have the same probability of default, the number of loans is very large and all of them are homogeneous (assumptions close to my data), and a systematic latent factor over time produces the variability in the portfolio's probability of default.
>
> Why have I been working with GLLAMM? Because I think it is easier than traditional ML in Stata to later generalize the basic model by including latent coefficients on covariates, other link functions, and a multivariate latent factor. But I don't know what is wrong in my code or with GLLAMM, especially with these simple binomial mixed models, or what the source is of this positive skewness in the variance of the latent factor and of its sensitivity to the number of integration points.
>
> Any suggestions would be very much appreciated!
>
> 2011/3/9 Stas Kolenikov <skolenik@gmail.com>:
>> David Pacheco reported some difficulties in getting convergent solutions in -gllamm- with a binary-dependent-variable factor analysis (no covariates) model.
>>
>> I suspect you might have (empirical) identification problems with this model. If you don't really have variability at the second level, then setting the variance parameter to 1 will send your loadings to infinity. Are your likelihoods the same for different models? You would want to try a likelihood ratio test against the simple -probit- model, and you would probably want to run -xtprobit, re- with your data, just to see what comes out (it additionally imposes a constraint of equal loadings across items, but if that is at least approximately true, then you will get an estimate of the factor variance from it to gauge how far it is from zero).
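A minimal sketch of that comparison (untested; y and id below are placeholders for a binary response and the grouping variable, and with collapsed counts such as df_cuotas/n_cuotas a -glm, family(binomial n_cuotas) link(probit)- fit would play the role of -probit-):

**************
probit y                  // pooled model, no random effect
estimates store pooled
xtset id
xtprobit y, re            // random intercept; the output also reports an LR test of rho = 0
estimates store re
lrtest pooled re          // the variance sits on a boundary, so this p-value is conservative
**************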
>> You would probably want to put in different intercepts for both -gllamm- and -xtmixed- models using
>>
>> tabulate items, gen(item_dummy)
>> gllamm response item_dummy*, nocons ...
>>
>> if you have varying probabilities of success in different items. Thus far, you have imposed an implicit constraint of equal probabilities, and -gllamm- might be trying to accommodate that with wildly varying factor loadings, the only thing you allowed to vary in the model.
>>
>> On Wed, Mar 9, 2011 at 10:40 AM, David Pacheco <pacheco.david@gmail.com> wrote:
>>> Hello,
>>>
>>> I'm seeking suggestions about a problem with GLLAMM. I've been working with a specific and simple Bernoulli mixed model: probit link; binomial family; two levels; no covariates at either level; and at the second level a single latent factor with a normal distribution and std = 1, plus its factor loading. Overall the model is very simple, with 3 parameters. I've used this code:
>>>
>>> **************
>>> gen cons = 1
>>> eq fech1: cons                       // creates the equation for the latent variable, which has only a factor loading
>>> constraint def 1 [fec1_1]cons = 1    // constrains the std of the normal factor to 1
>>> gllamm df_cuotas, i(fecha_n) link(probit) family(binom) ///
>>>     denom(n_cuotas) eqs(fech1) constr(1) frload(1)      // df_cuotas is the response; there are no covariates
>>> **************
>>>
>>> The model looks very simple, but when I try different numbers of integration points (nip(8), ... nip(20), nip(40), etc.), with either ordinary or adaptive quadrature, the solution for the factor loading changes a lot, so it is very sensitive to the quadrature setting.
>>>
>>> After that, I tried to supply starting values to, perhaps, neutralize this sensitivity to the quadrature setting. I used as starting values a skewed solution that I know for this model, in this way:
>>>
>>> **************
>>> matrix list e(b)    // to see the structure of the parameter matrix
>>> **************
>>>
>>> Stata shows me this:
>>>
>>> e(b)[1,3]
>>>        df_cuotas:   fec1_1l:    fec1_1:
>>>            _cons       cons       cons
>>> y1             0        1.1         .5
>>>
>>> **************
>>> matrix a = e(b)         // copy the structure of the parameter matrix
>>> matrix a[1,1] = -1.1    // replace the values in matrix "a" with my initial values
>>> matrix a[1,2] = 0.016
>>> matrix a[1,3] = 1
>>> **************
>>>
>>> Thus, I ran the following code:
>>>
>>> **************
>>> gllamm df_cuotas, i(fecha_n) link(probit) family(binom) ///
>>>     denom(n_cuotas) eqs(fech1) constr(1) frload(1) from(a)
>>> **************
>>>
>>> but Stata gives me the error:
>>>
>>> ******
>>> initial vector: extra parameter df_cuotas:_cons found
>>> specify skip option if necessary
>>> (error occurred in ML computation)
>>> (use trace option and check correctness of initial model)
>>> ******
>>>
>>> However, the parameter "df_cuotas:_cons" exists in the model and in the matrix e(b). I then thought that I had to delete the parameter "fec1_1:cons" from the matrix "a" of initial values, because this is the std of the latent variable that I constrained to 1. Nevertheless, Stata gives me the same error.
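The error message itself points at the skip option; a minimal sketch of that retry, assuming the same setup and the matrix values quoted above (untested, so whether skip actually clears the "extra parameter" message with constr(1) in place would need to be checked):

**************
matrix a = e(b)         // keep the equation and column names gllamm expects
matrix a[1,1] = -1.1    // starting values from the message above
matrix a[1,2] = 0.016
matrix a[1,3] = 1
gllamm df_cuotas, i(fecha_n) link(probit) family(binom) ///
    denom(n_cuotas) eqs(fech1) constr(1) frload(1) from(a) skip
**************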
>>> My questions:
>>>
>>> 1) Is something wrong in my code, or is the sensitivity of the factor loading to the quadrature setting a common problem with GLLAMM and this kind of model? With every number of integration points that I have tried, the estimated factor loading changes a lot.
>>>
>>> 2) What is wrong in my code, or in my matrix of initial values, when I try to use from()?
>>>
>>> Any suggestions would be very much appreciated!

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
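On the first question, one way to see how much the quadrature setting really matters is to refit the model over a grid of integration points and compare the estimated loading and log likelihood; a minimal sketch using the commands and the equation name (fec1_1l) shown in the thread, untested against these data:

**************
foreach k of numlist 8 12 16 20 30 {
    quietly gllamm df_cuotas, i(fecha_n) link(probit) family(binom) ///
        denom(n_cuotas) eqs(fech1) constr(1) frload(1) nip(`k') adapt
    display "nip = `k'   loading = " _b[fec1_1l:cons] "   ll = " e(ll)
}
**************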