From:    Stas Kolenikov <skolenik@gmail.com>
To:      statalist@hsphsun2.harvard.edu
Subject: Re: st: problem with GLLAMM and a bernoulli mixed model
Date:    Wed, 9 Mar 2011 17:47:10 -0500
-gllamm- IS a maximum likelihood estimator, so I am not sure why you are making the distinction. As I said, you may have problems with empirical identification in your particular data set, but other than that, I cannot think of any other obvious explanation. How many data points do you have per id?

On Wed, Mar 9, 2011 at 5:24 PM, David Pacheco <pacheco.david@gmail.com> wrote:
> Stas,
>
> I have tried to fit another version of the model, with no factor loading and no constraint on the variance of the normal latent factor, which I understand is equivalent to the model I described previously (the factor loading equals the std of the latent factor). In this case something similar happens: the variance of the latent factor is reasonable, but it depends a lot on the number of integration points used. The same happens when I change the link function from probit to logit.
>
> On the other hand, a friend has fitted the same model and data successfully using Stata's traditional maximum likelihood method, and his solutions do not show this high sensitivity to the quadrature setting and always give a much smaller variance of the latent factor than the GLLAMM solution.
>
> My model is a longitudinal model in which a single latent factor (random effect) changes over time and produces the variability, so time ("fecha_n") is the "id", while the cross-section is collapsed because all the items are assumed homogeneous, with the same success probability; the variables "n_cuotas" and "df_cuotas" contain the number of items and the number of successes over time, respectively. The model is in the spirit of Vasicek's model of credit risk, in which all the loans in a portfolio have the same probability of default, the number of loans is very large and all of them are homogeneous (assumptions close to my data), and a systematic latent factor over time produces the variability in the portfolio's probability of default.
>
> Why have I been working with GLLAMM? Because I think it is easier than traditional ML in Stata to later generalize the basic model by including latent coefficients on covariates, other link functions, and a multivariate latent factor. But I don't know what is wrong in my code or with GLLAMM, especially with these simple binomial mixed models, or what the source is of this positive skewness in the variance of the latent factor and of its sensitivity to the number of integration points.
>
> Any suggestions would be very much appreciated!
>
> 2011/3/9 Stas Kolenikov <skolenik@gmail.com>:
>> David Pacheco reported some difficulties in getting convergent solutions in -gllamm- with a binary-dependent-variable factor analysis (no covariates) model.
>>
>> I suspect you might have (empirical) identification problems with this model. If you don't really have variability at the second level, then setting the variance parameter to 1 will send your loadings to infinity. Are your likelihoods the same for different models? You would want to try a likelihood ratio test against the simple -probit- model, and you would probably want to run -xtprobit, re- with your data, just to see what comes out (it additionally imposes a constraint of equal loadings across items, but if that is at least approximately true, then you will get an estimate of the factor variance from it to gauge how far it is from zero).
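A minimal sketch of that comparison (untested; y and id below are placeholders for a binary response and the grouping variable, and with collapsed counts such as df_cuotas/n_cuotas a -glm, family(binomial n_cuotas) link(probit)- fit would play the role of -probit-):

**************
probit y                  // pooled model, no random effect
estimates store pooled
xtset id
xtprobit y, re            // random intercept; the output also reports an LR test of rho = 0
estimates store re
lrtest pooled re          // the variance sits on a boundary, so this p-value is conservative
**************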
>> You would probably want to put in different intercepts for both -gllamm- and -xtmixed- models using
>>
>> tabulate items, gen(item_dummy)
>> gllamm response item_dummy*, nocons ...
>>
>> if you have varying probabilities of success in different items. Thus far, you have imposed an implicit constraint of equal probabilities, and -gllamm- might be trying to accommodate that with wildly varying factor loadings, the only thing you allowed to vary in the model.
>>
>> On Wed, Mar 9, 2011 at 10:40 AM, David Pacheco <pacheco.david@gmail.com> wrote:
>>> Hello,
>>>
>>> I'm seeking suggestions about a problem with GLLAMM. I've been working with a specific and simple Bernoulli mixed model: probit link; binomial family; two levels; no covariates at either level; and at the second level a single latent factor with a normal distribution and std = 1, plus its factor loading. Overall the model is very simple, with 3 parameters. I've used this code:
>>>
>>> **************
>>> gen cons = 1
>>> eq fech1: cons                       // creates the equation for the latent variable, which has only a factor loading
>>> constraint def 1 [fec1_1]cons = 1    // constrains the std of the normal factor to 1
>>> gllamm df_cuotas, i(fecha_n) link(probit) family(binom) ///
>>>     denom(n_cuotas) eqs(fech1) constr(1) frload(1)      // df_cuotas is the response; there are no covariates
>>> **************
>>>
>>> The model looks very simple, but when I try different numbers of integration points (nip(8), ... nip(20), nip(40), etc.), with either ordinary or adaptive quadrature, the solution for the factor loading changes a lot, so it is very sensitive to the quadrature setting.
>>>
>>> After that, I tried to supply starting values to, perhaps, neutralize this sensitivity to the quadrature setting. I used as starting values a skewed solution that I know for this model, in this way:
>>>
>>> **************
>>> matrix list e(b)    // to see the structure of the parameter matrix
>>> **************
>>>
>>> Stata shows me this:
>>>
>>> e(b)[1,3]
>>>        df_cuotas:   fec1_1l:    fec1_1:
>>>            _cons       cons       cons
>>> y1             0        1.1         .5
>>>
>>> **************
>>> matrix a = e(b)         // copy the structure of the parameter matrix
>>> matrix a[1,1] = -1.1    // replace the values in matrix "a" with my initial values
>>> matrix a[1,2] = 0.016
>>> matrix a[1,3] = 1
>>> **************
>>>
>>> Thus, I ran the following code:
>>>
>>> **************
>>> gllamm df_cuotas, i(fecha_n) link(probit) family(binom) ///
>>>     denom(n_cuotas) eqs(fech1) constr(1) frload(1) from(a)
>>> **************
>>>
>>> but Stata gives me the error:
>>>
>>> ******
>>> initial vector: extra parameter df_cuotas:_cons found
>>> specify skip option if necessary
>>> (error occurred in ML computation)
>>> (use trace option and check correctness of initial model)
>>> ******
>>>
>>> However, the parameter "df_cuotas:_cons" exists in the model and in the matrix e(b). I then thought that I had to delete the parameter "fec1_1:cons" from the matrix "a" of initial values, because this is the std of the latent variable that I constrained to 1. Nevertheless, Stata gives me the same error.
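The error message itself points at the skip option; a minimal sketch of that retry, assuming the same setup and the matrix values quoted above (untested, so whether skip actually clears the "extra parameter" message with constr(1) in place would need to be checked):

**************
matrix a = e(b)         // keep the equation and column names gllamm expects
matrix a[1,1] = -1.1    // starting values from the message above
matrix a[1,2] = 0.016
matrix a[1,3] = 1
gllamm df_cuotas, i(fecha_n) link(probit) family(binom) ///
    denom(n_cuotas) eqs(fech1) constr(1) frload(1) from(a) skip
**************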
>>> My questions:
>>>
>>> 1) Is something wrong in my code, or is the sensitivity of the factor loading to the quadrature setting a common problem with GLLAMM and this kind of model? With every number of integration points that I have tried, the estimated factor loading changes a lot.
>>>
>>> 2) What is wrong in my code, or in my matrix of initial values, when I try to use from()?
>>>
>>> Any suggestions would be very much appreciated!

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
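On the first question, one way to see how much the quadrature setting really matters is to refit the model over a grid of integration points and compare the estimated loading and log likelihood; a minimal sketch using the commands and the equation name (fec1_1l) shown in the thread, untested against these data:

**************
foreach k of numlist 8 12 16 20 30 {
    quietly gllamm df_cuotas, i(fecha_n) link(probit) family(binom) ///
        denom(n_cuotas) eqs(fech1) constr(1) frload(1) nip(`k') adapt
    display "nip = `k'   loading = " _b[fec1_1l:cons] "   ll = " e(ll)
}
**************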