Stas,
I think your convergence problem is a consequence of your small
sample size. You have only N=40 subjects for J=30 items. Generally in
IRT, we consider conditions to be good when J << N. To achieve this
requirement, either the sample size is large (2,000-30,000 subjects)
and the test (questionnaire) is long (typically in the educational
sciences), or the sample size is small (100-300 individuals) and the
test is short (J=5-10) (typically in the health sciences).
I think your sample is too small to envisage a complex IRT model like
the two-parameter logistic model (2PLM, or Birnbaum model), which here
requires 60 parameters: 30 discriminating powers (factor loadings)
minus 1 for the identifiability constraint, plus 30 difficulty
parameters (fixed effects), plus the variance of the latent variable
(which generally is not fixed to one), i.e., (30-1)+30+1=60. Even for
the Rasch model (1PLM), which involves only 31 parameters (30
difficulty parameters and the variance of the latent variable), your
sample is small!!
I tried to simulate a sample with 40 individuals and 30 items and to
estimate the parameters with a Rasch model (the -simirt- and
-raschtest- commands I use are available on SSC; -raschtest- is
described in Hardouin, Rasch analysis: Estimation and tests with the
raschtest module, The Stata Journal, 2007, 7(1): 22-44):
. simirt, dim(30) nbobs(40) clear    // simulate 40 subjects, 30 Rasch items
. raschtest item*, meth(mml)         // fit the Rasch model by marginal ML
The precision of the difficulty parameters is very poor (s.e. > 0.4),
and so the obtained measures on the latent trait (which are usually the
aim of such an analysis) are very imprecise.
Using the CML estimation technique (-raschtest item*, meth(cml)-, which
uses the -clogit- command) does not improve the quality of the
estimates.
For most psychometricians, the Rasch model (and its polytomous
extensions like the rating scale model or the partial credit model) is
the only IRT model that allows obtaining an objective measure (a
measure independent of the sample and of the particular items
answered), so the other IRT models are not recommended. Generally, we
do not obtain a better measure with a complex IRT model than with the
classical score computed as the number of correct responses. A complex
IRT model can only be a way to understand how the items function (is
there a guessing effect, a strong discriminating power...). So I always
recommend using the Rasch model as a first choice.
I suppose you also have another problem, which is violation of the
local independence assumption: in IRT, conditionally on the latent
variable, the responses to the items must be independent. You say that
there are items with a small number of negative (incorrect) responses,
and this phenomenon can create a violation of this assumption. Testing
the fit of a Mokken scale (a nonparametric IRT model) can detect such a
violation. This model can be fitted with the -loevH- command (available
on SSC) [I can send you a paper submitted to the Stata Journal which
explains the output of this command].
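For example, a minimal call (assuming your items are named
item1-item30, as in the simulation above) would simply be:

. loevH item*

The command reports Loevinger's H scalability coefficients; as a rough
convention in Mokken scaling, an H below 0.3 points to a poorly
scalable set of items.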
Concerning the fit of a 2PLM, if you have SAS, you can easily test the
convergence of the estimation by using the %anaqol macro program
(available at http://www.anaqol.org). This macro uses the NLMIXED
procedure, which is based on the same estimation technique as -gllamm-,
so the two procedures are comparable (even in computing time, usually
very long!!!!!).
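If you nevertheless want to try the 2PLM in Stata, a minimal -gllamm-
sketch along the lines of Zheng and Rabe-Hesketh (2007), cited below,
would be (only a sketch: it assumes the data are reshaped to long form,
one record per subject-item response, with hypothetical names resp for
the 0/1 response, subject for the person identifier, and d1-d30 for the
item dummies):

. eq load: d1-d30
. gllamm resp d1-d30, nocons i(subject) eqs(load) link(logit) family(binom) adapt

By default -gllamm- fixes the loading of the first item in eq load to 1
and leaves the latent variance free, which matches the identification
Joseph describes below.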
I hope this helps !
Jean-Benoit
Joseph Coveney wrote:
> I'm not sure what kind of convergence problems you're experiencing with
> -gllamm-. Is it just slowness? With the two-parameter model, my understanding
> is that you'd be fitting 30 random effects--something that would require a great
> deal of patience with -gllamm-, at least with more than a few integration points
> and without multiple processors.
>
> There are some examples of these kinds of models fitted with -gllamm- in Xiaohui
> Zheng & Sophia Rabe-Hesketh. (2007) Estimating parameters of dichotomous and
> ordinal item response models with gllamm. _The Stata Journal_ 7(3):313-33. They
> limit themselves to relatively few test items, nowhere near 30.
>
> As far as fitting an analogous model with -xtmelogit-, couldn't you set up an
> equation on the random effects side of the double-pipe for student-by-test item
> interaction terms (the 30 random effects)? It would seem that the common tactic
> of omitting the first test item in the random effects equation (omitting it from
> the equation as the constant) identifies the model by fixing the first test
> item's loading factor (allowing the variance for the random effect for students
> to be free).
>
> I think that traditionally with IRT models, the random effects for students
> would be constrained to unit variance, which allows for all of the item factor
> loadings to be estimated (free)--they're held to be equal for the Rasch model (a
> single random effect, fitted with -xtlogit- as Jay mentions and as you show
> below) and allowed to be independently estimated in the two-parameter model.
> You can't impose such a unit-variance constraint with -xtmelogit-, but wouldn't
> -xtmelogit- still allow for at least an analogous model to be fitted by fixing
> one item's loading factor (omitted as the constant), which scales the student
> random effect to it? Specifying the student-by-test item interactions would
> follow the same random-effects equation syntax with -xtmelogit- as for an
> analogous interaction term (fixed test item-by-random student) in a mixed-model
> ANOVA fitted with -xtmixed-.
>
> Joseph Coveney
>
> Stas Kolenikov wrote:
>
> I see. Since I am not really sure where I want this to get shrunk, I
> probably won't be trying these quasi-Bayesian routes. (Writing
> MataBUGS can be an exciting year-long project on its own though :))
> I'll go over my list of questions, and restrict the parameters to be
> equal if the same (low) number of students have missed those
> questions. That might kill my sensitivity parameter on these questions
> though.
>
> For -xtmelogit-, it looks to me like the covariance structure you need
> is the matrix of ones... which is kinda stupid to deal with. Anyway I
> hope Bobby G would chime in.
>
> The Rasch model would then be just
>
> g byte ones = 1
> eq ones : ones
> gllamm Correct [question dummies], ... eqs(ones)
>
> or
>
> xtmelogit Correct [question dummies], nocons || studentID :
>
> so that the random factor weighs equally on all questions, right?
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/