Stas Kolenikov wrote:
"I tried Rasch model with -xtlogit- (or -xtmelogit-), and the results were
essentially a slightly curved version of the total number correct."
That's what the Rasch model does, since the total score is a sufficient statistic for ability. The bending is often useful, though: it shows more appropriately how spread out the subjects really are, and the estimated abilities have more realistic SEs than arise from classical test theory.
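To make the "curved version of the total score" concrete, here is a rough Python sketch (not Stata, and not what -xtlogit- does internally). It assumes the item difficulties are already known, and the five difficulty values are made up purely for illustration; the function names are hypothetical too. It finds the maximum-likelihood ability for each raw score and its SE, and checks that two response patterns with the same total differ in log-likelihood only by a constant that does not involve ability.

```python
import math

def rasch_prob(theta, b):
    """P(correct) under the Rasch model for ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def ability_mle(score, difficulties, lo=-10.0, hi=10.0, tol=1e-10):
    """ML ability estimate for a raw score, with item difficulties treated as known.

    The likelihood equation is sum_i P_i(theta) = score. The left side is
    increasing in theta, so plain bisection is enough for a sketch.
    Returns (theta_hat, standard_error).
    """
    n = len(difficulties)
    if not 0 < score < n:
        raise ValueError("no finite ML estimate for a zero or perfect score")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        expected = sum(rasch_prob(mid, b) for b in difficulties)
        if expected < score:
            lo = mid
        else:
            hi = mid
    theta = (lo + hi) / 2.0
    # Test information at theta_hat; SE is 1 / sqrt(information).
    info = sum(p * (1.0 - p)
               for p in (rasch_prob(theta, b) for b in difficulties))
    return theta, 1.0 / math.sqrt(info)

def pattern_loglik(theta, responses, difficulties):
    """Log-likelihood of one 0/1 response pattern at ability theta."""
    total = 0.0
    for x, b in zip(responses, difficulties):
        p = rasch_prob(theta, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total

# Hypothetical item difficulties, for illustration only.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]

for score in range(1, len(difficulties)):
    theta_hat, se = ability_mle(score, difficulties)
    print(f"score={score}  theta_hat={theta_hat:.3f}  SE={se:.3f}")
```

With these made-up difficulties the estimates are stretched toward the tails relative to the raw score, and the SEs grow for extreme scores, which is the "bending" described above.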
The local independence issue puzzles me, but I'll hazard a guess. I suspect that items of very extreme difficulty will tend to be multidimensional, which would be interpreted as a violation of local independence if one asserts that there is only one latent dimension. One of the reasons for the development of IRT was to handle the problem of "difficulty factors," which arise because the covariances of binary items are restricted by the marginals, and this induces an illusory multidimensionality. An IRT model helps a lot, but extreme items might well not fit on the continuum defined by the other items. For example, on a math test, the hard items may require a qualitative shift in reasoning ability. (There are models for this kind of thing, e.g., Mark Wilson's saltus model.)
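The point about covariances being restricted by the marginals can be shown with a few lines of arithmetic. This is only a sketch; `max_phi` is a name I made up, and the pass rates are invented. The joint proportion answering both items correctly can never exceed the smaller of the two pass rates, which caps the Pearson (phi) correlation between two binary items.

```python
import math

def max_phi(p1, p2):
    """Largest attainable phi correlation between two binary items
    whose proportions correct are p1 and p2."""
    for p in (p1, p2):
        if not 0.0 < p < 1.0:
            raise ValueError("proportions must be strictly between 0 and 1")
    # P(both correct) is at most the smaller marginal.
    max_joint = min(p1, p2)
    max_cov = max_joint - p1 * p2
    return max_cov / math.sqrt(p1 * (1.0 - p1) * p2 * (1.0 - p2))

# Invented pass rates: matched difficulty, then increasingly mismatched.
for p1, p2 in [(0.5, 0.5), (0.3, 0.7), (0.1, 0.9)]:
    print(f"p1={p1}  p2={p2}  max phi={max_phi(p1, p2):.3f}")
```

Items of equal difficulty can correlate perfectly, while a very easy and a very hard item cannot, even if both measure exactly the same trait. A linear factor analysis of such correlations then tends to group items by difficulty, which is the illusory multidimensionality mentioned above.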
JV