Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Differing results from LR-test for -sem- and -overid- for -reg3- and -ivreg2-
From
Suyin Chang <[email protected]>
To
[email protected]
Subject
Re: st: Differing results from LR-test for -sem- and -overid- for -reg3- and -ivreg2-
Date
Thu, 16 Jan 2014 02:36:34 -0500
Dear Professor Antonakis,
You are right, of course. I found the difference: the model fitted
with -sem- is nonrecursive but does not specify a correlation between
the errors of the two endogenous variables, whereas the 3SLS from
-reg3- seems to assume such a correlation by default, and as far as I
could find, it does not allow constraining that covariance to zero. Is
it possible to do so, in order to properly compare the results from
MLE and 3SLS regarding the validity of the instruments?
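Alternatively, -sem- can be pushed in the other direction: rather than constraining -reg3-, the error covariance can be freed in -sem-, so that both estimators make the same assumption. A minimal sketch with hypothetical variable names (y1 and y2 endogenous, x1-x4 exogenous; not my actual specification):

```stata
* Nonrecursive system. By default -sem- holds cov(e.y1, e.y2) at zero;
* the cov() option frees it, matching what -reg3-'s 3SLS assumes.
* Omit the option to keep the covariance constrained to zero.
sem (y1 <- y2 x1 x2) (y2 <- y1 x3 x4), cov(e.y1*e.y2)
```

Note that freeing the covariance uses one degree of freedom, so the LR test against the saturated model changes accordingly.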
Additionally, it is in any case interesting that the instruments pass
the tests when the system is fitted with 2SLS.
Thanks for your time!
2014/1/15 John Antonakis <[email protected]>:
> Hi:
>
> Your DFs are not the same, so you are not testing the same model with the
> different estimators.
>
>
> Best,
> J.
>
> __________________________________________
>
> John Antonakis
> Professor of Organizational Behavior
> Director, Ph.D. Program in Management
>
> Faculty of Business and Economics
> University of Lausanne
> Internef #618
> CH-1015 Lausanne-Dorigny
> Switzerland
> Tel ++41 (0)21 692-3438
> Fax ++41 (0)21 692-3305
> http://www.hec.unil.ch/people/jantonakis
>
> Associate Editor:
> The Leadership Quarterly
> Organizational Research Methods
> __________________________________________
>
>
> On 15.01.2014 12:00, Suyin Chang wrote:
>>
>> Dear statalist,
>>
>> Still in my quest to find appropriate tests of overidentifying
>> restrictions for my nonrecursive simultaneous-equation models, and as
>> a follow-up question to the great discussion led by Professor
>> Antonakis here on the list last year
>> (http://www.stata.com/statalist/archive/2013-07/msg00524.html), I got
>> an odd result that requires some insight from people better versed
>> in the advanced mechanics of these estimations.
>>
>> The same model, estimated with the default standard-error structure
>> by 2SLS (via -ivreg2-), 3SLS (via -reg3- plus -overid-), and MLE (via
>> -sem-), yields the following differing overidentifying-restrictions
>> test results.
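>> For concreteness, the three estimations being compared might look
>> like the following; the variable names are hypothetical stand-ins,
>> since the actual specification is not shown:

```stata
* Hypothetical nonrecursive system: y1 and y2 endogenous, x1-x6 exogenous.
ivreg2 y1 x1 x2 x3 (y2 = x4 x5)              // 2SLS, equation 1 (Sargan test)
ivreg2 y2 x4 x5 x6 (y1 = x1 x2)              // 2SLS, equation 2
reg3 (y1 = y2 x1 x2 x3) (y2 = y1 x4 x5 x6)   // 3SLS
overid                                       // Hansen-Sargan system test
sem (y1 <- y2 x1 x2 x3) (y2 <- y1 x4 x5 x6)  // MLE; LR test vs. saturated
```

>> Each equation here has one overidentifying restriction, so the
>> per-equation Sargan tests are chi-sq(1) and the system test chi-sq(2),
>> matching the degrees of freedom in the output below.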
>>
>> For the 2SLS estimation (via -ivreg2-):
>>
>> with -estat overid-
>> eq1)
>> Sargan statistic (overidentification test of all instruments): 1.136
>> Chi-sq(1) P-val = 0.2865
>> eq2)
>> Sargan statistic (overidentification test of all instruments): 0.636
>> Chi-sq(1) P-val = 0.4252
>> with -overid-
>> Number of equations : 2
>> Total number of exogenous variables in system : 9
>> Number of estimated coefficients : 16
>> Hansen-Sargan overidentification statistic : 2.470
>> Under H0, distributed as Chi-sq(2), pval = 0.2909
>>
>>
>> For 3SLS estimation (via -reg3- plus -overid-):
>>
>> Number of equations : 2
>> Total number of exogenous variables in system : 9
>> Number of estimated coefficients : 16
>> Hansen-Sargan overidentification statistic : 1.964
>> Under H0, distributed as Chi-sq(2), pval = 0.3745
>>
>>
>> For the MLE estimation (via -sem-):
>>
>> LR test of model vs. saturated: chi2(3) = 12.30, Prob > chi2 = 0.0064
>>
>>
>> What sort of issue could cause such a different pattern of results?
>> What worries me a bit is that I usually prefer MLE, but Stata does
>> not have any overidentifying-restrictions test besides the LR chi2
>> against the saturated model, so there is no additional check
>> available. These results are quite disturbing, especially because
>> 3SLS differs from MLE.
>>
>> Please, any insights will be very welcome. I can even send part of the
>> data to be tested, if someone is willing to double-check.
>>
>> Thanks for your time,
>>
>> Suyin
>>
>>
>>
>>> From: John Antonakis <[email protected]>
>>> To: [email protected]
>>> Subject: Re: st: Goodness-of-fit tests after -gsem-
>>> Date: Sun, 14 Jul 2013 11:47:41 +0200
>>> ________________________________
>>>
>>> Hi Jenny:
>>>
>>> The following may be interesting to you:
>>>
>>> Rhemtulla, M., Brosseau-Liard, P. É., & Savalei, V. (2012). When can
>>> categorical variables be treated as continuous? A comparison of
>>> robust continuous and categorical SEM estimation methods under
>>> suboptimal conditions. Psychological Methods, 17(3), 354-373. If your
>>> substantive results are similar, I guess you could treat the
>>> variables as continuous, particularly if you have a nonsignificant
>>> chi-square test. Though, at this time, I would always double-check my
>>> results with Mplus, which has a robust overid test. It will be so
>>> cool when Stata adds the overid statistic for -gsem- and also for
>>> models estimated with vce(robust).
>>>
>>> Best,
>>> J.
>>>
>>>
>>> __________________________________________
>>>
>>> John Antonakis
>>> Professor of Organizational Behavior
>>> Director, Ph.D. Program in Management
>>>
>>> Faculty of Business and Economics
>>> University of Lausanne
>>> Internef #618
>>> CH-1015 Lausanne-Dorigny
>>> Switzerland
>>> Tel ++41 (0)21 692-3438
>>> Fax ++41 (0)21 692-3305
>>> http://www.hec.unil.ch/people/jantonakis
>>>
>>> Associate Editor
>>> The Leadership Quarterly
>>> __________________________________________
>>>
>>> On 13.07.2013 22:30, Bloomberg Jenny wrote:
>>>
>>> Hi John,
>>>
>>> That is very informative and helpful indeed. Thank you so much. I
>>> liked the Karl Jöreskog anecdote.
>>> I've always had the impression that the chi-square test of an SEM is
>>> of little practical use because it always gives significant results
>>> when the sample size is rather large. So I used to ignore chi-squares
>>> and instead look at other goodness-of-fit indexes when using sem.
>>> I will also look into your articles.
>>>
>>> In the meanwhile, could I ask you two more related questions?
>>> (1) For now, the chi-square statistic is not available with
>>> non-linear gsem. To what extent, then, do you think it makes sense to
>>> refer to the chi-square statistic of the corresponding linear sem,
>>> that is, to assume all the relevant variables are continuous
>>> (belonging to the Gaussian family with identity link) when they are
>>> actually not?
>>> Of course it depends on what the actual (g)sem model looks like, but
>>> let's consider a very simple case, say, a measurement model with
>>> three binary outcomes x1-x3 and a latent variable L measured by
>>> x1-x3. If you use -gsem- and correctly specify -x1 x2 x3 <- L,
>>> logit-, then you won't be able to obtain a chi-square statistic.
>>> However, if you "relax" the non-linearity and let x1-x3 pretend to be
>>> continuous variables, then you can obtain a chi-square statistic by
>>> using -sem-. My question is what implications, if any, we could draw
>>> about the actual non-linear (binary) model from the chi-square
>>> obtained this way.
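>>> As a minimal sketch of the two specifications being contrasted here
>>> (assuming hypothetical binary indicators x1-x3 in memory):

```stata
* Correct specification: binary indicators with logit links.
* No overall chi-square test is available after -gsem-.
gsem (x1 x2 x3 <- L), logit

* "Relaxed" specification: x1-x3 treated as continuous
* (Gaussian family, identity link); overall fit statistics
* are then available via -estat gof- after -sem-.
sem (x1 x2 x3 <- L)
estat gof, stats(all)
```

>>> (One caveat: with exactly three indicators and one latent variable,
>>> the -sem- model is just-identified, so the chi-square test would have
>>> zero degrees of freedom; at least four indicators are needed for it
>>> to be informative.)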
>>>
>>> (2) Suppose we submit to a journal a paper in which Stata's gsem was
>>> used. How do you think the referee could/should judge the paper, that
>>> is, judge whether the model makes sense, etc., without a chi-square
>>> test (or any other index)? (Yes, I'm assuming that the Yuan-Bentler
>>> style chi-square test you mentioned is yet to be implemented at that
>>> time.) I guess researchers won't feel like using Stata's gsem before
>>> this point is resolved.
>>>
>>>
>>> Best,
>>> Jenny
>>>
>>>
>>>
>>> 2013/7/13 John Antonakis <[email protected]>:
>>>
>>> Hi Jenny:
>>>
>>> At this time, and based on my asking the tech support people at
>>> Stata, the overidentification test (and here I mean the
>>> likelihood-ratio test, or chi-square test) is not available for
>>> -gsem-, which is unfortunate, but understandable. This is only
>>> version 2 of -sem- and the program is really very advanced compared
>>> to other programs when they were on version 2 (AMOS, which is on
>>> version a zillion, still can't do gsem, for example). From what tech
>>> support told me, it is on the wish list and hopefully we will have a
>>> Yuan-Bentler style chi-square test for models estimated by gsem, like
>>> Mplus does.
>>>
>>> As for assessing fit, you only need the chi-square test--indexes like
>>> RMSEA or CFI don't help at all. I elaborate below on an edited
>>> version of what I had written recently on SEMNET on this point (in
>>> particular, see the anecdote about Karl Jöreskog, who, as you may
>>> know, was instrumental in developing SEM, about why approximate fit
>>> indexes were invented):
>>>
>>> "At the end of the day, science is self-correcting and, with time,
>>> most researchers will gravitate towards some sort of consensus. I
>>> think that what will prevail are methods that are analytically
>>> derived (e.g., the chi-square test and corrections to it for when it
>>> is not well behaved) and found to have support via Monte Carlo too.
>>> With respect to the latter, what is funny--well, ironic and
>>> hypocritical too--is that measures of approximate fit are not
>>> analytically derived, and the only support that they have is via what
>>> I would characterize as weak Monte Carlos--which in turn are often
>>> summarily dismissed, by the very people who ignore the chi-square
>>> test, when the Monte Carlos provide evidence for the chi-square test.
>>>
>>> We have the following issues that need to be correctly dealt with to
>>> ensure the model passes the chi-square test (and also that inference
>>> is correct--i.e., with respect to standard errors):
>>>
>>> 1. low sample-size-to-estimated-parameters ratio (need to correct the
>>> chi-square)
>>> 2. non-multivariate-normal data (need to correct the chi-square)
>>> 3. non-continuous measures (need to use an appropriate estimator)
>>> 4. causal heterogeneity (need to control for sources of variance that
>>> render relations heterogeneous)*
>>> 5. bad measures
>>> 6. incorrectly specified model (i.e., need to ensure the causal
>>> structure reflects reality and all threats to endogeneity are dealt
>>> with).
>>>
>>> Any of these, or a combination of them, can make the chi-square test
>>> fail. Now, some researchers shrug, in a defeatist kind of way, and
>>> say, "well, I don't know why my model failed the chi-square test, but
>>> I will interpret it in any case because the approximate fit indexes
>>> [like RMSEA or CFI] say it is OK." Unfortunately, the researcher will
>>> not know to what extent these estimates may be misleading or
>>> completely wrong. And reporting misleading estimates is, I think,
>>> unethical and uneconomical for society. That is why all efforts
>>> should be made to develop measures and find models that fit. At this
>>> time, the best test we have is the chi-square test; we can also
>>> localize misfit via score tests or modification indexes. I will
>>> rejoice the day we find better and stronger tests; however, inventing
>>> weaker tests is not going to help us.
>>>
>>> Again, here is a snippet from Cam McIntosh's (2012) recent paper on
>>> this point:
>>>
>>> "A telling anecdote in this regard comes from Dag Sörbom, a long-time
>>> collaborator of Karl Jöreskog, one of the key pioneers of SEM and
>>> creator of the LISREL software package. In recounting a LISREL
>>> workshop that he jointly gave with Jöreskog in 1985, Sörbom notes
>>> that: 'In his lecture Karl would say that the Chi-square is all you
>>> really need. One participant then asked "Why have you then added GFI
>>> [goodness-of-fit index]?" Whereupon Karl answered "Well, users
>>> threaten us saying they would stop using LISREL if it always produces
>>> such large Chi-squares. So we had to invent something to make people
>>> happy. GFI serves that purpose"' (p. 10)."
>>>
>>> With respect to the causal-heterogeneity point, according to Mulaik
>>> and James (1995, p. 132), samples must be causally homogeneous to
>>> ensure that "the relations among their variable attributes are
>>> accounted for by the same causal relations." As we say in our
>>> causal-claims paper (Antonakis et al., 2010), "causally homogenous
>>> samples are not infinite (thus, there is a limit to how large the
>>> sample can be). Thus, finding sources of population heterogeneity and
>>> controlling for it will improve model fit whether using multiple
>>> groups (moderator models) or multiple indicator, multiple causes
>>> (MIMIC) models" (p. 1103). This issue is something that many applied
>>> researchers fail to understand and completely ignore.
>>>
>>> References:
>>> *Antonakis, J., Bendahan, S., Jacquart, P., & Lalive, R. (2010). On
>>> making causal claims: A review and recommendations. The Leadership
>>> Quarterly, 21(6), 1086-1120.
>>>
>>> Bera, A. K., & Bilias, Y. (2001). Rao's score, Neyman's C(α) and
>>> Silvey's LM tests: An essay on historical developments and some new
>>> results. Journal of Statistical Planning and Inference, 97(1), 9-44.
>>>
>>> *Bollen, K. A. (1989). Structural equations with latent variables.
>>> New York: Wiley.
>>>
>>> *James, L. R., Mulaik, S. A., & Brett, J. M. (1982). Causal analysis:
>>> Assumptions, models, and data. Beverly Hills: Sage Publications.
>>>
>>> *Jöreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model
>>> with multiple indicators and multiple causes of a single latent
>>> variable. Journal of the American Statistical Association, 70(351),
>>> 631-639.
>>>
>>> McIntosh, C. (2012). Improving the evaluation of model fit in
>>> confirmatory factor analysis: A commentary on Gundy, C.M., Fayers,
>>> P.M., Groenvold, M., Petersen, M. Aa., Scott, N.W., Sprangers,
>>> M.A.J., Velikov, G., Aaronson, N.K. (2011). Comparing higher-order
>>> models for the EORTC QLQ-C30. Quality of Life Research. Quality of
>>> Life Research, 21(9), 1619-1621.
>>>
>>> *Muthén, B. O. (1989). Latent variable modeling in heterogeneous
>>> populations. Psychometrika, 54(4), 557-585.
>>>
>>> *Mulaik, S. A., & James, L. R. (1995). Objectivity and reasoning in
>>> science and structural equation modeling. In R. H. Hoyle (Ed.),
>>> Structural equation modeling: Concepts, issues, and applications (pp.
>>> 118-137). Thousand Oaks, CA: Sage Publications.
>>>
>>> And, here are some examples from my work where the chi-square test was
>>> passed (and the first study had a rather large sample)--so I don't live
>>> in a
>>> theoretical statistical bubble:
>>>
>>> http://dx.doi.org/10.1177/0149206311436080
>>> http://dx.doi.org/10.1016/j.paid.2010.10.010
>>>
>>> Best,
>>> J.
>>>
>>> P.S. Take a look at the following posts too by me on these points on
>>> Statalist.
>>>
>>> http://www.stata.com/statalist/archive/2013-04/msg00733.html
>>> http://www.stata.com/statalist/archive/2013-04/msg00747.html
>>> http://www.stata.com/statalist/archive/2013-04/msg00765.html
>>> http://www.stata.com/statalist/archive/2013-04/msg00767.html
>>>
>>> __________________________________________
>>>
>>> John Antonakis
>>> Professor of Organizational Behavior
>>> Director, Ph.D. Program in Management
>>>
>>> Faculty of Business and Economics
>>> University of Lausanne
>>> Internef #618
>>> CH-1015 Lausanne-Dorigny
>>> Switzerland
>>> Tel ++41 (0)21 692-3438
>>> Fax ++41 (0)21 692-3305
>>> http://www.hec.unil.ch/people/jantonakis
>>>
>>> Associate Editor
>>> The Leadership Quarterly
>>> __________________________________________
>>>
>>>
>>> On 13.07.2013 08:41, Bloomberg Jenny wrote:
>>>
>>> Hello,
>>>
>>> I have a question about goodness-of-fit tests with gsem. (I don't have
>>> any specific models in mind; it's a general question.)
>>>
>>> I'm now reading the Stata 13 manual, and noticed that postestimation
>>> commands such as -estat gof-, -estat ggof-, and -estat eqgof- can only
>>> be used after -sem-, and not after -gsem-. This means that
>>> goodness-of-fit statistics like RMSEA cannot be obtained when you use
>>> gsem.
>>>
>>> Then, how can I test goodness-of-fit if I use -gsem- to analyse a
>>> non-linear, generalized SEM with latent variables?
>>>
>>> I know that AIC and BIC are still available after -gsem- (via -estat
>>> ic-), but they are not for judging fit in absolute terms; they are
>>> for comparing the fit of different models. What I'd like to know is
>>> whether there are any practical ways to judge the goodness-of-fit of
>>> the model in absolute terms.
>>>
>>> Any suggestions will be greatly appreciated.
>>>
>>>
>>> Best,
>>> Jenny
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/
>>>