Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Cameron McIntosh <cnm100@hotmail.com>
To: STATA LIST <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Hosmer-Lemeshow and other Pseudo Rsquares
Date: Mon, 14 May 2012 21:08:49 -0400

Joseph,

To give reviewers a well-informed write-up, you should also have a look at:

DeMaris, A. (2002). Explained Variance in Logistic Regression: A Monte Carlo Study of Proposed Measures. Sociological Methods & Research, 31(1), 27-74.
http://gabarrot.psychologie-sociale.org/documents/DM2002.pdf

Menard, S. (2000). Coefficients of Determination for Multiple Logistic Regression Analysis. The American Statistician, 54(1), 17-24.

Liao, J.G., & McGee, D. (2003). Adjusted Coefficients of Determination for Logistic Regression. The American Statistician, 57(3), 161-165.

Mittlböck, M., & Schemper, M. (1996). Explained variation for logistic regression. Statistics in Medicine, 15(19), 1987-1997.

Mittlböck, M. (1998). Computing measures of explained variation for logistic regression models. Computer Methods and Programs in Biomedicine, 58(1), 17-24.

Allen, J., & Le, H. (2008). An Additional Measure of Overall Effect Size for Logistic Regression Models. Journal of Educational and Behavioral Statistics, 33(4), 416-441.

Heinzl, H., Waldhör, T., & Mittlböck, M. (2005). Careful use of pseudo R-squared measures in epidemiological studies. Statistics in Medicine, 24(18), 2867-2872.
http://www.meduniwien.ac.at/msi/biometrie/publikationen/Separata/Heinzl_Waldhoer_Mittlboeck_2005_SiM.pdf

Cameron, A.C., & Windmeijer, F.A.G. (1997). An R-squared measure of goodness of fit for some common nonlinear regression models. Journal of Econometrics, 77(2), 329-342.
http://cameron.econ.ucdavis.edu/research/je97preprint.pdf

Veall, M.R., & Zimmermann, K.F. (1996). Pseudo-R2 Measures for Some Common Limited Dependent Variable Models. Journal of Economic Surveys, 10(3), 241-259.

Cam

> Date: Mon, 14 May 2012 10:45:45 -0400
> Subject: Re: st: Hosmer-Lemeshow and other Pseudo Rsquares
> From: josephpadgett@gmail.com
> To: statalist@hsphsun2.harvard.edu
>
> Yeah, I second the 'tribal' thing.
> I've been largely learning a lot
> of this on my own in order to be thorough on this particular project
> and from one discipline's literature to the next the terminology alone
> is night and day, never mind the exact method and type of reporting they
> prefer.
>
> I believe I'm on track now. Thanks for the suggestions!
>
> On Mon, May 14, 2012 at 10:43 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> > I find these things to be highly tribal. One large part of the
> > statistical world doesn't know at all about what another large part
> > regards as utterly standard. So, anything that might surprise your
> > reviewers might need to be explained very carefully.
> >
> > On Mon, May 14, 2012 at 3:34 PM, Joseph Padgett <josephpadgett@gmail.com> wrote:
> >> Thanks, Nick. That's helpful. I've seen these suggestions before, but
> >> wrapped in bigger discussions and not nearly as succinct.
> >>
> >> I am aware that the R square measures for logistic models are only
> >> guides and not sole determining factors, but it seems that researchers
> >> commonly report some form of it (sociology background here btw).
> >>
> >> So I've calculated both of your suggestions. Any advice on reporting
> >> those? Does either have an associated line of research that you're
> >> aware of that I should be referring to/citing when I'm reporting the
> >> calculation and results?
> >>
> >> On Mon, May 14, 2012 at 10:10 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> >>> I suggest a few meta-rules for yourself:
> >>>
> >>> 1. Whatever you calculate should be defined and calculated
> >>> consistently across different models.
> >>>
> >>> 2. Whatever you calculate you promise to use with extreme caution,
> >>> always flagging precisely how it is calculated.
> >>>
> >>> 3. You don't decide which model is "best" from these measures; you
> >>> just treat them as descriptive statistics.
> >>>
> >>> #1 sounds easy but can bite quite hard.
> >>> I find the idea of R^2 as
> >>>
> >>> square of correlation between observed and predicted
> >>>
> >>> as the sense of R^2 that I like best but this grows out of a long
> >>> personal history of working with correlation and regression and one
> >>> that is dominated by working with continuous outcomes. People with a
> >>> long history the other way round might want you to look for
> >>>
> >>> 1 - (log likelihood for model) / (log likelihood for same model with
> >>> only a constant term)
> >>>
> >>> and could have similar warm feelings for that. Others would find the
> >>> whole idea of looking at goodness of fit without also assessing number
> >>> of parameters or model complexity in general quite misguided, but
> >>> those others can't agree on which of various *IC you should use, and
> >>> even those who have a favourite often say, "You should use ?IC except
> >>> that it usually favours over-simplified models" or some such.
> >>>
> >>> On the first option see
> >>>
> >>> FAQ . . . . . . . . . . . . . . . . . . . . . . . Do-it-yourself R-squared
> >>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
> >>> 9/03 How can I get an R-squared value when a Stata command
> >>> does not supply one?
> >>> http://www.stata.com/support/faqs/stat/rsquared.html
> >>>
> >>> On Mon, May 14, 2012 at 2:47 PM, Joseph Padgett <josephpadgett@gmail.com> wrote:
> >>>
> >>>> I am working with a data set where students are nested within school.
> >>>>
> >>>> I have completed a thorough run of models starting with nulls and
> >>>> ending with full fixed- and random-effects with all controls and
> >>>> predictors and several models in between with various combinations of
> >>>> controls. My dependent variable is a binary outcome.
> >>>>
> >>>> I have Hausman tests, LR tests, and Wald taken care of, but I would
> >>>> like to report some goodness-of-fit results for my models.
> >>>> I am aware
> >>>> of the Hosmer-Lemeshow test statistic and its interpretation, but I'm
> >>>> having a difficult time finding out how to compute it from my model
> >>>> results. I would also like to consider alternatives such as Cox and
> >>>> Snell.
> >>>>
> >>>> I have run my models with each of xtlogit, xtmelogit, and gllamm. I
> >>>> did this mostly to be able to learn a bit about the post estimation
> >>>> commands and different options with each command. That being said, I
> >>>> don't know how to get the pseudo Rsquare measures after any of these
> >>>> and most explanations that I find refer only to the logit command and
> >>>> give examples using very simplistic models.
> >>>>
> >>>> I'm fairly certain there's something terribly obvious that I'm
> >>>> overlooking. Any help would be greatly appreciated.
> >>>
> >>> *
> >>> * For searches and help try:
> >>> * http://www.stata.com/help.cgi?search
> >>> * http://www.stata.com/support/statalist/faq
> >>> * http://www.ats.ucla.edu/stat/stata/
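
The two measures Nick Cox describes in the thread above (the squared correlation between observed and predicted, and 1 - ll(model)/ll(constant-only model)) can be sketched in a few lines of Stata. This is only an illustration under stated assumptions: it uses plain -logit- on the lbw example dataset with an arbitrary choice of covariates, not the multilevel models fit with xtlogit, xtmelogit, or gllamm that the thread is about. For those commands the same two definitions apply in principle, but the constant-only fit and the kind of prediction used would have to be chosen and reported explicitly, which this sketch does not attempt.

```stata
* Sketch only. Assumptions: the lbw example dataset and these covariates
* are chosen purely for illustration; the model is a single-level logit.
webuse lbw, clear
logit low age lwt i.smoke

* (1) Square of the correlation between observed outcome and predicted
*     probability, restricted to the estimation sample
predict double phat if e(sample), pr
quietly correlate low phat if e(sample)
display "r^2, observed vs predicted = " %6.4f (r(rho)^2)

* (2) 1 - (log likelihood for model) / (log likelihood, constant only),
*     using the results that -logit- stores in e(ll) and e(ll_0)
display "1 - ll/ll_0                = " %6.4f (1 - e(ll)/e(ll_0))

* For comparison: the pseudo R-squared that -logit- itself reports
display "e(r2_p)                    = " %6.4f (e(r2_p))
```

The last two displayed values should agree, since the pseudo R-squared that -logit- reports is defined the same way as (2). In line with the meta-rules quoted above, whichever measure is reported should be computed the same way for every model being compared, and its exact definition stated alongside it.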