Statalist



RE: st: RE: How to compare performance (goodness-of-fit) of very different modelling approaches?


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: How to compare performance (goodness-of-fit) of very different modelling approaches?
Date   Fri, 16 Jan 2009 17:57:16 -0000

Weird distributions are quite possible, but likely to be a secondary
issue. 

First, I take it as standard that some kind of scaling, standardisation
and/or transformation may be a good idea for the residuals, and perhaps
also the fitted (predicted) response. 

Second, the key idea with residual plots (shorthand here for f(residual)
vs g(fitted), where f() and g() are functions, possibly the identity
function) is whether there is systematic structure not otherwise
explicable as a consequence of what you are doing. 

Third, do different models produce similar plots or different ones? 

Fourth, if you smooth, i.e. get nice_mean(f(residuals) | g(fitted)), is
there any structure in that smooth or is it flat with random wiggles? 
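
The fourth point can be sketched in a few lines. This is only an illustration under stated assumptions: plain Python rather than Stata, a simple running mean standing in for nice_mean(), f() and g() both the identity, and toy data (a straight line fitted to a quadratic response) invented for the purpose. The function name running_mean and the window width are hypothetical choices.

```python
def running_mean(fitted, residuals, window=3):
    """Mean residual in a moving window of points ordered by fitted value.

    A crude stand-in for any smoother of residual given fitted.
    """
    pairs = sorted(zip(fitted, residuals))
    half = window // 2
    smooth = []
    for i in range(len(pairs)):
        lo, hi = max(0, i - half), min(len(pairs), i + half + 1)
        block = [res for _, res in pairs[lo:hi]]
        smooth.append((pairs[i][0], sum(block) / len(block)))
    return smooth

# Toy data (assumption): the response is x squared at x = 0, ..., 10.
x = list(range(11))
observed = [float(v * v) for v in x]
# Least-squares straight line for these particular data: 10x - 15.
fitted = [10.0 * v - 15.0 for v in x]
residuals = [o - f for o, f in zip(observed, fitted)]

smooth = running_mean(fitted, residuals)
for fit, mean_res in smooth:
    print(f"fitted {fit:6.1f}   smoothed residual {mean_res:7.2f}")
```

Here the smooth is positive at both ends and negative in the middle, which is systematic structure (curvature the straight line missed) rather than a flat trace with random wiggles.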

Weirdness, however you define that statistically, comes lower down the
list. 

A different thought is that just plotting some measure of goodness of
fit vs some measure of complexity might give you a handle for comparing
models. 
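
As a rough sketch of that last thought, the pairs to plot could be assembled as below. Everything specific here is an assumption made for illustration: root mean square error as the measure of fit, number of parameters as the measure of complexity, and toy data in which constant, straight-line and quadratic least-squares fits are applied to y = x squared at x = 0, ..., 4. None of it comes from the models discussed in this thread.

```python
from math import sqrt

def rmse(observed, fitted):
    """Root mean square error of fitted values against observed values."""
    return sqrt(sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed))

# Toy data (assumption): y = x squared at x = 0, 1, 2, 3, 4.
observed = [0.0, 1.0, 4.0, 9.0, 16.0]

# (label, number of parameters, least-squares fitted values for the toy data)
models = [
    ("constant", 1, [6.0] * 5),
    ("straight line", 2, [-2.0, 2.0, 6.0, 10.0, 14.0]),
    ("quadratic", 3, [0.0, 1.0, 4.0, 9.0, 16.0]),
]

# One row per model, ordered by complexity, ready to plot fit against complexity.
table = sorted((k, rmse(observed, fitted), name) for name, k, fitted in models)
for k, err, name in table:
    print(f"{k:>2} parameters   RMSE {err:6.3f}   {name}")
```

Plotting the second column against the first shows how much each extra parameter buys; any other measure of fit or of complexity could be substituted in the same layout.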

Nick 
[email protected] 

Eva Poen

Thank you very much, Nick, for your excellent comments. The bias and
the number of parameters are exactly what concerns me. And thanks for
the reference (shame on me for not finding it in the faq myself!). I
have to admit that I slightly struggle with the idea of 'residuals' in
these models. Due to the boundedness of the outcome variable and the
complicated structure of the models, the technical residuals are going
to have some very weird distribution. But I'll give it a try.

2009/1/15 Nick Cox <[email protected]>:

> An objection in principle to correlation, meaning correlation between
> observed and fitted response, is that it ignores bias. Thus a model that
> always predicted 2y for response y would yield a correlation of 1,
> ignoring the bias. That is one reason for preferring to use concordance
> correlation, as implemented e.g. in -concord- from the SJ.
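
The 2y example above can be checked numerically. The sketch below is in plain Python rather than Stata and assumes that the concordance correlation meant is Lin's coefficient, 2 s_xy / (s_xx + s_yy + (mean_x - mean_y)^2) with moments divided by n; the function names and the five data values are invented for illustration.

```python
from statistics import fmean

def pearson(x, y):
    """Ordinary (Pearson) correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def concordance(x, y):
    """Lin's concordance correlation (assumed definition, moments over n)."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    # The squared difference in means penalises bias; Pearson has no such term.
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)

observed = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy response values (assumption)
fitted = [2 * v for v in observed]     # a model that always predicts 2y

r = pearson(observed, fitted)
ccc = concordance(observed, fitted)
print(f"correlation {r:.3f}, concordance correlation {ccc:.3f}")
```

For these values the correlation is 1 while the concordance correlation is 8/19, about 0.42, because the predictions are off in both location and scale.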
>
> In practice in problems like these gross bias appears rare and when it
> occurs attributable to some major programming error. So correlation (and
> indeed its square) remains fairly attractive. There is a nice paper by
> Zheng and Agresti pointing out its simple virtues. The reference, and
> some other comments, are in
>
> http://www.stata.com/support/faqs/stat/rsquared.html
>
> Another objection to correlation is naturally that it takes no account
> of model complexity and so we have various criteria that penalise for
> the number of parameters (even though complexity has more dimensions
> than that). You don't state your precise objections, but I often
> encounter statements of the form "?IC is known to favour models that are
> too complicated" (or "too simple"), but less often see explanations of
> what equally objective criteria tell the researcher that is so, or
> explanations of why researchers use criteria they believe to be
> systematically flawed.
>
> My main positive suggestion is to suggest adding a graphical dimension
> to model assessment. I take it that your response is essentially
> continuous, in which case the best single kind of graph I suggest to be
> a graph of residual versus fitted. Smoothing that in some way can spot
> systematic structure missed by the model. Even if the models look
> equally good (or bad) you will then have the interesting task of
> discussing which model makes most scientific (in your case economic)
> sense.
>
> Nick
> [email protected]
>
> Eva Poen
>
> currently I am working on slightly complicated mixture models for my
> data. My outcome variable is bounded between 0 and 20, and has mass at
> either end of the interval. Whether or not I analyse the data on the
> original [0,20] scale or a transformation to [0,1] (fractions) does
> not make any difference to me.
>
> My question concerns the goodness of fit. I would like to compare the
> goodness of fit of the complicated finite mixture model to much simpler
> models, e.g. the tobit model, the glm model, and a hurdle
> specification. Since the likelihood values of these models differ
> substantially, likelihood based measures such as BIC appear to be
> inadequate for the purpose. Also, measures that compare the model
> likelihood of the fitted model to the null likelihood ("pseudo r2")
> are difficult since I can calculate them for the tobit and glm models,
> but not for the mixture model, as it is unclear what the null model
> would be.
>
> So far I have been looking at crude measures like correlation between
> predicted outcome and actual outcome, but I feel that this is
> inadequate, especially since the outcome variable is bounded. I'd be
> grateful for hints and comments. I am working with Stata 9.2.



