Re: st: Comparison of the R-squared in a loglog and linear model
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: Comparison of the R-squared in a loglog and linear model
Date: Fri, 18 Jun 2010 11:17:25 -0400
Kit et al.--
Duan's smearing method is one approach to dealing with a logged
depvar; a better approach is to use a regression technique that
respects the functional form, like -poisson- (or another member of the
-glm- family). But you still cannot compare the R-squared across
non-nested models and hope to conclude anything about which model is
better from that information alone. Mean squared prediction error in
levels for the nonzero outcomes seems a reasonable criterion for
rejecting the log(y) regression model below.
* -levpredict- is user-written; see -findit levpredict-
use http://fmwww.bc.edu/ec-p/data/mus/mus03data, clear
* linear model in levels
qui reg totexp suppins phylim actlim totchr age female income
predict xb
* log(y) model, retransformed to levels two ways
qui reg ltotexp suppins phylim actlim totchr age female income
levpredict tenorm
levpredict teduan, duan print
* count-data models fit directly to the level outcome
qui poisson totexp suppins phylim actlim totchr age female income
predict tepois
qui nbreg totexp suppins phylim actlim totchr age female income
predict tenbreg
su totexp xb te*
su totexp xb te* if totexp>0
corr totexp xb te*
* squared prediction errors in levels, scaled by 1e6; their means are the MSEs
g mse_xb=(totexp-xb)^2/1e6
g mse_norm=(totexp-tenorm)^2/1e6
g mse_duan=(totexp-teduan)^2/1e6
g mse_pois=(totexp-tepois)^2/1e6
g mse_nbreg=(totexp-tenbreg)^2/1e6
su mse*
su mse* if totexp>0
    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
      mse_xb |      2955    127.0504    642.6503     .00005   12779.11
    mse_norm |      2955    142.4353    641.0374   3.32e-06   11744.09
    mse_duan |      2955    140.7604    644.1605   .0000549   11842.16
    mse_pois |      2955    128.3255    648.1356   4.52e-06   12841.78
   mse_nbreg |      2955    131.8694    642.3027   2.48e-06   12432.65
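For anyone who wants to see what the two retransformations of the log(y) model amount to, here is a rough by-hand sketch of the textbook formulas. It assumes the same dataset and regressors as above; the names lxb, lres, expres, smear, te_normal and te_smear are made up for illustration, and -levpredict- may differ in its details, so treat this as a sketch rather than a description of that command.

```stata
use http://fmwww.bc.edu/ec-p/data/mus/mus03data, clear
qui reg ltotexp suppins phylim actlim totchr age female income
* linear prediction and residuals on the log scale (hypothetical names)
predict double lxb, xb
predict double lres, residuals
* normal-theory retransformation: exp(xb + sigma^2/2), with sigma estimated by the RMSE
g double te_normal = exp(lxb + e(rmse)^2/2)
* Duan's smearing factor: the sample mean of the exponentiated residuals
g double expres = exp(lres)
qui su expres
scalar smear = r(mean)
* smearing retransformation: exp(xb) times the smearing factor
g double te_smear = exp(lxb)*smear
di "smearing factor = " smear
```

Because the log-scale residuals average zero in the estimation sample, the smearing factor cannot fall below 1, so exp(xb) on its own underpredicts the level of the outcome on average.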
For those enamored of scatter plots for this kind of comparison, much
more work is required to get a good picture of fit. This is one
approach:
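* cube roots pull in the long right tail so the fit is visible
g cr_te=totexp^(1/3)
* xb from the linear model can be negative, so take a signed cube root
g cr_xb=sign(xb)*abs(xb)^(1/3)
g cr_norm=tenorm^(1/3)
g cr_duan=teduan^(1/3)
g cr_pois=tepois^(1/3)
g cr_nbreg=tenbreg^(1/3)
* cr_* includes cr_te, so cr_te plotted against itself traces the 45-degree line
sc cr_* cr_te if totexp>0, msize(1 1 1 1 1 1)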
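The squared correlation in levels that Kit describes in the message quoted below can be computed along these lines. This is only a sketch: xb_lin, lxb and te_naive are made-up names, and per the caveat above these numbers alone do not settle which model is better.

```stata
use http://fmwww.bc.edu/ec-p/data/mus/mus03data, clear
qui reg totexp suppins phylim actlim totchr age female income
predict double xb_lin
qui reg ltotexp suppins phylim actlim totchr age female income
predict double lxb, xb
* exp(xb) without any retransformation factor (hypothetical name)
g double te_naive = exp(lxb)
* squared correlation between observed and predicted levels, nonzero outcomes only
foreach v in xb_lin te_naive {
    qui corr totexp `v' if totexp>0
    di "`v': squared correlation = " %6.4f r(rho)^2
}
```

A constant multiplicative factor such as the smearing factor does not change a correlation, so exp(xb) is enough for this particular statistic; the factor does matter for the mean squared error comparison.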
On Fri, Jun 18, 2010 at 9:47 AM, Christopher Baum <[email protected]> wrote:
> <>
> On Jun 18, 2010, at 2:33 AM, Natalie wrote:
>
>> Can I not maybe obtain the antilog predicted values for the log log
>> model and compute the R-squared between the antilog of the observed and
>> predicted values. And then compare this R-square with the R-square
>> obtained from OLS estimation of the linear model?
>>
>> There are other statistical programs that can do this automatically, but
>> as I work with Stata, I'd rather do it with this program.
>
>
> findit levpredict
>
> Generate the level form of the dependent variable (correctly, using this routine) and then
> compute the squared correlation between that and the original level variable. That will be the
> R^2 of the log form of the regression.