Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: multiple regression, r squared and normality of residuals
From: Richard Goldstein <[email protected]>
To: [email protected]
Subject: Re: st: multiple regression, r squared and normality of residuals
Date: Wed, 23 Mar 2011 08:43:52 -0400
I see several issues here, which I touch on before giving an
"answer" to the R-squared question:
1. An outcome variable and a (non-linearly) transformed version of that
outcome (e.g., its log) cannot be compared on R-squared unless you use a
special version of R-squared (see point 5).
2. In general, R-squareds computed on different Ns are not comparable.
3. Why did N drop? If real zeros became missing after the log transform,
you have added problems.
4. If the residuals are normally distributed without the transform, why
transform? (Certain answers to this question would point you to -glm-
with a log link.)
5. If you really want to compare R-squared values for different versions
of the "same" outcome, there are ways to do it; as the (co-)author of at
least two of these, I recommend -brsq- (use -findit brsq- to find the
program).
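
Points 1 to 3 can be illustrated with a rough sketch. The Python below is
only an illustration on made-up data (not the poster's data), with a single
hypothetical predictor x. It fits the raw and the logged outcome on the same
sample, then recomputes R-squared for the log model after back-transforming
its fitted values to the original scale. This is one simple way to put the
two models on a common footing; it is not claimed to be what -brsq-
computes, and the naive exp() back-transform ignores retransformation bias.

```python
import math

def ols_fit(x, y):
    """Least-squares fit of y on a single predictor x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(y, fitted):
    """1 - SSE/SST, computed on whatever scale y and fitted are expressed in."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst

# Made-up illustrative data; note the two zeros in the outcome.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0.0, 1.2, 0.0, 2.9, 4.1, 7.8, 12.5, 19.0, 33.1, 52.4]

# Point 3: log(y) is undefined for y <= 0, so those observations are lost
# once the outcome is logged (in Stata, ln() of such values is missing).
kept = [(xi, yi) for xi, yi in zip(x, y) if yi > 0]
n_dropped = len(x) - len(kept)
xs = [xi for xi, _ in kept]
ys = [yi for _, yi in kept]

# Point 2: fit both models on the SAME observations so N is identical.
a_lin, b_lin = ols_fit(xs, ys)
r2_lin = r_squared(ys, [a_lin + b_lin * xi for xi in xs])

ln_ys = [math.log(yi) for yi in ys]
a_log, b_log = ols_fit(xs, ln_ys)
fitted_log = [a_log + b_log * xi for xi in xs]

# R-squared on the log scale: not comparable with r2_lin (point 1).
r2_log_scale = r_squared(ln_ys, fitted_log)

# R-squared for the log model on the ORIGINAL scale of y: comparable with r2_lin.
r2_log_backtransformed = r_squared(ys, [math.exp(f) for f in fitted_log])

print("observations dropped by the log:", n_dropped)
print("R2, linear model, original scale:", round(r2_lin, 3))
print("R2, log model, log scale:", round(r2_log_scale, 3))
print("R2, log model, back-transformed:", round(r2_log_backtransformed, 3))
```

The comparison that matters is between the first and last R-squared values,
both measured against the untransformed outcome on the same observations.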
Rich
On 3/22/11 9:11 PM, Arti Pandey wrote:
>
> Hello
>
> I ran a multiple regression in Stata using two models.
> The first gave an R-squared of .35; the p-values of all predictors were less
> than 0.001 except one, which was less than 0.05. The number of observations
> used was 84, and the distribution of residuals was normal.
> Then I log-transformed the dependent variable. R-squared went up to .65;
> the p-values for all predictors were 0.001 except the one mentioned above,
> which is now 0.06. The residuals were also slightly skewed to the left, and
> the number of observations went down to 77.
> My question is how to decide between the R-squared and the distribution of
> residuals. Is such a large rise in R-squared worth sacrificing observations
> and normally distributed residuals for? Since the skew is not very pronounced,
> it is tempting to go with the second model, but then the regression model
> might be wrong.
> Appreciate any help.
> Arti