RE: st: getting realistic fitted values from a regression
From
"Nick Cox" <[email protected]>
To
<[email protected]>
Subject
RE: st: getting realistic fitted values from a regression
Date
Fri, 23 Jul 2010 16:06:06 +0100
Thanks for the commendation.
It is easy enough to try the -glm- approach _and_ other fixes and to
compare results.
I have found that they give very similar answers in practice. What all
can agree on is that some kind of fix is needed when your real interest
is predicting on the original scale and a log scale -- or indeed any
other nonlinear transform or link -- was used for the response in
modelling.
Nick
[email protected]
David Jacobs
Maarten states the received wisdom on this issue, but in the
econometrics text authored by Jeffrey Wooldridge (Introductory
Econometrics, Thomson South-Western, 2003), on pp. 208-9, Wooldridge
suggests a way to obtain unlogged predictions from a regression in
which the regressand is in log form (there have been subsequent
editions of this book but the page numbers I give will be close in
those newer editions). If one of the statistical experts on this
list is familiar with this approach or is willing to look it up, I'd
be interested in their reaction.
That said, I wholeheartedly agree with Maarten's recommendation. I
found the article he suggests by Cox et al. to be extremely useful
and I'm grateful to him for suggesting it on another occasion.
David Jacobs
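For readers without the book to hand, the kind of correction David refers to is usually presented along the following lines. This is a sketch of the standard textbook argument, not a quotation from Wooldridge, and it assumes that the error u is independent of the regressors; the symbols (x, \beta, u, \sigma, \hat\alpha) are notation chosen here for illustration.

```latex
% Model on the log scale, with u independent of x and E(u) = 0:
\log y = x\beta + u
\quad\Longrightarrow\quad
E(y \mid x) = \exp(x\beta)\, E[\exp(u)] .

% exp() is convex, so by Jensen's inequality E[\exp(u)] \ge \exp(E[u]) = 1, hence
\exp\{E(\log y \mid x)\} = \exp(x\beta) \le E(y \mid x) .

% If in addition u \sim N(0, \sigma^2), then E[\exp(u)] = \exp(\sigma^2/2), giving
\hat{y} = \exp(\hat\sigma^2/2)\, \exp(\widehat{\log y}) .

% Without assuming normality, E[\exp(u)] can be estimated from the residuals \hat u_i:
\hat\alpha = \frac{1}{n} \sum_{i=1}^{n} \exp(\hat u_i),
\qquad
\hat{y} = \hat\alpha\, \exp(\widehat{\log y}) .
```

The residual-based factor is often called a smearing estimate. Either way, the naive antilog of the fitted values is scaled up by a factor of at least 1.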
At 03:08 AM 7/22/2010, you wrote:
>--- On Wed, 21/7/10, Woolton Lee wrote:
> > I have estimated a regression (OLS) using log of patient
> > travel distance to a hospital predicted by patient, hospital
> > and area characteristics. I am going to report the results
> > as marginal effects that I've computed by obtaining
> > predictions from my estimated regression computed by fixing
> > some variables and keeping others at their original values.
> > However after I compute the predictions, I am getting
> > unrealistically large numbers. When I examined the regression
> > residuals it looks as though the obs with unrealistic fitted
> > values have larger residuals. Is there a way to adjust the
> > regression to better account for this problem?
>
>If you want to predict the travel distance you should use
>-glm- with -link(log)- option rather than use -regress- on
>a log transformed dependent variable. The difference is that
>with the former you are modeling log(E(y)), while in the latter
>you are modeling E(log(y)). If you want to back-transform your
>predictions using the antilog transformation you will get
>exp(log(E(y))) = E(y) for the -glm- command, while after -regress-
>you get exp(E(log(y))) != E(y). A nice discussion on this issue
>can be found in:
>
>Nicholas J. Cox, Jeff Warburton, Alona Armstrong, Victoria J. Holliday
>(2007) "Fitting concentration and load rating curves with generalized
>linear models" Earth Surface Processes and Landforms, 33(1):25--39.
><http://www3.interscience.wiley.com/journal/114281617/abstract>
>
>There exist approximations you can use after -regress- to fix
>this problem, but why try to fix a problem if you can easily prevent
>it?
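The gap between exp(E(log(y))) and E(y) is easy to see in a small simulation. The sketch below is in Python rather than Stata and uses only the standard library. The data-generating model, the parameter values and all variable names are assumptions made up for illustration; they are not taken from the travel-distance data discussed in this thread. It fits a one-predictor least-squares regression to log(y), back-transforms the fitted values naively, and then applies two of the approximate fixes mentioned above.

```python
import math
import random
import statistics

# Hypothetical data: log(y) = a + b*x + u, with u ~ N(0, sigma^2) independent of x.
rng = random.Random(12345)
n, a, b, sigma = 20000, 1.0, 0.5, 1.0
x = [rng.uniform(0.0, 2.0) for _ in range(n)]
y = [math.exp(a + b * xi + rng.gauss(0.0, sigma)) for xi in x]
logy = [math.log(v) for v in y]

# Ordinary least squares of log(y) on x, using the closed form for one predictor.
mean_x = statistics.fmean(x)
mean_logy = statistics.fmean(logy)
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (li - mean_logy) for xi, li in zip(x, logy))
b_hat = sxy / sxx
a_hat = mean_logy - b_hat * mean_x

fitted = [a_hat + b_hat * xi for xi in x]
resid = [li - fi for li, fi in zip(logy, fitted)]

# Naive back-transformation: estimates exp(E(log y)), not E(y).
naive = [math.exp(fi) for fi in fitted]

# Fix 1: residual-based (smearing) factor, the mean of exp(residual).
smear = statistics.fmean([math.exp(r) for r in resid])
smeared = [smear * v for v in naive]

# Fix 2: normal-theory factor exp(s^2 / 2), with s^2 the residual variance.
s2 = sum(r * r for r in resid) / (n - 2)
normal_factor = math.exp(s2 / 2.0)

mean_y = statistics.fmean(y)
mean_naive = statistics.fmean(naive)
mean_smeared = statistics.fmean(smeared)

print(f"mean of observed y:            {mean_y:.3f}")
print(f"mean of naive exp(fitted):     {mean_naive:.3f}")
print(f"mean of smearing-corrected:    {mean_smeared:.3f}")
print(f"smearing factor:               {smear:.3f}")
print(f"normal-theory factor:          {normal_factor:.3f}")
```

In this setup the naive predictions average well below the observed mean, while the corrected ones land close to it. The simulation does not fit a log-link GLM, so it illustrates only the after-the-fact fixes, not the -glm- route recommended above.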