Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Non-linear regression: interpretation
From: Daniel Feenberg <[email protected]>
To: [email protected]
Subject: Re: st: Non-linear regression: interpretation
Date: Tue, 8 Feb 2011 18:56:37 -0500 (EST)
On Tue, 8 Feb 2011, David Greenberg wrote:
> It is true that the quadratic term taken by itself can be hard to
> interpret. If the linear term is also in the equation, the coefficient
> for the quadratic term would seem to be an answer to a question that
> cannot have a meaningful answer, namely, how much the dependent variable
> changes in response to a marginal change in the quadratic term, while
> holding the linear term constant. But it is impossible to hold x
> constant and allow x-squared to vary. However, the estimated
> coefficients of the linear and quadratic terms together can be used to
> compute the estimated point at which the quadratic equation has a
> minimum or maximum, and that is something many researchers might want to
> know. One can also compute the value of the dependent variable at the
> minimum or maximum.
> David Greenberg, Sociology Department, New York University
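As a rough illustration of the turning-point calculation described above, here is a minimal Stata sketch. It assumes the auto dataset that ships with Stata and uses price and mpg purely as stand-ins; neither the variables nor the use of nlcom comes from the original message. For y = b0 + b1*x + b2*x^2 the turning point is at x = -b1/(2*b2), and the fitted value there is b0 - b1^2/(4*b2).

```stata
// ASSUMPTION: auto data, price and mpg are placeholder variables
sysuse auto, clear

// quadratic in mpg using factor-variable notation
regress price c.mpg##c.mpg

// estimated turning point x* = -b1/(2*b2), with a delta-method
// standard error from nlcom
nlcom (turn: -_b[mpg] / (2*_b[c.mpg#c.mpg]))

// fitted value of the dependent variable at the turning point:
// b0 + b1*x* + b2*x*^2 simplifies to b0 - b1^2/(4*b2)
nlcom (y_at_turn: _b[_cons] - _b[mpg]^2 / (4*_b[c.mpg#c.mpg]))

// check whether the turning point lies inside the observed range
summarize mpg
```

It is probably worth comparing the estimated turning point with the range reported by summarize: a turning point outside the data is an extrapolation and may not mean much.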
If one takes the squared term about the mean of the variable, it
contributes nothing at the mean, leaving the linear term alone describing
the effect of changes in the variable about the mean. That can make quick
interpretations of the coefficients possible. For example, if the mean of x
is 7, then define
xx = (x-7)**2
instead of using x**2. This won't change any predictions, or the
coefficient and t-stat on the squared term, but the slope dy/dx at x=7
will just be the coefficient on the linear term for x - no need to fuss
with calculating the contribution of the squared term.
Daniel Feenberg
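The centering suggestion above can be checked with a short Stata sketch. This is only an illustration under assumptions: it uses the auto dataset with price and mpg as stand-ins, and the variable names mpg_sq and mpg_csq are made up for the example. It fits the raw and the centered quadratic, compares the predictions, and compares the slope at the mean computed from the raw model (b1 + 2*b2*mean) with the linear coefficient of the centered model.

```stata
// ASSUMPTION: auto data; mpg_sq and mpg_csq are hypothetical names
sysuse auto, clear
summarize mpg, meanonly
local m = r(mean)
local k = 2 * `m'

generate double mpg_sq  = mpg^2            // raw squared term
generate double mpg_csq = (mpg - `m')^2    // squared term about the mean

// raw quadratic: slope at the mean is b1 + 2*b2*mean
regress price mpg mpg_sq
predict double yhat_raw
lincom mpg + `k'*mpg_sq
scalar slope_raw = _b[mpg] + `k'*_b[mpg_sq]

// centered quadratic: the coefficient on mpg is that slope directly
regress price mpg mpg_csq
predict double yhat_ctr
display "slope at mean, raw model:    " slope_raw
display "linear coef, centered model: " _b[mpg]

// the two models give the same predictions up to rounding
assert reldif(yhat_raw, yhat_ctr) < 1e-6
```

Comparing the two regression tables should also show that the row for the squared term is the same in both, while the row for the linear term differs because it now answers a different question.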
----- Original Message -----
From: Maarten Buis <[email protected]>
Date: Tuesday, February 8, 2011 4:55 am
Subject: Re: st: Non-linear regression
To: [email protected]
--- On Tue, 8/2/11, Hamizah Hassan wrote:
> I would like to run non-linear regression by including the
> linear and quadratic functions of the variable.
Typically this is still referred to as a linear model, as the
model is still linear in the parameters.
> I just realized that if the variable is in percentages, the
> quadratic figure is higher than the linear figure. However,
> if it is in decimals, it is the other way around, and
> this will definitely affect the meaning of the results.
The models are mathematically equivalent. You can see that
by looking at the predictions.
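To see the equivalence in the coefficients as well, here is a minimal sketch, assuming the auto dataset and treating mpg as if it were measured in percent; x_pct and x_dec are hypothetical names introduced only for this illustration. Dividing a variable by 100 multiplies its linear coefficient by 100 and its quadratic coefficient by 100^2, which is why the relative size of the two "figures" flips with the scale while the fitted model stays the same.

```stata
// ASSUMPTION: auto data; x_pct and x_dec are hypothetical names
sysuse auto, clear
generate double x_pct = mpg          // variable on a "percent" scale
generate double x_dec = mpg / 100    // same variable on a "decimal" scale

regress price c.x_pct##c.x_pct
scalar b1_pct = _b[x_pct]
scalar b2_pct = _b[c.x_pct#c.x_pct]

regress price c.x_dec##c.x_dec
// ratio of linear coefficients: 100 up to rounding
display _b[x_dec] / b1_pct
// ratio of quadratic coefficients: 100^2 = 10000 up to rounding
display _b[c.x_dec#c.x_dec] / b2_pct
```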
Generally, it is hard to give a substantive interpretation to
a quadratic term, regardless of how you scaled the original
variable. If you care about interpreting the coefficients but
still want to allow for non-linear effects, then your best
bet is probably to use linear splines (which, confusingly, are
actually a non-linear function...)
Consider the example below. The first part shows that the
two quadratic models result in the same predicted values. The
final part displays linear splines as an alternative. The final
graph shows that they result in fairly similar predictions, but
the spline terms can actually be interpreted: the parameter for
fuel_cons1 tells you that for cars with a fuel-consumption of
less than 12 liters/100km an additional liter/100km leads to a
non-significant price increase of 62$ (=.062*1000$). The
parameter for fuel_cons2 tells you that for cars with a fuel
consumption of more than 12 liters/100km an additional liter
per 100 kilometers will lead to a significant price increase of
1011$ (=1.011*1000$).
*----------------- begin example -----------------
//================================== first part
sysuse auto, clear
// since I am European and the question is about
// interpretation I first convert mpg from miles
// per gallon to liter / 100 km and price in
// 1000 $
gen fuel_cons = 1/mpg * 3.78541178 / 1.609344 *100
label var fuel_cons "fuel consumption (l/100km)"
replace price = price / 1000
label var price "price (1000$)"
// create a "proportion-like" variable
sum fuel_cons , meanonly
gen prop = ( fuel_cons - r(min) ) / ( r(max) - r(min) )
// take a look at that new variable
spikeplot prop, ylab(0 1 2)
// turn it into percentages
gen perc = prop*100
spikeplot perc, ylab(0 1 2)
// add square terms using the new
// factor variable notation
reg price c.prop##c.prop
predict yhat_prop
reg price c.perc##c.perc
predict yhat_perc
// compare predicted values
// (reference line drawn over the range of the predictions)
twoway function identity = x, ///
    range( yhat_perc ) lcolor(gs8) || ///
    scatter yhat_prop yhat_perc, ///
    aspect(1) msymbol(Oh)
//================================== final part
// alternative with interpretable parameters
// create splines
mkspline fuel_cons1 12 fuel_cons2 = fuel_cons
reg price fuel_cons1 fuel_cons2
predict yhat_spline
twoway scatter price fuel_cons || ///
    line yhat_prop yhat_spline fuel_cons, ///
    sort ytitle("price (1000 {c S|})") ///
    legend(order( 1 "observations" ///
                  2 "prediction," ///
                    "quadratic" ///
                  3 "prediction," ///
                    "spline" ))
*---------------- end example --------------
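A possible variation on the spline part of the example, offered only as a sketch: mkspline's marginal option parameterizes the second spline term as the change in slope at the knot, so its t-test asks directly whether the slope differs above and below 12 liters/100km. The names fc_lo and fc_hi are made up for this illustration; the knot at 12 and the unit conversions are taken from the example above.

```stata
// ASSUMPTION: same data preparation as the example; fc_lo and fc_hi
// are hypothetical names
sysuse auto, clear
generate fuel_cons = 1/mpg * 3.78541178 / 1.609344 * 100
replace price = price / 1000

// with -marginal-, the coefficient on fc_hi is the CHANGE in slope
// at the knot, not the slope above the knot
mkspline fc_lo 12 fc_hi = fuel_cons, marginal
regress price fc_lo fc_hi

// slope above the knot = slope below the knot + change in slope
lincom fc_lo + fc_hi
```

The lincom result should reproduce the coefficient on fuel_cons2 from the example, since both describe the slope above the knot; only the parameterization differs.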
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )
Hope this helps,
Maarten
--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany
http://www.maartenbuis.nl
--------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/