Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Interpreting coefficients of (logX)^2 variable in pooled OLS regression [SEC=UNOFFICIAL]
From: David Hoaglin <[email protected]>
To: [email protected]
Subject: Re: st: Interpreting coefficients of (logX)^2 variable in pooled OLS regression [SEC=UNOFFICIAL]
Date: Fri, 24 May 2013 22:11:04 -0400

Hi, Lucy.
Apart from reporting that the coefficient of ldistsq is positive, I
wonder whether it is necessary to give a separate interpretation to
that coefficient. The relation is between lfare and a function of
ldist (after adjusting for differences among the years), and that
function involves both a quadratic term and a linear term. That is,
the quadratic term and the linear term should be taken as a unit. It
may be helpful to plot the fitted curves relating fare and dist (i.e.,
transform back from the log scale to the data scale), a separate curve
for each year.
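
To see why the two terms work as a unit: with lfare = b0 + b1*ldist + b2*ldist^2, the elasticity of fare with respect to distance is b1 + 2*b2*ldist, so it changes with distance and neither coefficient describes it alone. A minimal sketch of that arithmetic follows, written in Python rather than Stata and using only the coefficients from the output quoted below. The function names and the "base" label for the omitted year are my own, not from the original post.

```python
import math

# Coefficients copied from the regress output quoted in this thread.
B_CONS = 6.239633
B_LDIST = -0.783627
B_LDISTSQ = 0.0897726
# Year shifts on the log scale; "base" is the omitted year dummy.
YEAR_SHIFT = {"base": 0.0, "y98": 0.024341, "y99": 0.0350861, "y00": 0.0959191}


def elasticity(dist):
    """Derivative of lfare with respect to ldist at a given distance."""
    return B_LDIST + 2.0 * B_LDISTSQ * math.log(dist)


def turning_point():
    """Distance at which the fitted elasticity equals zero."""
    return math.exp(-B_LDIST / (2.0 * B_LDISTSQ))


def fitted_fare(dist, year="base"):
    """Back-transform the fitted log fare to the fare scale.

    exp() of a fitted log value is not the fitted mean fare; no
    retransformation adjustment is applied in this sketch.
    """
    ldist = math.log(dist)
    lfare = B_CONS + B_LDIST * ldist + B_LDISTSQ * ldist ** 2 + YEAR_SHIFT[year]
    return math.exp(lfare)


if __name__ == "__main__":
    for d in (250, 500, 1000, 2000):
        print(d, round(elasticity(d), 3), round(fitted_fare(d, "y00"), 1))
    print("elasticity is zero near dist =", round(turning_point(), 1))
```

With these estimates the implied elasticity is roughly 0.33 at a distance of 500 and roughly 0.46 at 1000 (in the units of dist), and it crosses zero near a distance of 79. Whether the negative coefficient of ldist matters in practice therefore depends on whether the data include distances that short. Evaluating fitted_fare over a grid of distances, once per year, gives the back-transformed curves suggested above.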
I wonder whether the relation of lfare to ldist is truly quadratic.
You have enough data to try a piecewise-constant model: Split the
range of ldist into disjoint intervals, each containing, say, at least
50 observations, choose one of those intervals as the reference
category, and create an indicator variable for each of the other
intervals. Then use that set of indicator variables as the
predictors, instead of ldist and ldistsq. If you then plot the
coefficient of each indicator variable against the value of ldist at
the midpoint of its interval, you can get a good impression of the
shape of the nonlinearity. It might, for example, resemble a linear
spline.
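
The piecewise-constant idea can be sketched as follows. This is an illustration in Python with NumPy, not Stata code and not a definitive implementation: the function name, the choice of equal-count (quantile) intervals, and the `controls` argument (where year dummies such as y98, y99, y00 would go) are all assumptions of mine. The lowest interval serves as the reference category.

```python
import numpy as np


def piecewise_constant_fit(x, y, n_bins, controls=None, min_count=50):
    """Regress y on indicators for intervals of x (lowest interval = reference).

    Returns the interval midpoints and indicator coefficients for the
    non-reference intervals, ready to be plotted against each other.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Quantile cut points give intervals with roughly equal counts.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    # Interval index 0..n_bins-1 for each observation, using interior edges.
    bins = np.searchsorted(edges[1:-1], x, side="right")

    counts = np.bincount(bins, minlength=n_bins)
    if counts.min() < min_count:
        raise ValueError("an interval has fewer than min_count observations")

    # Design matrix: intercept, one indicator per non-reference interval,
    # then any control columns (for example, year dummies).
    columns = [np.ones_like(x)]
    columns += [(bins == k).astype(float) for k in range(1, n_bins)]
    if controls is not None:
        columns.append(np.asarray(controls, dtype=float).reshape(len(x), -1))
    design = np.column_stack(columns)

    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    midpoints = (edges[:-1] + edges[1:]) / 2.0
    return midpoints[1:], coef[1:n_bins]
```

Plotting the returned coefficients against the returned midpoints gives the picture described above. Heavily tied values of x can make some quantile intervals empty, in which case the function raises an error and fewer intervals should be used.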
Also, what do you see when you look at the relation of lfare to ldist
for each year separately? Would it be helpful to include interactions
with year in your model?
David Hoaglin
On Fri, May 24, 2013 at 9:08 PM, DU,Lucy <[email protected]> wrote:
> Unofficial
> Hi All
>
> I've been working on this research question using a panel data set and am having difficulties interpreting my Stata output.
>
> Dataset: airfare.dta available on http://www.stata.com/texts/eacsap/.
>
> Research question: How certain key variables affect airfares in the U.S. market. In the near future
>
> I ran a pooled OLS regression: regress lfare ldist ldistsq y98 y99 y00
>
> Where:
> lfare - log transformed airfare variable
> ldist - log transformed distance variable
> ldistsq - (ldist)^2
> y98, y99, y00 - year dummy variables
>
> I understand how to interpret coefficients under a log-log transformed model, and coefficients where it's a quadratic model, but when it's a quadratic log transformed variable I'm completely stuck!
>
> My output is as follows:
>
> . regress lfare ldist ldistsq y98 y99 y00
>
>       Source |       SS       df       MS              Number of obs =    4596
> -------------+------------------------------           F(  5,  4590) =  581.09
>        Model |  339.211826     5  67.8423653           Prob > F      =  0.0000
>     Residual |  535.882547  4590   .11675001           R-squared     =  0.3876
> -------------+------------------------------           Adj R-squared =  0.3870
>        Total |  875.094374  4595  .190444913           Root MSE      =  .34169
>
> ------------------------------------------------------------------------------
>        lfare |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>        ldist |   -.783627   .1298635    -6.03   0.000    -1.038222   -.5290322
>      ldistsq |   .0897726   .0098112     9.15   0.000     .0705379    .1090072
>          y98 |    .024341   .0142555     1.71   0.088    -.0036067    .0522887
>          y99 |   .0350861   .0142555     2.46   0.014     .0071384    .0630338
>          y00 |   .0959191   .0142555     6.73   0.000     .0679714    .1238668
>        _cons |   6.239633   .4270934    14.61   0.000     5.402325    7.076942
> ------------------------------------------------------------------------------
>
> Can someone explain how I interpret the coefficients for ldist and ldistsq?