Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: OLS assumptions not met: transformation, gls, or glm as solutions?
From: David Hoaglin <[email protected]>
To: [email protected]
Subject: Re: st: OLS assumptions not met: transformation, gls, or glm as solutions?
Date: Tue, 18 Dec 2012 17:49:26 -0500
> @ Maarten & David:
> About linearity: as independent variables, I mainly have categorical
> variables. So - scatter y x- or -graph matrix y x x- does not help
> much, because the cases are only on the lines for 0 and 1. How can I
> see whether I have a linear relationship between y and x, if x is
> categorical?
If the predictors are categorical, the focus in a discussion of
transformations shifts to promoting additivity of the contributions of
those predictors. Ideally, a model will have main effects for those
predictors and no interactions among them.
If the categorical predictors have more than 2 categories, it is
easier to derive information from the data that helps in choosing a
transformation that removes or reduces the nonadditivity.
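A small numerical sketch may make "promoting additivity" concrete. It is written in Python rather than Stata, and the 2 x 3 table of cell means is made up for illustration (the row and column effects are assumptions, not data from this thread). The cells are built multiplicatively, so the additive fit (row effect + column effect) leaves interaction residuals on the raw scale, while on the log scale those residuals vanish up to rounding.

```python
import math

# Hypothetical cell means for a 2 x 3 layout (two categorical predictors).
# They are built multiplicatively: row effect * column effect.
row_effects = [10.0, 20.0]
col_effects = [1.0, 2.0, 4.0]
cells = [[r * c for c in col_effects] for r in row_effects]

def interaction_residuals(table):
    """Residuals from the additive fit: cell - row mean - column mean + grand mean."""
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    return [[table[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_cols)] for i in range(n_rows)]

def max_abs(table):
    """Largest absolute entry of a table (list of lists)."""
    return max(abs(v) for row in table for v in row)

# Nonadditivity on the raw scale versus the log scale.
raw_nonadditivity = max_abs(interaction_residuals(cells))
log_cells = [[math.log(v) for v in row] for row in cells]
log_nonadditivity = max_abs(interaction_residuals(log_cells))

print(f"largest interaction residual, raw scale: {raw_nonadditivity:.3f}")
print(f"largest interaction residual, log scale: {log_nonadditivity:.3f}")
```

With real data the residuals would not vanish exactly; the point is only that one compares the size of the interaction residuals across candidate transformations and prefers the scale on which they are smallest.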
> @ David:
> Yes, I think about transformation, and will read again about
> interpretation. Still, just having minutes to interpret would be
> easier, also for readers who are not so familiar with
> transformation. Also, I am not sure whether OLS with transformed
> dependent variable, or -glm- without transformed variable would be
> better.
As others have suggested, in a GLM a suitable choice of link function
may allow you to avoid transforming the dependent variable. But the
link function simply relates the conditional expectation of the
dependent variable to the linear component of the model. The random
component of the model handles the other features of the conditional
distribution. If the random component of the data (which will show up
in the residuals) is skewed, a distribution other than the normal is
indicated. The choice of link function would then focus on the
systematic structure in the data (perhaps additive).
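The difference between a link function and a transformed dependent variable can be seen in a tiny numerical sketch (Python, with made-up right-skewed values standing in for minutes within one cell; the numbers are an assumption for illustration only). A log link models the log of the conditional mean, whereas OLS on the logged variable models the mean of the logs; back-transforming the latter gives the geometric mean, which for skewed data lies below the arithmetic mean.

```python
import math
import statistics

# Hypothetical right-skewed responses (e.g. minutes) within one cell of the design.
y = [5.0, 8.0, 10.0, 15.0, 60.0]

# What a log link targets: the log of the conditional mean, log(E[y]).
log_of_mean = math.log(statistics.fmean(y))

# What OLS on log(y) targets: the conditional mean of the logs, E[log(y)].
mean_of_logs = statistics.fmean([math.log(v) for v in y])

# Back-transforming each quantity to the original scale (minutes).
arithmetic_mean = math.exp(log_of_mean)   # recovers the ordinary mean of y
geometric_mean = math.exp(mean_of_logs)   # smaller than the mean when y varies

print(f"log(E[y]) = {log_of_mean:.3f}, back-transformed: {arithmetic_mean:.2f}")
print(f"E[log y]  = {mean_of_logs:.3f}, back-transformed: {geometric_mean:.2f}")
```

So if results are to be reported as mean minutes, a log link keeps the interpretation on that scale, while the transformed-variable route describes a geometric mean unless a retransformation adjustment is applied.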
David Hoaglin