Re: st: OLS assumptions not met: transformation, gls, or glm as solutions?
From: Jeph Herrin <[email protected]>
To: [email protected]
Subject: Re: st: OLS assumptions not met: transformation, gls, or glm as solutions?
Date: Mon, 17 Dec 2012 11:18:29 -0500
In addition to the excellent advice posted here by others, I'll add that
you might consider 1) looking at the distribution of the dependent
variable (minutes) to see whether it is symmetric and 2) if it is not,
looking up the (modified) Park test, which is a rule of thumb for
selecting the family (variance function) of a GLM. It's not too hard to
do your own, but you can find a Stata package for performing the Park
test at
http://www.uphs.upenn.edu/dgimhsr/stat-cstanal.htm
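For anyone who wants to try it by hand, here is a minimal sketch of one common form of the modified Park test. It is not the code from the package linked above. The names x1 and x2 are placeholder covariates (only minutes comes from this thread), the gamma/log first stage is just one reasonable starting point, and it assumes minutes is strictly positive.

```stata
* Sketch only: x1 and x2 are hypothetical covariates, not from the thread.
* Step 1: fit a provisional GLM and get predictions on the raw scale.
glm minutes x1 x2, family(gamma) link(log)
predict double muhat, mu

* Step 2: squared raw-scale residuals and log of the predicted mean.
generate double res2 = (minutes - muhat)^2
generate double lnmu = ln(muhat)

* Step 3: regress the squared residuals on ln(predicted mean).
glm res2 lnmu, family(gamma) link(log) vce(robust)

* Rule of thumb for the coefficient on lnmu:
*   near 0 -> Gaussian, 1 -> Poisson, 2 -> gamma, 3 -> inverse Gaussian
* Example: test whether the gamma family is compatible with the data.
test lnmu = 2
```

The coefficient on lnmu estimates the power relating the variance to the mean, so it is a guide for picking family() rather than a formal model-selection procedure.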
cheers,
Jeph
On 12/17/2012 9:29 AM, Laura R. wrote:
Thank you very much for your help so far.
Please let me reply one by one.
@ Carlo: I ran your example, and the same seems to hold with my data:
the -robust- option does not change the graphs or the tests
(-estat hettest-, -iqr-) much. So should the effect of the -robust-
option be visible in the graphs and the tests, that is, should it have
induced homoskedasticity?
@ Nick:
As to the equality of variances between the cases from the 2 surveys,
a referee seems concerned about inferences one can make from the
descriptive statistics. Therefore, I would like to use -sdtest- to see
whether variances are the same in the two samples.
And for the regression, I think that adding the year-dummy would be
enough to account for it?
The variance of the regression residuals is a separate matter; that is
for model validation. Yes, I plotted the residuals there, and their
variance seems to become larger as the dep. var. becomes larger; the
lower bound (with negative values) changes in particular.
@ Maarten:
So you would not worry about heteroskedasticity or the distribution of
errors. What would you write in the paper then? "There is
heteroskedasticity and a non-normal error distribution, but I still use
OLS because ..."? I am very curious, because I would like to keep
OLS.
@ Maarten & David:
About linearity: as independent variables, I mainly have categorical
variables. So -scatter y x- or -graph matrix y x x- does not help
much, because the cases are only on the lines for 0 and 1. How can I
see whether I have a linear relationship between y and x, if x is
categorical?
@ David:
Yes, I am thinking about a transformation, and will read again about
its interpretation. Still, just having minutes to interpret would be
easier, also for readers who are not so familiar with
transformations. Also, I am not sure whether OLS with a transformed
dependent variable or -glm- with an untransformed variable would be
better.
Laura
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/