Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

st: R: OLS assumptions not met: transformation, gls, or glm as solutions?

From   [email protected]
To   <[email protected]>
Subject   st: R: OLS assumptions not met: transformation, gls, or glm as solutions?
Date   Mon, 17 Dec 2012 12:09:46 +0100

Laura's first query is:
1. Keep the model and the variables as they are (but maybe use robust
standard errors) - is this possible under certain conditions, even if I have
heteroskedasticity and non-normality of residuals, and when is this

Using robust standard errors will not always protect you from
heteroskedasticity, as you can see from the following (misspecified) model:
sysuse auto, clear
reg price mpg weight
* Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
estat hettest
reg price mpg weight, robust
* inspect the residuals graphically
predict res, residuals
qnorm res, grid
Besides, after -reg y x, robust- Stata refuses (for methodological reasons)
to run -estat hettest-. Hence, a graphical check (always advisable anyway)
is a helpful way to go.
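
As a minimal sketch of such a graphical check, assuming the same auto dataset as above, a residual-versus-fitted plot can be drawn after the plain -regress- fit. A spread that widens or narrows along the fitted values would point to heteroskedasticity. The -yline(0)- option is only a visual aid and is my addition, not part of the original example.

```stata
sysuse auto, clear
regress price mpg weight
* residuals against fitted values; look for a spread that changes with the fit
rvfplot, yline(0)
```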

Best Regards,
-----Original message-----
From: [email protected]
[mailto:[email protected]] On behalf of Laura R.
Sent: Monday, 17 December 2012 11:44
To: [email protected]
Subject: st: OLS assumptions not met: transformation, gls, or glm as
solutions?

Dear Stata users,

I estimated an OLS model with the number of minutes (1-1440) spent on an
activity on a day as the dependent variable. At first sight, the model works
fine. I get some interesting results that are robust across model
specifications. I would like to keep it as it is, but:

- The regression diagnostics show that the error terms are not normally
distributed, but right skewed.

- In addition, there is heteroskedasticity.

Excluding outliers and influential cases does not help. I can think of four
solutions, but I am not sure when it is justified to decide on one of
these:

1. Keep the model and the variables as they are (but maybe use robust
standard errors) - is this possible under certain conditions, even if I have
heteroskedasticity and non-normality of residuals, and when is this

2. Transform the dependent variable. If I take the ln of the dependent
variable, the residuals get closer to a normal distribution, and it gets
closer to homoskedasticity. But then there is the problem of interpreting
the results.

3. Generalised least squares model (gls): Use this instead. This is a
solution to heteroskedasticity, but do the residuals have to be normally
distributed in gls as well? What other new assumptions of gls might cause
new problems (pros/cons gls vs. OLS)? And how can I do this in Stata?
(Somehow with calculating a weight, I think...)

4. Generalised linear model (glm): In some sources I read that this also
accounts for heteroskedasticity, in other sources not. Again, what about the
normal distribution of residuals here? I heard that glm is better than OLS
for non-negative dependent variables, is that correct? What are other
assumptions of glm that could make me still prefer OLS? If I used it, and if
my dependent variable is non-negative, and residuals are right skewed, do I
have to "tell" that Stata when estimating the model, or can I run it as it

(I quickly ran -glm- already, without any special specifications, and the
results are the same as from the OLS model.)
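
The identical results are expected: run without options, -glm- uses the Gaussian family with the identity link, which reproduces the OLS coefficients. The following is only a hedged sketch. It uses the auto dataset as a stand-in for the time-use data, and the gamma family with a log link is chosen purely as one common illustration for a strictly positive, right-skewed outcome, not as a recommendation from this thread.

```stata
sysuse auto, clear
* OLS benchmark
regress price mpg weight
* glm with defaults: family(gaussian) link(identity), same coefficients as OLS
glm price mpg weight
* family and link have to be stated explicitly to depart from the defaults
glm price mpg weight, family(gamma) link(log) vce(robust)
```

With a log link the coefficients are on the log scale of the expected outcome, so they are not directly comparable with the OLS coefficients.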

In sum, I need some decision-making support. What is the best thing to do in
this case?
One thing that would help is a comparison of assumptions of OLS, gls, glm. I
am aware of the assumptions of OLS models, but for gls and glm I did not
find comprehensive lists and explanations.
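
On the question in point 3 of how a weight might be calculated, one textbook route is feasible GLS: estimate the error variance as a function of the regressors and use its inverse as an analytic weight. The sketch below assumes an exponential variance function and again uses the auto dataset as a stand-in. The variable names ehat, lne2, lnvar and w are hypothetical, and the variance model is an assumption that would need checking on the real data.

```stata
sysuse auto, clear
regress price mpg weight
predict double ehat, residuals
* model the log of the squared residuals to estimate the variance function
generate double lne2 = ln(ehat^2)
regress lne2 mpg weight
predict double lnvar, xb
* analytic weights are inversely proportional to the estimated variance
generate double w = 1/exp(lnvar)
regress price mpg weight [aweight = w]
```

If the variance function is misspecified the weighted estimates can be less efficient than hoped, which is one reason robust standard errors are often kept as a fallback.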

It would be great if you could give me hints on what would be a good
solution. Maybe you know a source explaining when to use which solution if
OLS assumptions of normality and homoskedasticity are not met.


PS: I am aware of the fact that many have used Tobit for similar dependent
variables, including the zeros. My case is different, and for some reason I
do not want to do this, and I excluded the zeros.