Statalist



st: glm and reg produce different results for loglinear model?


From   "Matthew Mercurio (matthewmercurio)" <[email protected]>
To   [email protected]
Subject   st: glm and reg produce different results for loglinear model?
Date   Fri, 31 Oct 2008 15:43:20 -0700

I have two variables, 

(1) outagecost (estimated costs to each customer of a short electrical
power interruption)
(2) mwhannual (annual megawatt hours of electricity consumption for each
customer)

Since these variables appear approximately lognormal, I have been
estimating the following simple model:

reg lnoutagecost lnmwhannual

where lnoutagecost and lnmwhannual represent the natural log of the two
variables described above.  The results are:

. reg lnoutagecost lnmwhannual

      Source |       SS       df       MS           Number of obs =   32345
-------------+------------------------------        F(  1, 32343) = 9370.20
       Model |  34151.9301     1  34151.9301        Prob > F      =  0.0000
    Residual |  117881.722 32343   3.6447368        R-squared     =  0.2246
-------------+------------------------------        Adj R-squared =  0.2246
       Total |  152033.652 32344  4.70052104        Root MSE      =  1.9091
------------------------------------------------------------------------------
lnoutagecost |      Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
-------------+----------------------------------------------------------------
 lnmwhannual |   .3824726   .0039512    96.80   0.000   .3747282    .3902171
       _cons |   5.370938   .0232302   231.21   0.000   5.325406     5.41647
------------------------------------------------------------------------------

I then tried the following model in glm which I had expected to produce
identical results:

glm outagecost lnmwhannual, link(log)

Generalized linear models                        No. of obs      =     52418
Optimization     : ML                            Residual df     =     52416
                                                 Scale parameter =  7.59e+09
Deviance         =  3.97873e+14                  (1/df) Deviance =  7.59e+09
Pearson          =  3.97873e+14                  (1/df) Pearson  =  7.59e+09
Variance function: V(u) = 1                        [Gaussian]
Link function    : g(u) = ln(u)                    [Log]
                                                  AIC            =   25.5881
Log likelihood   = -670636.5416                   BIC            =  3.98e+14
------------------------------------------------------------------------------
             |                 OIM
  outagecost |      Coef.   Std. Err.      z    P>|z|   [95% Conf. Interval]
-------------+----------------------------------------------------------------
 lnmwhannual |   .5568004   .0130092    42.80   0.000   .5313029    .5822979
       _cons |   5.355758   .1384432    38.69   0.000   5.084414    5.627102
------------------------------------------------------------------------------

Obviously the results are very similar, but not identical.  

I read the Stata Manual section on GLM and checked a large number of
posts on Statalist related to loglinear models, but I was not able to
understand exactly why glm using link(log) doesn't produce the same
results as logging both variables and using reg.  Based on my reading
of the Stata manual it appears to have something to do with the fact that
the link() option relates to the expectation of the dependent variable,
not the dependent variable itself.  Can anyone tell me why the results
are different?
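
The distinction in the paragraph above can be written out as a short sketch. Here y stands for outagecost and x for mwhannual; the symbols \alpha, \beta, a, b, \varepsilon, u and \sigma^2(x) are notation introduced only for this illustration and are not taken from the Stata manual. The last line additionally assumes normal errors on the log scale, which is an assumption and not something the output establishes.

```latex
\begin{align*}
% reg on the logged variables: a model for the conditional mean of ln y
\mathrm{E}[\ln y \mid x] &= \alpha + \beta \ln x
  &&\text{i.e. } \ln y = \alpha + \beta \ln x + \varepsilon,\quad \mathrm{E}[\varepsilon \mid x] = 0 \\
% glm with link(log) and Gaussian family: a model for the log of the conditional mean of y
\ln \mathrm{E}[y \mid x] &= a + b \ln x
  &&\text{i.e. } y = \exp(a + b \ln x) + u,\quad \mathrm{E}[u \mid x] = 0 \\
% Jensen's inequality, since ln is concave
\mathrm{E}[\ln y \mid x] &\le \ln \mathrm{E}[y \mid x] \\
% assumed special case: \varepsilon \mid x \sim N(0, \sigma^2(x))
\ln \mathrm{E}[y \mid x] &= \alpha + \beta \ln x + \tfrac{1}{2}\sigma^2(x)
\end{align*}
```

Under that assumed special case, a constant \sigma^2 would shift only the intercept, while a \sigma^2(x) that varies with x would change the slope as well. Even then the two fits need not agree in a given sample, because reg minimizes squared errors in ln y while the Gaussian glm minimizes squared errors in y itself. Note also that the two outputs above report different numbers of observations (32345 versus 52418), so the two commands were not fit to the same sample.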

Matthew G. Mercurio, Ph.D.
Senior Consultant
Freeman, Sullivan & Co.




