Dear Matthew,
See Maarten's answer:
http://www.stata.com/statalist/archive/2008-10/msg01362.html
It's a different case, but I think it will help you understand the difference.
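
In short, regress on a logged outcome models E[ln(y) | x], while glm with link(log) models ln(E[y | x]), and by Jensen's inequality these two quantities generally differ. If ln(y) given x is normal with variance s2(x), then ln(E[y | x]) = E[ln(y) | x] + s2(x)/2, so the slopes coincide only when the residual variance does not change with x. The two outputs quoted below also use different samples (32345 versus 52418 observations), which suggests that many values of outagecost have no defined log, for example zero costs. The following is a minimal sketch of how one might check both points, assuming the variable names from the quoted post; the indicator insample is a hypothetical name introduced here, and the data are not available, so no output is shown.

```stata
* Sketch only: variable names come from the quoted post; insample is hypothetical.

* How many observations have a cost recorded but no defined log (e.g. zero cost)?
count if missing(lnoutagecost) & !missing(outagecost)

* OLS on logs: models E[ln(outagecost) | lnmwhannual].
regress lnoutagecost lnmwhannual
generate byte insample = e(sample)

* Gaussian GLM with log link: models ln(E[outagecost | lnmwhannual]).
* It is fit on the raw cost scale and keeps observations with zero cost.
glm outagecost lnmwhannual, family(gaussian) link(log)

* Same GLM restricted to the regress sample, to separate the effect of the
* different samples from the effect of the different estimands.
glm outagecost lnmwhannual if insample, family(gaussian) link(log)
```

Even on the common sample the coefficients need not agree, because the GLM minimizes squared errors in outagecost itself rather than in its log, so customers with large costs carry more weight.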
HTH,
Joao Lima
2008/10/31 Matthew Mercurio (matthewmercurio) <[email protected]>:
> I have two variables,
>
> (1) outagecost (estimated costs to each customer of a short electrical
> power interruption)
> (2) mwhannual (annual megawatt hours of electricity consumption for each
> customer)
>
> Since these variables appear approximately lognormal, I have been
> estimating the following simple model:
>
> reg lnoutagecost lnmwhannual
>
> where lnoutagecost and lnmwhannual represent the natural logs of the two
> variables described above. The results are:
>
> . reg lnoutagecost lnmwhannual
>
>       Source |       SS       df       MS              Number of obs =   32345
> -------------+------------------------------           F(  1, 32343) = 9370.20
>        Model |  34151.9301     1  34151.9301           Prob > F      =  0.0000
>     Residual |  117881.722 32343   3.6447368           R-squared     =  0.2246
> -------------+------------------------------           Adj R-squared =  0.2246
>        Total |  152033.652 32344  4.70052104           Root MSE      =  1.9091
>
> ------------------------------------------------------------------------------
> lnoutagecost |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>  lnmwhannual |   .3824726   .0039512    96.80   0.000     .3747282    .3902171
>        _cons |   5.370938   .0232302   231.21   0.000     5.325406     5.41647
> ------------------------------------------------------------------------------
>
> I then tried the following model in glm which I had expected to produce
> identical results:
>
> glm outagecost lnmwhannual, link(log)
>
> Generalized linear models                          No. of obs      =     52418
> Optimization     : ML                              Residual df     =     52416
>                                                    Scale parameter =  7.59e+09
> Deviance         =  3.97873e+14                    (1/df) Deviance =  7.59e+09
> Pearson          =  3.97873e+14                    (1/df) Pearson  =  7.59e+09
>
> Variance function: V(u) = 1                        [Gaussian]
> Link function    : g(u) = ln(u)                    [Log]
>
>                                                    AIC             =   25.5881
> Log likelihood   = -670636.5416                    BIC             =  3.98e+14
>
> ------------------------------------------------------------------------------
>              |                 OIM
>   outagecost |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>  lnmwhannual |   .5568004   .0130092    42.80   0.000     .5313029    .5822979
>        _cons |   5.355758   .1384432    38.69   0.000     5.084414    5.627102
> ------------------------------------------------------------------------------
>
> Obviously the results are very similar, but not identical.
>
> I read the Stata Manual section on GLM and checked a large number of
> posts on Statalist related to loglinear models, but I was not able to
> understand exactly why glm using link(log) doesn't produce the same
> results as logging both variables and using reg. Based on my reading
> of the Stata manual it appears to have something to do with the fact that
> the link() option relates to the expectation of the dependent variable,
> not the dependent variable itself. Can anyone tell me why the results
> are different?
>
> Matthew G. Mercurio, Ph.D.
> Senior Consultant
> Freeman, Sullivan & Co.
>
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
--
-------------------------------
Joao Ricardo Lima
Professor
UFPB-CCA-DCFS
+553138923914
-------------------------------