Matthew got good answers (or pointers), but the point is general enough
to deserve elaboration.
Generalised linear models will typically give at least slightly
different results from an apparent equivalent using a transformed
response, if only because
link(mean(response))
is equivalent to
mean(trans(response))
only when link() and trans() are identical _and_ linear. Whenever link()
is not a linear function the two will give different answers.
That is apart from the key question of probability distribution family!
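A small numerical check of the link-versus-transformation point, taking log as the function. This is only a sketch with made-up values (1, 10, 100) and a made-up variable name logy; nothing here comes from Matthew's data.

```stata
clear
* three made-up values, chosen so the logs are easy to follow
input y
1
10
100
end
gen logy = log(y)
* log of the mean: log(37), about 3.61
summarize y
display log(r(mean))
* mean of the logs: (0 + log(10) + log(100)) / 3, about 2.30
summarize logy
display r(mean)
```

The two numbers differ because log is not linear, and the same happens with any nonlinear link.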
Another example is comparing the results of reciprocal transformation
and reciprocal link.
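sysuse auto, clear
* transform the response, regress, then back-transform the fitted values
gen gpm = 1 / mpg
regress gpm weight
predict reg_predict
replace reg_predict = 1 / reg_predict
* reciprocal link with the default Gaussian family: models 1 / E(mpg)
glm mpg weight, link(power -1)
predict glm_predict
* compare the two sets of predictions
scatter glm_predict reg_predict
corr *predict
* this assert fails, by design: the predictions are not identical
assert reg_predict == glm_predict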
The two sets of predictions are very close, but not identical.
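The log case in Matthew's question can be checked the same way. What follows is a sketch under assumptions: it uses the auto data again, keeps weight untransformed on the right-hand side in both models so that only the treatment of the response differs, and the names ln_mpg, reg_log and glm_log are invented for illustration.

```stata
sysuse auto, clear
* log-transform the response, regress, then back-transform the fitted values
gen ln_mpg = ln(mpg)
regress ln_mpg weight
predict reg_log
replace reg_log = exp(reg_log)
* log link with the default Gaussian family: models log(E(mpg))
glm mpg weight, link(log)
predict glm_log
* compare the two sets of predictions on the mpg scale
corr reg_log glm_log
summarize reg_log glm_log
```

Here regress models the mean of log(mpg) while glm models the log of the mean of mpg, so the coefficients and predictions need not agree.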
Nick
Matthew G. Mercurio, Ph.D.
I read the Stata Manual section on GLM and checked a large number of
posts on Statalist related to loglinear models, but I was not able to
understand exactly why glm using link(log) doesn't produce the same
results as logging both variables and using reg. Based on my reading
of the Stata manual it appears to have something to do with the fact that
the link() option relates to the expectation of the dependent variable,
not the dependent variable itself. Can anyone tell me why the results
are different?