Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: retransformation of ln(Y) coefficient and CI in regression
From
Nick Cox <[email protected]>
To
"'[email protected]'" <[email protected]>
Subject
st: RE: retransformation of ln(Y) coefficient and CI in regression
Date
Sun, 5 Jun 2011 17:01:45 +0100
If you recast your model as
glm Y i.factor ... , link(log)
no post-estimation fudges are required. -predict- automatically returns predictions on the scale of Y, not ln Y.
Nick
[email protected]
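A rough sketch of that suggestion follows. It assumes the variables are named Y and factor as in the question, and that vce(robust) is carried over from the original -regress- call; the explicit family() choice and the use of -margins- are assumptions, not part of the reply above.

```stata
* Sketch only: log-link GLM fitted directly to Y (variable names from the question)
glm Y i.factor, family(gaussian) link(log) vce(robust)

* Predicted mean of Y, with a 95% CI, for each level of factor
margins factor

* Optional plot of those means and CIs (-marginsplot- needs Stata 12 or later)
marginsplot
```

Note that a log link models ln E(Y) rather than E(ln Y), so these are arithmetic means on the Y scale and will generally differ from the geometric means reported by -ameans-.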
Steve Rothenberg
I have a simple model with a natural log dependent variable and a three
level factor predictor. I've used
. regress lnY i.factor, vce(robust)
to obtain estimates in the natural log metric. I want to be able to display
the results in a graph as means and 95% CI for each level of the factor with
retransformed units in the original Y metric.
I've also calculated geometric means and 95% CI for each level of the factor
variable using
. ameans Y if factor==x
simply as a check, though the 95% CI is not adjusted for the vce(robust)
standard error as calculated by the -regress- model.
Using naïve transformation (i.e. ignoring retransformation bias) with
. display exp(coefficient)
from the output of -regress- for each level of the predictor, with the
classic formulation:
Level 0 = exp(constant)
Level 1 = exp(constant+coef(1))
Level 2 = exp(constant+coef(2))
the series of retransformations from the -regress- command is the same as
the geometric means from the series of -ameans- commands.
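The point estimates described above could be reproduced along these lines. This is a sketch that assumes factor is coded 0, 1, 2 with 0 as the base level, which the post does not state explicitly.

```stata
* Sketch only: assumes factor takes the values 0, 1, 2 (0 = base level)
regress lnY i.factor, vce(robust)

* Naive retransformation of the fitted log-scale mean for each level
display "Level 0: " exp(_b[_cons])
display "Level 1: " exp(_b[_cons] + _b[1.factor])
display "Level 2: " exp(_b[_cons] + _b[2.factor])

* Check one level against its geometric mean
ameans Y if factor == 1
```

The agreement is expected: with a single factor predictor, the fitted value for each level is the mean of lnY in that level, and the exponential of a mean of logs is the geometric mean.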
When I try to do the same with the lower and upper 95% CI (substituting the
limits of the 95% CI for the coefficients) from the -regress- command,
however, the retransformed CI is much wider than the one calculated by the
-ameans- command, much more so than the differences in standard errors from
-regress- with and without the vce(robust) option would indicate.
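One possible approach is to build the interval for the sum of coefficients as a whole rather than adding the separate limits of the constant and the level coefficient, since adding limits ignores the covariance between the two estimates. A minimal sketch with -lincom- and its eform option, again assuming factor is coded 0, 1, 2:

```stata
* Sketch only: assumes factor takes the values 0, 1, 2 (0 = base level)
regress lnY i.factor, vce(robust)

* exp() of each level's log-scale mean, with the CI limits exponentiated
lincom _cons, eform
lincom _cons + 1.factor, eform
lincom _cons + 2.factor, eform
```

Each call reports the exponentiated estimate with a 95% CI based on the robust covariance matrix. These intervals will not necessarily coincide with those from -ameans-, which does not use the robust standard errors.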
I've discovered -levpredict- by Christopher Baum on SSC, which does unbiased
retransformation of log dependent variables in regression-type estimations,
but it only outputs the bias-corrected means from the preceding -regress-.
To be sure, the bias-corrected means for the factor levels differ from the
naïve retransformation only in the first or second decimal place.
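For reference, the kind of correction involved can be sketched by hand with Duan's smearing estimator. This is not the code behind -levpredict-, and the variable names xbhat, ehat, exp_ehat and yhat_smear are made up for illustration.

```stata
* Sketch only: hand-rolled Duan smearing; new variable names are hypothetical
regress lnY i.factor, vce(robust)

predict double xbhat, xb           // fitted values on the log scale
predict double ehat, residuals     // log-scale residuals

* Smearing factor: sample mean of the exponentiated residuals
generate double exp_ehat = exp(ehat)
summarize exp_ehat, meanonly

* Bias-corrected prediction on the original Y scale
generate double yhat_smear = exp(xbhat) * r(mean)
tabstat yhat_smear, by(factor) statistics(mean)
```

A single smearing factor assumes the log-scale errors have the same distribution at every level of the factor, which may not hold when the variances differ across levels.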
Am I doing something wrong by treating the 95% CI of each level of the
factor variable in the same way I treat the coefficients without correcting
for retransformation bias? Is there any way I can obtain either the
retransformed CI or the bias-corrected retransformed CI for the different
levels of the factor variable in the original metric of Y?
I'd like to retain the robust SE from the above estimation as there is
considerable difference in variance in each level of the factor variable.