Re: st: retransformation of ln(Y) coefficient and CI in regression
From: Roger Newson <[email protected]>
To: "[email protected]" <[email protected]>
Date: Mon, 6 Jun 2011 10:31:46 +0100
The -regress- command has an -eform()- option, which displays
exponentiated coefficients. With a log-transformed outcome, these are
geometric means and their ratios, with their confidence limits. This is
described in Newson (2003), and can be used together with
-vce(robust)- to display unequal-variance confidence limits.
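As a minimal sketch of what -eform()- does in the usual parameterization (with a constant), the following uses only official Stata commands. The dataset (auto.dta), the outcome (price) and the three-level variable -factor- recoded from rep78 are illustrative assumptions, standing in for the original poster's Y and factor.

```stata
* Sketch only: auto.dta, price and the recoded rep78 are stand-ins
* for the poster's Y and three-level factor.
sysuse auto, clear
recode rep78 (1/2 = 0) (3 = 1) (4/5 = 2), generate(factor)
generate double lnY = ln(price)

* With a constant, the constant is the log geometric mean at the base
* level, and the other parameters are log ratios relative to that base.
regress lnY i.factor, vce(robust) eform(GM/Ratio)

* eform() only changes the display; stored results stay on the log scale.
display "Baseline geometric mean:     " exp(_b[_cons])
display "Ratio, level 1 over level 0: " exp(_b[1.factor])
```

Because the stored coefficients remain on the log scale, any post-estimation calculation has to exponentiate them explicitly, as the two -display- lines do.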
And, if you want to plot the confidence limits against the factor
values, then you might like to use the -parmest-, -eclplot-, -fvregen-
and -descsave- packages, downloadable from SSC. As in:
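* Temporary file to hold a do-file describing -factor-
tempfile df0
* Save the attributes of -factor- to that do-file
descsave factor, do(`"`df0'"', replace)
* Fit one geometric mean per level of -factor- (no constant)
regress lnY ibn.factor, vce(robust) noconst eform(GM/Ratio)
* Replace the data in memory with one observation per parameter
parmest, norestore eform
* Rebuild -factor- from the parameter names, using the saved do-file
fvregen, do(`"`df0'"')
* Plot estimates and confidence limits against -factor-
eclplot estimate min* max* factor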
In this example, we start by defining a temporary file whose macro name
is -df0-. We then use -descsave- (an extended version of -describe-
which can create output do-files) to write a do-file to that temporary
file, defining the variable attributes (storage type, format, variable
label and value label) of the variable -factor-. We then use -regress-,
with the -eform(GM/Ratio)- option to specify confidence limits for
geometric means and/or their ratios, and the -noconst- option and the
X-variable list -ibn.factor- to specify that the parameters will be
geometric means instead of ratios. We then use -parmest- to overwrite the
existing
dataset in memory with an output dataset (or resultsset), with 1
observation per parameter and data on parameter names, estimates,
confidence limits and other parameter attributes. In this new output
dataset, we then use -fvregen- to regenerate the variable -factor- from
the parameter names. Finally, we use -eclplot- to produce a confidence
interval plot, with the values of -factor- on the X-axis and the
estimates and unequal-variance confidence limits for the corresponding
geometric means on the Y-axis. More about all these packages can be
found in the on-line help for -parmest-, which contains many hypertext
references.
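For readers who want to see the "one estimate with confidence limits per factor level" idea without installing the SSC packages, here is a rough sketch using official commands only. It is not a substitute for -parmest- and -eclplot-. The dataset, the variable names and the scalar -tcrit- are assumptions made for illustration.

```stata
* Sketch only: auto.dta, price and the recoded rep78 are stand-ins
* for the poster's Y and three-level factor.
sysuse auto, clear
recode rep78 (1/2 = 0) (3 = 1) (4/5 = 2), generate(factor)
generate double lnY = ln(price)

* No constant and no base level: each coefficient is the mean of lnY
* at one level of factor, i.e. the log of that level's geometric mean.
regress lnY ibn.factor, vce(robust) noconstant eform(GM)

* Each parameter is a level mean, so its own robust confidence limits
* can be exponentiated directly.
scalar tcrit = invttail(e(df_r), 0.025)
forvalues k = 0/2 {
    display "factor = `k': GM = " exp(_b[`k'.factor]) ///
        "  95% CI = [" exp(_b[`k'.factor] - tcrit*_se[`k'.factor]) ///
        ", " exp(_b[`k'.factor] + tcrit*_se[`k'.factor]) "]"
}
```

The point of the -noconstant- parameterization is that no sums of coefficients are needed: every row already refers to a single level of the factor.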
I hope this helps.
Best wishes
Roger
References
Newson R. Stata tip 1: The eform() option of regress. The Stata Journal
2003; 3(4): 445. Download from
http://www.stata-journal.com/article.html?article=st0054
Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/
Opinions expressed are those of the author, not of the institution.
On 05/06/2011 16:26, Steve Rothenberg wrote:
I have a simple model with a natural log dependent variable and a
three-level factor predictor. I’ve used
. regress lnY i.factor, vce(robust)
to obtain estimates in the natural log metric. I want to be able to display
the results in a graph as means and 95% CI for each level of the factor with
retransformed units in the original Y metric.
I’ve also calculated geometric means and 95% CI for each level of the factor
variable using
. ameans Y if factor==x
simply as a check, though the 95% CI is not adjusted for the vce(robust)
standard error as calculated by the -regress- model.
Using naïve transformation (i.e. ignoring retransformation bias) with
. display exp(coefficient)
from the output of -regress- for each level of the predictor, with the
classic formulation:
Level 0 = exp(constant)
Level 1 = exp(constant+coef(1))
Level 2 = exp(constant+coef(2))
the series of retransformations from the -regress- command is the same as
the geometric means from the series of -ameans- commands.
When I try to do the same with the lower and upper 95% CI (substituting the
limits of the 95% CI for the coefficients) from the -regress- command,
however, the retransformed CI is much larger than that calculated from the
-ameans- command, much more so than the differences in standard errors from
-regress- with and without the vce(robust) option would indicate.
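A note on why substituted limits come out too wide: the confidence limits of a sum of coefficients are not the sums of the individual limits, because the two estimates are correlated. The derivation below is a sketch under stated assumptions: $\hat\beta_0$ is the constant, $\hat\beta_1$ is the coefficient for level 1, $\bar m_0$ and $\bar m_1$ are the sample means of ln(Y) in levels 0 and 1 (independent groups), and $t$ is the critical value. These symbols are introduced here for illustration and do not appear in the original post.

```latex
\begin{align*}
\hat\beta_0 &= \bar m_0, \qquad \hat\beta_1 = \bar m_1 - \bar m_0, \\
\operatorname{Cov}(\hat\beta_0, \hat\beta_1)
  &= \operatorname{Cov}(\bar m_0,\ \bar m_1 - \bar m_0)
   = -\operatorname{Var}(\bar m_0), \\
\operatorname{Var}(\hat\beta_0 + \hat\beta_1)
  &= \operatorname{Var}(\hat\beta_0) + \operatorname{Var}(\hat\beta_1)
     + 2\operatorname{Cov}(\hat\beta_0, \hat\beta_1) \\
  &= \operatorname{Var}(\bar m_0)
     + \bigl[\operatorname{Var}(\bar m_1) + \operatorname{Var}(\bar m_0)\bigr]
     - 2\operatorname{Var}(\bar m_0)
   = \operatorname{Var}(\bar m_1), \\
\text{correct half-width} &= t\,\sqrt{\operatorname{Var}(\bar m_1)}, \\
\text{substituted half-width} &= t\,\Bigl(\sqrt{\operatorname{Var}(\bar m_0)}
     + \sqrt{\operatorname{Var}(\bar m_1) + \operatorname{Var}(\bar m_0)}\Bigr)
   \;>\; t\,\sqrt{\operatorname{Var}(\bar m_1)}.
\end{align*}
```

Adding the limit of the constant to the limit of the coefficient ignores the negative covariance term, so the resulting interval is always wider than the interval for the level mean itself, and it stays wider after exponentiation.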
I’ve discovered -levpredict-, by Christopher Baum on SSC, for unbiased
retransformation of log dependent variables in regression-type estimations,
but it only outputs the bias-corrected means from the preceding -regress-.
To be sure, there is some small bias in the first or second decimal place of
the mean factor levels compared to naïve retransformation.
Am I doing something wrong by treating the 95% CI of each level of the
factor variable in the same way I treat the coefficients without correcting
for retransformation bias? Is there any way I can obtain either the
retransformed CI or the bias-corrected retransformed CI for the different
levels of the factor variable in the original metric of Y?
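One way to get retransformed (not bias-corrected) limits for each level while keeping the -i.factor- parameterization and the robust standard errors is the official -lincom- command with its -eform- option. This is a sketch, not necessarily the approach recommended in the reply. The dataset and variable names are assumptions made for illustration.

```stata
* Sketch only: auto.dta, price and the recoded rep78 are stand-ins
* for the poster's Y and three-level factor.
sysuse auto, clear
recode rep78 (1/2 = 0) (3 = 1) (4/5 = 2), generate(factor)
generate double lnY = ln(price)
regress lnY i.factor, vce(robust)

* lincom computes the standard error of each sum from the full robust
* covariance matrix (variances and covariance), and eform then
* exponentiates the estimate and its confidence limits.
lincom _cons, eform
lincom _cons + 1.factor, eform
lincom _cons + 2.factor, eform
```

The exponentiated point estimates are the geometric means of each level. The intervals will generally differ somewhat from those of -ameans-, since they are based on the robust variance from -regress- rather than on each group's own classical variance.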
I'd like to retain the robust SE from the above estimation as there is
considerable difference in variance in each level of the factor variable.
Steve Rothenberg
National Institute of Public Health
Cuernavaca, Morelos, Mexico
Stata/MP 11.2 for Windows (32-bit)
Born 30 Mar 2011
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/