Date: Tue, 7 Jun 2011 17:48:28 -0500
From: "Steve Rothenberg" <[email protected]>
To: <[email protected]>
Subject: st: RE: RE: retransformation of ln(Y) coefficient and CI in regression
I had already posted my rediscovery of the -predictnl- options suggested
by Martin Weiss (a search prompted by Nick Cox's suggestion to use the
options with -predict- after -glm ln(Y) i.factor, vce(robust)- estimation)
before I saw Roger Newson's and Maarten Buis's elegant treatments of the
problem, using the -eform- option for -regress- and -glm-, listed below.
Thanks for the additional code, good folks, and for all the help.
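For anyone finding this thread later, the -predictnl- route could be
sketched roughly as follows. This is only an illustration under
assumptions that are not in the posts: it uses the auto data shipped with
Stata, with price standing in for Y and foreign for the factor, and
mu_hat, mu_lo and mu_hi are made-up variable names. The limits returned by
ci() are delta-method limits, so they are symmetric around the prediction
in the Y metric.

```stata
*-------------- begin example -----------------
sysuse auto, clear
* log-link GLM as in the original question (price stands in for Y)
glm price i.foreign, family(gaussian) link(log) vce(robust)
* prediction in the Y metric, with delta-method 95% limits
predictnl double mu_hat = exp(xb()), ci(mu_lo mu_hi)
* one row per factor level (plus a total row)
tabstat mu_hat mu_lo mu_hi, by(foreign) statistics(mean)
*---------------- end example ----------------
```

Because the prediction is constant within each factor level, the by-group
means in the -tabstat- table are simply the prediction and its limits for
that level.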
Steve Rothenberg
*******************
Date: Mon, 6 Jun 2011 10:31:46 +0100
From: Roger Newson <[email protected]>
Subject: Re: st: retransformation of ln(Y) coefficient and CI in regression
The -regress- command has an -eform- option, which gives the confidence
limits of geometric means and their ratios. This is described in Newson
(2003), and can be used together with -robust- to display
unequal-variance confidence limits.
And, if you want to plot the confidence limits against the factor
values, then you might like to use the -parmest-, -eclplot-, -fvregen-
and -descsave- packages, downloadable from SSC. As in:
tempfile df0
descsave factor, do(`"`df0'"', replace)
regress lnY ibn.factor, vce(robust) noconst eform(GM/Ratio)
parmest, norestore eform
fvregen, do(`"`df0'"')
eclplot estimate min* max* factor
In this example, we start by defining a temporary file whose macro name
is -df0-. We then use -descsave- (an extended version of -describe-
which can create output do-files) to write a do-file to that temporary
file, defining the variable attributes (storage type, format, variable
label and value label) of the variable -factor-. We then use -regress-,
with the -eform(GM/Ratio)- option to specify confidence limits for geometric
means and/or their ratios, and the -noconst- option and the X-variable
list -ibn.factor- to specify that the parameters will be geometric means
instead of ratios. We then use -parmest- to overwrite the existing
dataset in memory with an output dataset (or resultsset), with 1
observation per parameter and data on parameter names, estimates,
confidence limits and other parameter attributes. In this new output
dataset, we then use -fvregen- to regenerate the variable -factor- from
the parameter names. Finally, we use -eclplot- to produce a confidence
interval plot, with the values of -factor- on the X-axis and the
estimates and unequal-variance confidence limits for the corresponding
geometric means on the Y-axis. More about all these packages can be
found in the on-line help for -parmest-, which contains many hypertext
references.
I hope this helps.
Best wishes
Roger
On Sun, Jun 5, 2011 at 6:55 PM, Steve Rothenberg wrote:
> . glm Y i.factor, vce(robust) family(Gaussian) link(log)
>
> followed by
>
> . predict xxx, mu
>
> the command does indeed return the factor predictions in the original Y
> metric.
>
> However, the regression table with 95% CI is still in the original ln(Y)
> units and I am still stuck not being able to calculate the 95% CI in the
> original Y unit metric.
As for the regression table, you can get your coefficients in the Y metric
by specifying the -eform- option:
*-------------- begin example -----------------
sysuse auto, clear
gen byte baseline = 1
gen c_mpg = mpg - 20
glm price c_mpg foreign baseline, ///
link(log) nocons eform
*---------------- end example ----------------
In this example, domestic cars with 20 miles per gallon cost on
average 5,735 dollars. This price increases by a factor of 1.36, i.e.
by 36%, when the car is foreign, and decreases by a factor of .93, i.e.
by 7%, for every additional mile per gallon.
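The factors shown by -eform- are just the exponentiated coefficients, and
the confidence limits are the exponentiated limits from the log scale. A
small sketch for checking this by hand, refitting the same model so that
it runs on its own (the -display- labels and the -lincom- line are
additions for illustration, not part of the example above):

```stata
*-------------- begin example -----------------
sysuse auto, clear
gen byte baseline = 1
gen c_mpg = mpg - 20
glm price c_mpg foreign baseline, link(log) nocons
* _b[] holds the coefficients on the log scale, with or without -eform-
display "mean price, domestic, 20 mpg: " exp(_b[baseline])
display "foreign/domestic ratio:       " exp(_b[foreign])
display "% change per extra mpg:       " 100*(exp(_b[c_mpg]) - 1)
* ratio with its confidence limits, exponentiated from the log scale
lincom foreign, eform
*---------------- end example ----------------
```

The displayed values should agree with the table that -glm- prints when
-eform- is specified.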
> The predict command for returning prediction SE
> (stdp) also only returns the SE in the ln(Y) metric.
>
> I'd welcome further suggestions for deriving the 95% confidence interval in
> the original Y metric after either
For that type of problem I like the old -adjust- command, see: -help
adjust-. That help file says that it is superseded by the -margins-
command, but it is much easier to use if you want to create variables
(e.g. as preparation for creating graphs).
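For comparison, a minimal sketch of the -margins- route, assuming Stata 11
or later and using the auto data with price standing in for Y and foreign
for the factor (these stand-ins, and the matrix name b, are assumptions
and not from the posts). After -glm-, -margins- with predict(mu) reports
the predicted mean for each factor level in the Y metric, with
delta-method standard errors and confidence limits:

```stata
*-------------- begin example -----------------
sysuse auto, clear
glm price i.foreign, family(gaussian) link(log) vce(robust)
* predicted mean price (Y metric) at each level of the factor
margins foreign, predict(mu)
* -margins- leaves its estimates in r(b); copy them before they are lost
matrix b = r(b)
matrix list b
*---------------- end example ----------------
```

Unlike -adjust-, this leaves the results in r() rather than in new
variables, which is the extra step referred to above when the goal is a
graph.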
Hope this helps,
Maarten