Scott Merryman <[email protected]> wrote,
> In the June 2004 issue of the American Economic Review, the back
> cover has an ad from Stata emphasizing the graphics of Stata 8. One
> of the graphs shows a scatter plot with a regression line and
> confidence interval densities. It looks something like the graph on
> page 2 of
> How does one include the confidence densities in a regression line graph?
This graph superimposes vertical density line plots on a regression line. Such
graphs are sometimes seen in textbooks, where the densities stand for the
distribution of the disturbances, when trying to provide intuition for linear
regression. For data analysis, the confidence intervals shown by
-twoway lfitci y x- are easier to read, but the graph from the ad has its own
appeal. Here is the code used to produce that graph:
---------------------------------- BEGIN --- --- CUT HERE -------
sysuse auto
keep if foreign
sort weight
gen weight2 = weight^2
regress mpg weight weight2
predict fit
predict se , stdp
#delimit ;
twoway sc mpg weight , pstyle(p3) ms(o) ||
fn weight[3] - 1000 * normden(x, `=fit[3]' , `=se[3]') ,
range(`=fit[3] -5' `=fit[3] +5') horiz pstyle(p1) ||
fn `=fit[3]' , range(`=weight[3]' `=weight[3]-1000*normden(0, se[3])')
pstyle(p1) ||
fn weight[17] - 1000 * normden(x, `=fit[17]', `=se[17]') ,
range(`=fit[17]-5' `=fit[17]+5') horiz pstyle(p1) ||
fn `=fit[17]', range(`=weight[17]' `=weight[17]-1000*normden(0, se[17])')
pstyle(p1) ||
fn weight[21] - 1000 * normden(x, `=fit[21]' , `=se[21]') ,
range(`=fit[21] -7' `=fit[21] +7') horiz pstyle(p1) ||
fn `=fit[21]', range(`=weight[21]' `=weight[21]-1000*normden(0, se[21])')
pstyle(p1) ||
line fit weight
, clwidth(*2) legend(off) ytitle(Miles per gallon) xtitle(Weight)
title("Scatter with Regression Line and Confidence Interval Densities"
, size(4.8) margin(t=0 b=1.5) span)
#delimit cr
---------------------------------- END --- --- CUT HERE -------
The graph is cute in that the CI densities are not notional, but rather are
built from the actual standard errors from our regression of -mpg- on -weight-
and -weight- squared. We took the SEs of the fitted values, obtained with
-predict se , stdp-, at observations 3, 17, and 21 and supplied them, along
with the fitted values, to the -fn- (or -function-) plots, using the
-normden()- function to draw the density curves (we cheated ever so slightly
and did not use a t-distribution). Note that we scale the result of
-normden()- by 1000 so that it looks about right on the scale of the weight
axis -- a scale that runs from 1,500 to 3,500. We need to do this because the
x-axis is measured in units of weight, not density. Our choice of 1000 is
arbitrary, so only the relative heights of the CI densities can be compared on
this graph. We also took some care to choose an appropriate range in the
-mpg- dimension for each CI density: plus or minus 5 around the fitted value
at observations 3 and 17, and plus or minus 7 at observation 21.
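For anyone who would rather not cheat, here is a minimal sketch of one density
drawn from a t-distribution instead. It assumes the same regression as above
and uses -tden(df, t)- with the residual degrees of freedom from -e(df_r)-;
the local macro names (i, w, f, s, df) are made up for the illustration, and
the curve is rescaled by hand because -tden()- has no location or scale
arguments.

```stata
* Sketch only: a t-based density at observation 3 (run as a do-file).
sysuse auto, clear
keep if foreign
sort weight
generate weight2 = weight^2
regress mpg weight weight2
predict fit
predict se, stdp

local i  = 3
local w  = weight[`i']      // where the density sits on the weight axis
local f  = fit[`i']         // fitted mpg, the center of the density
local s  = se[`i']          // SE of the fitted value
local df = e(df_r)          // residual degrees of freedom from -regress-

* The density of fit + se*T, with T ~ t(df), is tden(df, (y - fit)/se)/se.
twoway (scatter mpg weight, msymbol(o))                                   ///
       (function `w' - 1000*tden(`df', (x - `f')/`s')/`s',                ///
            range(`=`f'-5' `=`f'+5') horizontal)                          ///
       (line fit weight), legend(off)
```

With 19 residual degrees of freedom the t curve is a little lower at the peak
and heavier in the tails than the normal curve, so the picture changes only
slightly.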
The other three -fn- plots just draw the drop lines: each is a constant
function at the fitted value, running from the peak of the CI density back to
the regression line.
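Because each density and its drop line follow the same pattern, the three
pairs can also be built in a loop. The following is a sketch rather than the
code behind the ad: the local macro names (plots, w, f, s, peak) are invented
for the example, it uses -normalden()- and -lwidth()-, the names current Stata
documents for -normden()- and -clwidth()-, and it uses a range of plus or
minus 5 for all three densities.

```stata
* Sketch only: build the density and drop-line plots in a loop (do-file).
sysuse auto, clear
keep if foreign
sort weight
generate weight2 = weight^2
regress mpg weight weight2
predict fit
predict se, stdp

local plots
foreach i of numlist 3 17 21 {
    local w    = weight[`i']                      // position on the weight axis
    local f    = fit[`i']                         // fitted mpg, center of density
    local s    = se[`i']                          // SE of the fitted value
    local peak = `w' - 1000*normalden(0, 0, `s')  // leftmost point of the curve
    local plots `plots'                                                   ///
        (function `w' - 1000*normalden(x, `f', `s'),                      ///
            range(`=`f'-5' `=`f'+5') horizontal pstyle(p1))               ///
        (function `f', range(`peak' `w') pstyle(p1))
}

twoway (scatter mpg weight, pstyle(p3) msymbol(o)) `plots'                ///
       (line fit weight, lwidth(*2)),                                     ///
       legend(off) ytitle(Miles per gallon) xtitle(Weight)
```

The values are copied into local macros before -twoway- runs, so the
-function- plots do not depend on the dataset at all; changing the numlist
moves the densities to other observations.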
-- Vince
[email protected]