Completely orthogonal to Vince's very nice code
is a comment on the much analysed relationship
between mpg and weight. A comment made in many places
is that in many ways the reciprocal scale, gpm =
1 / mpg, is a more natural scale for analysis,
on elementary physical grounds. This is usually
followed by a transformation to -gpm- and a linear
regression. A comment made less often is that -glm- with
a reciprocal link offers another way to do it.
So,
sysuse auto
gen weight2 = weight^2
reg mpg weight weight2
predict p_quad
glm mpg weight, link(power -1)
predict p_glmrec
scatter mpg weight || mspline p_quad weight, bands(200) ///
    || mspline p_glmrec weight, bands(200)
-- and fortunately, or fortuitously, or both,
you can see that the two predictions are
essentially identical. The -glm- prediction
would, however, extrapolate rather better
and it is easier to entertain different error
families.
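
To make the comparison concrete, here is a minimal sketch, not a
definitive analysis. It sets the quadratic fit beside the usual
regression on -gpm- (back-transformed to mpg by taking reciprocals),
the Gaussian -glm- with reciprocal link, and the same link with a
gamma family. The new variable names (gpm, p_gpm, p_ols, p_gamma), the
choice of the gamma family and the 6000 lb evaluation point are
illustrative assumptions only; 6000 lb lies beyond the weights in the
auto data, so the two -display- lines show how each fit extrapolates.

```stata
sysuse auto, clear
gen weight2 = weight^2
gen gpm = 1 / mpg

* (1) quadratic in weight on the mpg scale
regress mpg weight weight2
predict p_quad
display "quadratic fit at 6000 lb: " _b[_cons] + _b[weight]*6000 + _b[weight2]*6000^2

* (2) linear regression on the reciprocal scale, back-transformed to mpg
regress gpm weight
predict p_gpm
gen p_ols = 1 / p_gpm

* (3) reciprocal link, default Gaussian family: 1/mu is linear in weight
glm mpg weight, link(power -1)
predict p_glmrec
display "glm fit at 6000 lb: " 1 / (_b[_cons] + _b[weight]*6000)

* (4) same link, gamma family (an assumed choice, for illustration)
glm mpg weight, family(gamma) link(power -1)
predict p_gamma

twoway scatter mpg weight || line p_quad p_ols p_glmrec p_gamma weight, sort ///
    legend(order(2 "quadratic" 3 "1/(OLS on gpm)" 4 "glm Gaussian" 5 "glm gamma"))
```

Note that the reciprocal of a predicted gpm is not quite the same thing
as a predicted mean mpg, which is one reason to prefer the -glm- route.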
Just a thought, as Marcello might say.
Nick
Vince Wiggins wrote,
> Scott Merryman wrote,
>
> > In the June 2004 issue of the American Economic Review, the back
> > cover has an ad from Stata emphasizing the graphics of Stata 8. One
> > of the graphs shows a scatter plot with a regression line and
> > confidence interval densities. It looks something like the graph on
> > page 2 of
> >
> > http://www.asft.ttu.edu/ansc5403/lecture25.pdf
> >
> > How does one include the confidence densities in a regression line
> > graph?
>
> This graph superimposes vertical density line plots for the distribution of
> the disturbances on a regression line.  Such graphs are sometimes seen in
> textbooks when trying to provide intuition for linear regression.  For data
> analysis, the confidence intervals shown by -twoway lfitci y x- are easier to
> read, but the graph from the ad has its own appeal.  Here is the code used to
> produce that graph,
>
> ---------------------------------- BEGIN --- regline_ci.do --- CUT HERE -------
> clear
> sysuse auto
> keep if foreign
> sort weight
>
> gen weight2 = weight^2
> regress mpg weight weight2
> predict fit
> predict se , stdp
>
> #delimit ;
> twoway sc mpg weight , pstyle(p3) ms(o)
>     ||
>   fn `=weight[3]' - 1000 * normden(x, `=fit[3]' , `=se[3]') ,
>      range(`=fit[3] -5' `=fit[3] +5') horiz pstyle(p1)
>     ||
>   fn `=fit[3]' ,
>      range(`=weight[3]' `=weight[3]-1000*normden(0, se[3])') pstyle(p1)
>     ||
>   fn `=weight[17]' - 1000 * normden(x, `=fit[17]' , `=se[17]') ,
>      range(`=fit[17] -5' `=fit[17] +5') horiz pstyle(p1)
>     ||
>   fn `=fit[17]' ,
>      range(`=weight[17]' `=weight[17]-1000*normden(0, se[17])') pstyle(p1)
>     ||
>   fn `=weight[21]' - 1000 * normden(x, `=fit[21]' , `=se[21]') ,
>      range(`=fit[21] -7' `=fit[21] +7') horiz pstyle(p1)
>     ||
>   fn `=fit[21]' ,
>      range(`=weight[21]' `=weight[21]-1000*normden(0, se[21])') pstyle(p1)
>     ||
>   line fit weight
>     , clwidth(*2) legend(off) ytitle(Miles per gallon) xtitle(Weight)
>       title("Scatter with Regression Line and Confidence Interval Densities"
>             , size(4.8) margin(t=0 b=1.5) span)
> ;
> #delimit cr
> ---------------------------------- END --- regline_ci.do --- CUT HERE -------
>
> The graph is cute in that the CI densities are not notional, but rather the
> actual CIs from our regression of -mpg- on -weight- and -weight- squared.  We
> have pulled the SE estimates from the regression fit, SEs obtained with
> -predict se , stdp-, at observations 3, 17, and 21 and supplied those to the
> -fn- (or -function-) plots using the -normden()- function to get our CI lines
> (we cheated ever so slightly and did not use a t-distribution).  Note that we
> scale the result of -normden()- by 1000 so that it looks about right on the
> scale of the weight axis -- a scale that runs from 1,500 to 3,500.  We need to
> do this because the X-axis is not scaled as a density.  Our choice of 1000 as
> the scaling is arbitrary -- we can only compare the relative heights of the CI
> densities on this graph.  We also took some care to get an appropriate range
> in the -mpg- dimension for each of our CI densities.
>
> The other three -fn- plots just draw the drop lines from the top of the CI
> densities to the regression line.
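
On the remark about not using a t-distribution: the following is a
minimal sketch, not the code behind the ad, of how the density curves
might be built in a loop with a scaled t density in place of
-normden()-. It assumes that -tden(df, t)- is available and that the
residual degrees of freedom are left behind in e(df_r) by -regress-.
The local macro names (df, w, f, s, lo, hi, plots) are hypothetical,
the +/- 5 mpg range is applied to all three observations for simplicity
(the original used +/- 7 for observation 21), and the drop lines are
omitted.

```stata
sysuse auto, clear
keep if foreign
sort weight

gen weight2 = weight^2
regress mpg weight weight2
predict fit
predict se, stdp

* residual degrees of freedom for the t density
local df = e(df_r)

* build up one -function- plot per chosen observation
local plots
foreach i of numlist 3 17 21 {
    local w  = weight[`i']
    local f  = fit[`i']
    local s  = se[`i']
    local lo = `f' - 5
    local hi = `f' + 5
    * density of fit + se*T is tden(df, (x - fit)/se)/se; 1000 is the
    * same arbitrary scaling to the weight axis as in the original
    local plots `plots' || function `w' - 1000 * tden(`df', (x - `f') / `s') / `s', ///
        range(`lo' `hi') horizontal pstyle(p1)
}

twoway scatter mpg weight, pstyle(p3) ms(o) `plots' || line fit weight, legend(off)
```

Evaluating weight, fit and se into local macros first means that the
-function- plots receive plain numbers and do not need to refer to
variables in the dataset.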