Title | Calculating proportional change of categorical covariates
Author | Chris Cheng, Staff Econometrician
Suppose that, after fitting a linear regression such as

$$\mathsf E[y|x] = a + bx$$

we would like to obtain the proportional change in y for a change from 0 to 1 in the binary variable x, using the following formula:

$$\frac {\mathsf E(\hat{y}|x = 1) - \mathsf E(\hat{y}|x = 0)}{\mathsf E(\hat{y}|x = 0)}$$

Here is an example of how we can compute this proportional change manually:
    . sysuse auto, clear
    (1978 automobile data)

    . quietly regress price mpg turn i.foreign

    . generate price0 = _b[_cons] + _b[mpg]*mpg + _b[turn]*turn

    . generate price1 = _b[_cons] + _b[mpg]*mpg + _b[turn]*turn + _b[1.foreign]

    . generate propch = (price1 - price0)/price0

    . summarize propch

        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
          propch |         74    .6437617    1.405657   .2495446   12.36855
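The generate steps above can be collapsed algebraically, which may help show what propch measures. Writing $\hat\beta$ for the stored coefficients `_b[...]` and $i$ for an observation (notation introduced here for illustration only), the baseline prediction cancels in the numerator:

$$
\mathrm{propch}_i
= \frac{\mathrm{price1}_i - \mathrm{price0}_i}{\mathrm{price0}_i}
= \frac{\hat\beta_{\mathrm{foreign}}}{\hat\beta_{0} + \hat\beta_{\mathrm{mpg}}\,\mathrm{mpg}_i + \hat\beta_{\mathrm{turn}}\,\mathrm{turn}_i}
$$

The numerator is the same for every car, so propch varies across observations only through the denominator. The mean reported by summarize is the average of this ratio over the 74 observations, and the large maximum corresponds to a car whose baseline prediction price0 is comparatively small.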
If we would like to compute its associated standard error, we can use the margins command with the expression() option:
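    . margins, expression(_b[1.foreign]/(_b[_cons] + _b[mpg]*mpg + _b[turn]*turn))
    warning: option expression() does not contain option predict() or xb().

    Predictive margins                                   Number of obs = 74
    Model VCE: OLS

    Expression: _b[1.foreign]/(_b[_cons] + _b[mpg]*mpg + _b[turn]*turn)

    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           _cons |   .6437617   1.282864     0.50   0.616    -1.870606     3.15813
    ------------------------------------------------------------------------------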
If x is a categorical variable with more than two groups, we can similarly type
    . webuse lbw, clear
    (Hosmer & Lemeshow data)

    . quietly regress bwt age lwt i.race

    . generate bwt1 = _b[_cons] + _b[lwt]*lwt

    . generate bwt2 = _b[_cons] + _b[lwt]*lwt + _b[2.race]

    . generate bwt3 = _b[_cons] + _b[lwt]*lwt + _b[3.race]

    . generate prop21 = (bwt2 - bwt1)/bwt1

    . generate prop31 = (bwt3 - bwt1)/bwt1

    . summarize prop21 prop31

        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
          prop21 |        189   -.1465782    .0063929  -.1581653   -.123847
          prop31 |        189    -.078879    .0034403  -.0851145  -.0666465
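To obtain standard errors for these two proportional changes, a margins call along the following lines could be used. This is only a sketch and an assumption about the exact syntax, reconstructed from the manual computation above: the numerator multiplies each nonbase coefficient by the indicator for its level, and the denominator reuses the linear index from the generate commands.

```stata
* Sketch only: the exact form of expression() is an assumption, not a quoted command.
webuse lbw, clear
quietly regress bwt age lwt i.race

* Numerator: each nonbase coefficient times the indicator for its level.
* Denominator: the same baseline index used to build bwt1 above.
margins, dydx(race) expression((_b[2.race]*2.race + _b[3.race]*3.race)/(_b[_cons] + _b[lwt]*lwt))
```

If this matches the intended command, the dy/dx column should reproduce the means of prop21 and prop31, now accompanied by delta-method standard errors.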
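    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            race |
          Black  |  -.1465782   .0502353    -2.92   0.004    -.2450377   -.0481188
          Other  |   -.078879   .0361615    -2.18   0.029    -.1497543   -.0080038
    ------------------------------------------------------------------------------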
With margins, we can combine the expression() and dydx() options in a somewhat tricky way to reproduce the proportional changes that we obtained in the previous manual computations. In the numerator of our expression, we include, for each nonbase group, its difference from the base multiplied by an indicator for that group. Because race is a factor variable, dydx() computes the discrete change from the base level rather than a derivative, so we end up with the proportional change for each level, race = 2 and race = 3, compared with the base, race = 1.

For more details, please see the blog post "Using the margins command with different functional forms: Proportional versus natural logarithm changes". This post also contains examples for nonlinear models.
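The discrete-change argument can be written out explicitly. In this sketch, $g$ is a label introduced here for the expression passed to margins, $\hat\beta$ stands for the stored coefficients, and the denominator is the baseline index used to build bwt1 in the generate commands. All of this notation is an illustrative assumption.

$$
g(r, \mathrm{lwt}_i) = \frac{\hat\beta_{2}\,\mathbf{1}[r = 2] + \hat\beta_{3}\,\mathbf{1}[r = 3]}{\hat\beta_{0} + \hat\beta_{\mathrm{lwt}}\,\mathrm{lwt}_i}
$$

At the base level both indicators are zero, so $g(1, \mathrm{lwt}_i) = 0$, and the discrete change for race = 2 is

$$
g(2, \mathrm{lwt}_i) - g(1, \mathrm{lwt}_i)
= \frac{\hat\beta_{2}}{\hat\beta_{0} + \hat\beta_{\mathrm{lwt}}\,\mathrm{lwt}_i}
= \frac{\mathrm{bwt2}_i - \mathrm{bwt1}_i}{\mathrm{bwt1}_i}
= \mathrm{prop21}_i
$$

The same steps with $\hat\beta_{3}$ give $\mathrm{prop31}_i$. Averaging each discrete change over the 189 observations yields the dy/dx column, which is why it matches the means of prop21 and prop31.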