I am using margins after an estimation that has time-series operators in the
independent variable list. How does margins calculate the means of the
independent variables?
Title: Marginal effects and time-series operators
Author: May Boggess, StataCorp
The way Stata commands can interact with time-series operators is really neat.
For example, you can summarize the first difference of a variable without
having to create a new variable containing the first differences. So, margins
has no trouble getting the means:
. sysuse auto, clear
(1978 Automobile Data)
. generate t=_n
. tsset t
Time variable: t, 1 to 74
Delta: 1 unit
. regress mpg L(0/2).turn
      Source |       SS           df       MS      Number of obs   =        72
-------------+----------------------------------   F(3, 68)        =     25.40
       Model |  1281.09504         3  427.031681   Prob > F        =    0.0000
    Residual |   1143.2244        68  16.8121235   R-squared       =    0.5284
-------------+----------------------------------   Adj R-squared   =    0.5076
       Total |  2424.31944        71  34.1453443   Root MSE        =    4.1003

------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        turn |
         --. |  -.9320462   .1237537    -7.53   0.000    -1.178993   -.6850996
         L1. |  -.1183241   .1266935    -0.93   0.354    -.3711369    .1344888
         L2. |   .0993164   .1244224     0.80   0.428    -.1489645    .3475974
             |
       _cons |   59.04122   5.622999    10.50   0.000      47.8207    70.26173
------------------------------------------------------------------------------
. summarize turn if e(sample)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        turn |         72    39.63889    4.460486         31         51
. summarize L.turn if e(sample)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        turn |
         L1. |         72    39.68056    4.449486         31         51
. summarize L2.turn if e(sample)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        turn |
         L2. |         72    39.73611    4.427803         31         51
. margins, dydx(_all) atmeans nose

Conditional marginal effects                                Number of obs = 72

Expression: Linear prediction, predict()
dy/dx wrt:  turn L.turn L2.turn
At: turn    = 39.63889 (mean)
    L.turn  = 39.68056 (mean)
    L2.turn = 39.73611 (mean)

--------------------------
             |      dy/dx
-------------+------------
        turn |
         --. |  -.9320462
         L1. |  -.1183241
         L2. |   .0993164
--------------------------
Note the if e(sample) on each summarize. The estimation used only 72 of the
74 observations, even though there are no missing values in mpg or turn.
The two observations were lost because the model includes the lag L2:
L2.turn is missing in the first and second observations, so those two are
not used in the estimation.
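One quick way to confirm where the two observations went is to count the missing values of each regressor. This is only a sketch: it reloads the data and repeats the tsset from above so that it can be run on its own.

```stata
* Reload the data and declare the time variable, as in the example above
sysuse auto, clear
generate t = _n
tsset t

* Time-series operators can be used directly in expressions
* turn itself has no missing values
count if missing(turn)
* the first observation has no first lag
count if missing(L.turn)
* the first two observations have no second lag
count if missing(L2.turn)
```

The last count should report 2, which accounts for the difference between the 74 observations in the dataset and the 72 used by regress.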
You can be forgiven for thinking that I could have been more efficient if I
had just thrown away the observations that are not in the e(sample)
before I used summarize. That would be OK if it
weren’t for the time-series operators:
. sysuse auto, clear
(1978 Automobile Data)
. generate t=_n
. tsset t
Time variable: t, 1 to 74
Delta: 1 unit
. regress mpg L(2/3).turn
      Source |       SS           df       MS      Number of obs   =        71
-------------+----------------------------------   F(2, 68)        =      2.85
       Model |  187.429767         2  93.7148836   Prob > F        =    0.0648
    Residual |  2236.45756        68  32.8890817   R-squared       =    0.0773
-------------+----------------------------------   Adj R-squared   =    0.0502
       Total |  2423.88732        70  34.6269618   Root MSE        =    5.7349

------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        turn |
         L2. |  -.2030227   .1684267    -1.21   0.232     -.539113    .1330675
         L3. |  -.2356865   .1698156    -1.39   0.170    -.5745484    .1031753
             |
       _cons |    38.7856   7.345561     5.28   0.000     24.12776    53.44344
------------------------------------------------------------------------------
. summarize L2.turn if e(sample)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        turn |
         L2. |         71    39.73239    4.459205         31         51
. summarize L3.turn if e(sample)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        turn |
         L3. |         71    39.80282    4.422733         31         51
. margins, eydx(L2.turn L3.turn) atmeans nose

Conditional marginal effects                                Number of obs = 71

Expression: Linear prediction, predict()
ey/dx wrt:  L2.turn L3.turn
At: L2.turn = 39.73239 (mean)
    L3.turn = 39.80282 (mean)

--------------------------
             |      ey/dx
-------------+------------
        turn |
         L2. |  -.0095146
         L3. |  -.0110454
--------------------------
. keep if e(sample)
(3 observations deleted)
. summarize L2.turn

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        turn |
         L2. |         69     39.7971     4.48717         31         51
. summarize L3.turn

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        turn |
         L3. |         68    39.86765    4.481819         31         51
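Now we can see that the means obtained after throwing observations away are
not the correct ones: they are computed from 69 and 68 observations rather
than 71, and they no longer match the means reported by margins. Once the
first three observations are dropped, t runs from 4 to 74, so L2.turn has
nothing to look back to at t = 4 and 5, and L3.turn has nothing to look back
to at t = 4, 5, and 6. We need all the observations to create L2.turn and
L3.turn the same way they were during the estimation.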
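If you really do want to drop the observations outside the estimation sample, one possible workaround is to store the lags as ordinary variables while every observation is still present. The sketch below is an illustration only; the variable names turn_L2 and turn_L3 are made up for this example.

```stata
sysuse auto, clear
generate t = _n
tsset t
regress mpg L(2/3).turn

* Build the lags while all 74 observations are still in memory
* (turn_L2 and turn_L3 are illustrative names)
generate turn_L2 = L2.turn
generate turn_L3 = L3.turn

* Dropping observations no longer affects the stored lags
keep if e(sample)
summarize turn_L2 turn_L3
```

Because the lags were computed before keep, summarize should use all 71 observations and reproduce the means reported by margins. In most cases, simply adding if e(sample) to summarize, as in the examples above, is the simpler choice.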