Title: Marginal effects and the nodrop option
Author: May Boggess, StataCorp
Date: April 2004
In calculating a marginal effect, mfx must use the predict command to obtain the values of the prediction function. The predict command works by creating a new variable and putting the predicted value for each observation into that new variable. So, the prediction goes into the dataset.
How many observations does mfx need to predict into in order to function properly? Well, most of the time, the answer is one, and to save computation time, mfx temporarily drops all the observations except the first one in the e(sample), if it is safe to do so.
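The idea of predicting into a single remaining observation can be tried by hand. The following is only a sketch and not mfx's internal code: the auto dataset, the logit model, and the names p_full, p_one, p_before, and p_after are assumptions chosen for illustration, and the sketch relies on every observation of that dataset being in e(sample), so that observation 1 is also the first e(sample) observation.

```stata
* Illustrative model (assumption: not taken from the FAQ itself)
sysuse auto, clear
quietly logit foreign mpg weight

* Prediction for observation 1 with all the data in memory
quietly predict double p_full, p
scalar p_before = p_full[1]

* Temporarily keep only the first e(sample) observation and predict again
preserve
quietly keep if e(sample)
quietly keep in 1
quietly predict double p_one, p
scalar p_after = p_one[1]
restore

* If the two values agree, predicting into one observation is harmless here
display "before drop: " p_before "   after drop: " p_after
```

For a model such as logit, where each prediction depends only on that observation's own covariates, the two values should coincide; this is the kind of comparison that the diagnostics(drop) output reports.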
How do you know if mfx has concluded that it is safe to work with only one observation? You can use the diagnostics(drop) option. Let's look at an example:
. webuse friedman2, clear

. keep if tin( ,1981q4)
(67 observations deleted)

. arima consump m2, ar(1) ma(1) nolog

ARIMA regression

Sample:  1959q1 to 1981q4                       Number of obs      =        92
                                                Wald chi2(3)       =   4394.80
Log likelihood = -340.5077                      Prob > chi2        =    0.0000

------------------------------------------------------------------------------
             |                 OPG
     consump |      Coef.   Std. Err.      z    p>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
consump      |
          m2 |   1.122029   .0363563    30.86   0.000     1.050772    1.193286
       _cons |  -36.09872   56.56703    -0.64   0.523    -146.9681    74.77062
-------------+----------------------------------------------------------------
ARMA         |
          ar |
          L1 |   .9348486   .0411323    22.73   0.000     .8542308    1.015467
          ma |
          L1 |   .3090592   .0885883     3.49   0.000     .1354293    .4826891
-------------+----------------------------------------------------------------
      /sigma |   9.655308   .5635157    17.13   0.000     8.550837    10.75978
------------------------------------------------------------------------------

. mfx, predict(xb structural) diagnostics(drop)

Predict into observation 1 = 828.33238
Predict error after drop.
note: nodrop option enforced.
All e(sample) observations kept: N = 92

Marginal effects after arima
      y  = xb prediction, structural one-step (predict, xb structural)
         =  828.33239
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    p>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
      m2 |   1.122029      .03636   30.86   0.000   1.05077  1.19329   770.418
------------------------------------------------------------------------------
We see that we get a predict error after keeping only one observation. Let's use predict by itself and see if the same thing happens:
. webuse friedman2, clear

. keep if tin( ,1981q4)
(67 observations deleted)

. quietly arima consump m2, ar(1) ma(1) nolog

. keep if e(sample)
(52 observations deleted)

. keep in 1
(91 observations deleted)

. predict xb, xb structural
Obs. nos. out of range
r(198);
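One way to investigate an error like this is to repeat the failing predict call with tracing switched on. The following is a hedged sketch: the use of preserve and restore, capture noisily, and the local macro rc are choices made here for convenience, not steps from the original session.

```stata
webuse friedman2, clear
keep if tin( ,1981q4)
quietly arima consump m2, ar(1) ma(1) nolog

* Work on a temporary copy so the full dataset comes back afterward
preserve
quietly keep if e(sample)
quietly keep in 1

* Trace the execution of predict; capture keeps the error from stopping the run
set trace on
capture noisily predict xb, xb structural
local rc = _rc
set trace off

display "predict return code: `rc'"
restore
```

The trace output can be long; the lines just before the error message are usually the ones worth reading.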
If we rerun predict with set trace on, we see that it refers to the other observations, which are no longer there. So, in this example, it is certainly safest to keep all observations in memory during mfx.
A marginal effect is a derivative of a function. By using the predict option of mfx, we specify the function for which we would like marginal effects. If none is specified, the default prediction for the preceding estimation command is used.
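To make the "derivative of the prediction function" idea concrete, here is a minimal sketch for a logit model, where the predicted probability is invlogit(xb) and its derivative with respect to a continuous regressor x is b*p*(1-p). The auto dataset, the model, and the scalar names are assumptions made for illustration; the hand calculation evaluates the derivative at the sample means of the regressors.

```stata
* Illustrative model (assumption: not taken from the FAQ itself)
sysuse auto, clear
quietly logit foreign mpg weight

* Means of the regressors over the estimation sample
quietly summarize mpg if e(sample)
scalar mean_mpg = r(mean)
quietly summarize weight if e(sample)
scalar mean_wgt = r(mean)

* Prediction function evaluated at the means: Pr(foreign) = invlogit(xb)
scalar xb_bar = _b[_cons] + _b[mpg]*mean_mpg + _b[weight]*mean_wgt
scalar p_bar  = invlogit(xb_bar)

* Analytic derivative of invlogit(xb) with respect to mpg
scalar dydx_mpg = _b[mpg]*p_bar*(1 - p_bar)
display "hand-computed dy/dx for mpg = " dydx_mpg

* mfx with the same prediction function, for comparison
mfx, predict(p)
```

The dy/dx that mfx reports for mpg should be close to the hand-computed value, since both differentiate the same prediction function at the same point.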
We use the nodrop option to specify that mfx keep all the observations in memory during its calculations. Let's see how it works:
. webuse sysdsn3, clear
(Health insurance data)

. mlogit insure age male nonwhite site2 site3, nolog

Multinomial logistic regression                   Number of obs   =        615
                                                  LR chi2(10)     =      42.99
                                                  Prob > chi2     =     0.0000
Log likelihood = -534.36165                       Pseudo R2       =     0.0387

------------------------------------------------------------------------------
      insure |      Coef.   Std. Err.      z    p>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Prepaid      |
         age |   -.011745   .0061946    -1.90   0.058    -.0238862    .0003962
        male |   .5616934   .2027465     2.77   0.006     .1643175    .9590693
    nonwhite |   .9747768   .2363213     4.12   0.000     .5115955    1.437958
       site2 |   .1130359   .2101903     0.54   0.591    -.2989296    .5250013
       site3 |  -.5879879   .2279351    -2.58   0.010    -1.034733   -.1412433
       _cons |   .2697127   .3284422     0.82   0.412    -.3740222    .9134476
-------------+----------------------------------------------------------------
Uninsure     |
         age |  -.0077961   .0114418    -0.68   0.496    -.0302217    .0146294
        male |   .4518496   .3674867     1.23   0.219     -.268411     1.17211
    nonwhite |   .2170589   .4256361     0.51   0.610    -.6171725     1.05129
       site2 |  -1.211563   .4705127    -2.57   0.010    -2.133751   -.2893747
       site3 |  -.2078123   .3662926    -0.57   0.570    -.9257327     .510108
       _cons |  -1.286943   .5923219    -2.17   0.030    -2.447872   -.1260135
------------------------------------------------------------------------------
(Outcome insure==Indemnity is the comparison group)

. mfx, predict(p outcome(1)) diagnostics(drop)

Predict into observation 1 = .48179251
Predict into obs 1 after drop = .48179251
Keep first e(sample) observation only.

Marginal effects after mlogit
      y  = Pr(insure==1) (predict, p outcome(1))
         =  .48179251
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    p>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
     age |   .0028073      .00148    1.90   0.058  -.000096  .005711   44.4683
    male*|  -.1347111      .04683   -2.88   0.004  -.226494 -.042929   .250407
nonwhite*|  -.2138472      .05074   -4.21   0.000  -.313297 -.114397   .196748
   site2*|   .0096603      .05082    0.19   0.849  -.089942  .109263   .370732
   site3*|   .1333108      .05294    2.52   0.012   .029558  .237064   .313821
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1

. mfx, predict(p outcome(1)) diagnostics(drop) nodrop

Predict into observation 1 = .48179251
Predict into obs 1 after drop = .48179251
All e(sample) observations kept: N = 615

Marginal effects after mlogit
      y  = Pr(insure==1) (predict, p outcome(1))
         =  .48179251
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    p>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
     age |   .0028073      .00148    1.90   0.058  -.000096  .005711   44.4683
    male*|  -.1347111      .04683   -2.88   0.004  -.226494 -.042929   .250407
nonwhite*|  -.2138472      .05074   -4.21   0.000  -.313297 -.114397   .196748
   site2*|   .0096603      .05082    0.19   0.849  -.089942  .109263   .370732
   site3*|   .1333108      .05294    2.52   0.012   .029558  .237064   .313821
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
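The results are the same, as we would expect, but if you run this example, you will notice that mfx takes much longer to run with the nodrop option. So we would rarely want to specify this option ourselves. As the arima example showed, mfx enforces nodrop on its own when predicting into a single observation fails.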
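The difference in run time can be measured with Stata's timer commands. This is a small sketch under the assumption that the mlogit model above is the one being timed; the timer numbers 1 and 2 are arbitrary, and the timings themselves depend on the machine, so none are shown.

```stata
webuse sysdsn3, clear
quietly mlogit insure age male nonwhite site2 site3, nolog

timer clear

* Timer 1: mfx allowed to drop all but the first e(sample) observation
timer on 1
quietly mfx, predict(p outcome(1))
timer off 1

* Timer 2: mfx forced to keep every e(sample) observation in memory
timer on 2
quietly mfx, predict(p outcome(1)) nodrop
timer off 2

* Report the elapsed seconds for both runs
timer list
```

Comparing the two timers gives a rough sense of how much work the default dropping saves on a given dataset.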