This FAQ applies only to Stata 8 and earlier versions of Stata.

What does the nodrop option do in mfx?

Title   Marginal effects and the nodrop option
Author May Boggess, StataCorp
Date April 2004

A marginal effect is the derivative of a function with respect to one of its covariates. The predict() option of mfx specifies the function for which we want marginal effects. If predict() is not specified, mfx uses the default prediction of the preceding estimation command.
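
As a minimal sketch of the difference, the following commands request marginal effects twice after the same model, first for the default prediction and then for the linear prediction. The auto dataset and the logit model are choices made here for illustration and are not part of the original example.

        . sysuse auto, clear
        
        . quietly logit foreign mpg weight
        
        . * no predict() option: mfx uses the default prediction after logit,
        . * the probability of a positive outcome
        . mfx
        
        . * predict(xb): marginal effects for the linear prediction instead
        . mfx, predict(xb)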

In calculating a marginal effect, mfx must use the predict command to obtain the values of the prediction function. The predict command works by creating a new variable and putting the predicted value for each observation into that new variable. So, the prediction goes into the dataset.

How many observations does mfx need to predict into to work properly? Most of the time, the answer is one. To save computation time, mfx therefore temporarily drops every observation except the first one in e(sample), provided it is safe to do so.
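
The idea can be imitated by hand. The sketch below is only an illustration and not mfx's internal code: the auto dataset, the logit model, and the variable name p1 are assumptions made here. It shows a case where predict still works with a single observation in memory, because the prediction for an observation depends only on that observation's own covariates.

        . sysuse auto, clear
        
        . quietly logit foreign mpg weight
        
        . * set the full dataset aside so that it can be restored afterward
        . preserve
        
        . * keep only the first observation of the estimation sample
        . quietly keep if e(sample)
        
        . quietly keep in 1
        
        . * predict into the single remaining observation
        . predict p1
        
        . list p1
        
        . * bring back all observations
        . restore

Note that mfx evaluates the prediction at the means of the covariates by default, whereas this sketch predicts at the first observation's own values; the point is only that predict does not need the other observations here.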

How do you know if mfx has concluded that it is safe to work with only one observation? You can use the diagnostics(drop) option. Let's look at an example:

        . webuse friedman2, clear
        
        . keep if tin( ,1981q4)  
        (67 observations deleted)
        
        . arima consump m2, ar(1)  ma(1) nolog  
        
        ARIMA regression
        
        Sample:  1959q1 to 1981q4                       Number of obs      =        92
                                                        Wald chi2(3)       =   4394.80
        Log likelihood = -340.5077                      Prob > chi2        =    0.0000
        
        ------------------------------------------------------------------------------
                     |                 OPG
        consump      |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        consump      |
        m2           |   1.122029   .0363563    30.86   0.000     1.050772    1.193286
        _cons        |  -36.09872   56.56703    -0.64   0.523    -146.9681    74.77062
        -------------+----------------------------------------------------------------
        ARMA         |
        ar           |
                  L1 |   .9348486   .0411323    22.73   0.000     .8542308    1.015467
        ma           |
                  L1 |   .3090592   .0885883     3.49   0.000     .1354293    .4826891
        -------------+----------------------------------------------------------------
              /sigma |   9.655308   .5635157    17.13   0.000     8.550837    10.75978
        ------------------------------------------------------------------------------
        
        . mfx, predict(xb structural) diagnostics(drop)
        
                 Predict into observation 1 = 828.33238
        
        Predict error after drop.
        
        note: nodrop option enforced.
        
        All e(sample) observations kept: N =    92
        
        Marginal effects after arima
              y  = xb prediction, structural one-step (predict, xb structural)
                 =  828.33239
        ------------------------------------------------------------------------------
        variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
        ---------+--------------------------------------------------------------------
              m2 |   1.122029      .03636   30.86   0.000   1.05077  1.19329   770.418
        ------------------------------------------------------------------------------

The diagnostics show a predict error once only one observation is kept, so mfx enforces the nodrop option on its own and keeps all 92 observations. Let's use predict by itself and see if the same thing happens:

        . webuse friedman2, clear
        
        . keep if tin( ,1981q4)  
        (67 observations deleted)
        
        . quietly arima consump m2, ar(1)  ma(1) nolog  
        
        . keep if e(sample)
        (52 observations deleted)
        
        . keep in 1
        (91 observations deleted)
        
        . predict xb, xb structural
        Obs. nos. out of range
        r(198);

If we rerun predict with set trace on, we see that it refers to the other observations, which are no longer in memory. So, in this example, it is safest to keep all observations in memory during mfx.
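
Continuing the session above, the traced rerun might look like the following sketch. The use of capture noisily is an addition made here so that the error does not halt a do-file, and the trace output is long, so it is not reproduced.

        . * avoid pausing at each screen of trace output
        . set more off
        
        . * echo each line of the predict program as it executes
        . set trace on
        
        . * rerun the failing command; capture noisily shows the output
        . * but lets execution continue after the error
        . capture noisily predict xb, xb structural
        
        . set trace off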

We use the nodrop option to specify that mfx keep all the observations in memory during its calculations. Let's see how it works:

        . webuse sysdsn3, clear
        (Health insurance data)
        
        . mlogit insure age male nonwhite site2 site3, nolog
        
        Multinomial logistic regression                   Number of obs   =        615
                                                          LR chi2(10)     =      42.99
                                                          Prob > chi2     =     0.0000
        Log likelihood = -534.36165                       Pseudo R2       =     0.0387
        
        ------------------------------------------------------------------------------
              insure |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
        Prepaid      |
                 age |   -.011745   .0061946    -1.90   0.058    -.0238862    .0003962
                male |   .5616934   .2027465     2.77   0.006     .1643175    .9590693
            nonwhite |   .9747768   .2363213     4.12   0.000     .5115955    1.437958
               site2 |   .1130359   .2101903     0.54   0.591    -.2989296    .5250013
               site3 |  -.5879879   .2279351    -2.58   0.010    -1.034733   -.1412433
               _cons |   .2697127   .3284422     0.82   0.412    -.3740222    .9134476
        -------------+----------------------------------------------------------------
        Uninsure     |
                 age |  -.0077961   .0114418    -0.68   0.496    -.0302217    .0146294
                male |   .4518496   .3674867     1.23   0.219     -.268411     1.17211
            nonwhite |   .2170589   .4256361     0.51   0.610    -.6171725     1.05129
               site2 |  -1.211563   .4705127    -2.57   0.010    -2.133751   -.2893747
               site3 |  -.2078123   .3662926    -0.57   0.570    -.9257327     .510108
               _cons |  -1.286943   .5923219    -2.17   0.030    -2.447872   -.1260135
        ------------------------------------------------------------------------------
        (Outcome insure==Indemnity is the comparison group)
        
        . mfx, predict(p outcome(1)) diagnostics(drop)
        
                 Predict into observation 1 = .48179251
        
              Predict into obs 1 after drop = .48179251
        
        Keep first e(sample) observation only.
        
        Marginal effects after mlogit
              y  = Pr(insure==1) (predict, p outcome(1))
                 =  .48179251
        ------------------------------------------------------------------------------
        variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
        ---------+--------------------------------------------------------------------
             age |   .0028073      .00148    1.90   0.058  -.000096  .005711   44.4683
            male*|  -.1347111      .04683   -2.88   0.004  -.226494 -.042929   .250407
        nonwhite*|  -.2138472      .05074   -4.21   0.000  -.313297 -.114397   .196748
           site2*|   .0096603      .05082    0.19   0.849  -.089942  .109263   .370732
           site3*|   .1333108      .05294    2.52   0.012   .029558  .237064   .313821
        ------------------------------------------------------------------------------
        (*) dy/dx is for discrete change of dummy variable from 0 to 1
        
        . mfx, predict(p outcome(1)) diagnostics(drop) nodrop 
        
                 Predict into observation 1 = .48179251
        
              Predict into obs 1 after drop = .48179251
        
        All e(sample) observations kept: N =   615
        
        Marginal effects after mlogit
              y  = Pr(insure==1) (predict, p outcome(1))
                 =  .48179251
        ------------------------------------------------------------------------------
        variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
        ---------+--------------------------------------------------------------------
             age |   .0028073      .00148    1.90   0.058  -.000096  .005711   44.4683
            male*|  -.1347111      .04683   -2.88   0.004  -.226494 -.042929   .250407
        nonwhite*|  -.2138472      .05074   -4.21   0.000  -.313297 -.114397   .196748
           site2*|   .0096603      .05082    0.19   0.849  -.089942  .109263   .370732
           site3*|   .1333108      .05294    2.52   0.012   .029558  .237064   .313821
        ------------------------------------------------------------------------------
        (*) dy/dx is for discrete change of dummy variable from 0 to 1

The results are the same, as we would expect, but if you run this example, you will notice that mfx takes much longer to run with the nodrop option. Because mfx enforces nodrop itself when dropping is not safe, as in the arima example, we would rarely want to specify this option.
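
One way to see the difference in run time, sketched here with the mlogit results from above still in memory, is to turn on Stata's return messages, which report the elapsed time after each command. The timings depend on the machine, so none are shown.

        . * report elapsed time after every command
        . set rmsg on
        
        . * default behavior: only the first e(sample) observation is kept
        . mfx, predict(p outcome(1))
        
        . * all e(sample) observations are kept in memory
        . mfx, predict(p outcome(1)) nodrop
        
        . set rmsg off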
