From | "Dimitriy V. Masterov" <dvmaster@gmail.com> |
To | Statalist <statalist@hsphsun2.harvard.edu> |
Subject | st: comparing biprobit marginal effects CIs |
Date | Thu, 11 Apr 2013 18:25:08 -0700 |
I am trying to decide between 3-5 different ways of calculating the AME of treatment from a biprobit with endogenous treatment. At the bottom, there is a toy example using the private school attendance data that captures the problem with the code and output. The model seems to fit pretty well.

Here's a short summary. I am calculating the AME with

1) margins, which gives an AME of .103 with 95% CI of (-.701, .907)
2) AME by hand #1, AME of .095 with (.092, 0.098)
3) AME by hand #2, AME of .095 with (.092, 0.098)
4) bootstrapped ATE using user-written biprobittreat: AME/ATE of .095
   (-.501,.691) (Normal CI)
   (-.593,.352) (Percentile CI)
   (-.572,.372) (Bias-Corrected CI)

Methods (1)-(3) were suggested by Austin Nichols in his presentation on binary regression (http://www.stata.com/meeting/chicago11/materials/chi11_nichols.pdf). biprobittreat, the GoF test, and the papers are available from Richard Chiburis' site (https://webspace.utexas.edu/rcc485/www/code.html).

All the AMEs are close, but the CIs are fairly different. Method (1) is convenient, but probably wrong. Methods (2) and (3) use finite difference and give identical results. The bootstrapped AME matches (2) and (3), but the confidence intervals are much closer to what margins produces.

Two questions. Which CIs would you prefer and why? Is the regression of the ME on constant in (2) and (3) the right way to construct the AME CI?

Here's the code with output:

. webuse school;

. biprobit (private = years loginc vote) (vote = year loginc logptax ), robust;

Fitting comparison equation 1:

Iteration 0:   log pseudolikelihood = -31.967097
Iteration 1:   log pseudolikelihood = -30.915551
Iteration 2:   log pseudolikelihood = -30.890586
Iteration 3:   log pseudolikelihood = -30.890555
Iteration 4:   log pseudolikelihood = -30.890555

Fitting comparison equation 2:

Iteration 0:   log pseudolikelihood = -63.036914
Iteration 1:   log pseudolikelihood = -58.534843
Iteration 2:   log pseudolikelihood = -58.497292
Iteration 3:   log pseudolikelihood = -58.497288

Comparison:    log pseudolikelihood = -89.387844

Fitting full model:

Iteration 0:   log pseudolikelihood = -89.387844
Iteration 1:   log pseudolikelihood = -89.274953
Iteration 2:   log pseudolikelihood =  -89.21596
Iteration 3:   log pseudolikelihood = -89.209977
Iteration 4:   log pseudolikelihood = -89.209866
Iteration 5:   log pseudolikelihood = -89.209866

Seemingly unrelated bivariate probit              Number of obs   =         95
                                                  Wald chi2(6)    =      14.69
Log pseudolikelihood = -89.209866                 Prob > chi2     =     0.0228

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
private      |
       years |  -.0082564   .0176725    -0.47   0.640    -.0428939    .0263811
      loginc |   .2227815   .7555846     0.29   0.768    -1.258137      1.7037
        vote |   .5056949    1.70658     0.30   0.767     -2.83914     3.85053
       _cons |   -3.64063   7.195005    -0.51   0.613    -17.74258    10.46132
-------------+----------------------------------------------------------------
vote         |
       years |  -.0175658   .0174327    -1.01   0.314    -.0517332    .0166017
      loginc |   .9860015   .4170414     2.36   0.018     .1686155    1.803388
     logptax |  -1.287585   .5158509    -2.50   0.013    -2.298634   -.2765358
       _cons |  -.4144151   4.354969    -0.10   0.924    -8.949997    8.121167
-------------+----------------------------------------------------------------
     /athrho |  -.6420293   1.542175    -0.42   0.677    -3.664636    2.380577
-------------+----------------------------------------------------------------
         rho |  -.5662797   1.047641                     -.9986888    .9830337
------------------------------------------------------------------------------
Wald test of rho=0:            chi2(1) =  .173318         Prob > chi2 = 0.6772

. scoregof;

Murphy's score test for biprobit
        chi2(9) =      2.61
    Prob > chi2 =    0.9778

. bphltest;

Modified Hosmer-Lemeshow goodness-of-fit test for biprobit
      chi2( 21) =     27.63
    Prob > chi2 =    0.1510

. /* AME Margins Way */
> margins, dydx(vote) predict(pmarg1) force;
(note: prediction is a function of possibly stochastic quantities other than e(b))

Average marginal effects                          Number of obs   =         95
Model VCE    : Robust

Expression   : Pr(private=1), predict(pmarg1)
dy/dx w.r.t. : vote

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        vote |    .103085   .4101601     0.25   0.802     -.700814     .906984
------------------------------------------------------------------------------

. /* AME by Hand # 1 */
> predict double xb2, xb2;

. // linear index for equation 2
> ren vote Tvote;

. gen vote=0;

. predict double p0, pmarg1;

. // success for equation 1 with vote == 0
> predict double xb0, xb1;

. // index for equation 1 with vote == 0
>
> replace vote=1;
(95 real changes made)

. predict double p1, pmarg1;

. // success for equation 1 with vote == 1
> predict double xb1, xb1;

. // index for equation 1 with vote == 1
> gen double dp=p1-p0;

. // calculate diff in
> sum dp;

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
          dp |        95    .0949194    .0151578   .0492599   .1230978

. reg dp;

      Source |       SS       df       MS              Number of obs =      95
-------------+------------------------------           F(  0,    94) =    0.00
       Model |           0     0           .           Prob > F      =       .
    Residual |   .02159723    94  .000229758           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared =  0.0000
       Total |   .02159723    94  .000229758           Root MSE      =  .01516

------------------------------------------------------------------------------
          dp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0949194   .0015552    61.04   0.000     .0918316    .0980072
------------------------------------------------------------------------------

. /* AME by Hand # 2 */
> gen double pdx=(binormal(xb1,xb2,e(rho))-binormal(xb0,xb2,e(rho)))/normal(xb2) if Tvote==1;
(95 missing values generated)

. qui replace pdx=normal(xb1)-normal(xb0);

. su pdx;

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         pdx |        95    .0949194    .0151578   .0492599   .1230978

. loc ATE2=r(mean);

. reg pdx;

      Source |       SS       df       MS              Number of obs =      95
-------------+------------------------------           F(  0,    94) =    0.00
       Model |           0     0           .           Prob > F      =       .
    Residual |   .02159723    94  .000229758           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared =  0.0000
       Total |   .02159723    94  .000229758           Root MSE      =  .01516

------------------------------------------------------------------------------
         pdx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0949194   .0015552    61.04   0.000     .0918316    .0980072
------------------------------------------------------------------------------

. replace vote=Tvote;
(36 real changes made)

. // set vote back to normal again
>
> /* AME using ATT */
> bootstrap _b ate=r(ate), reps(500) saving("bs_ate.dta", replace): biprobittreat (private = years loginc vote) (vote = year loginc logptax), robust;
(running biprobittreat on estimation sample)

Bootstrap replications (500)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
..................................................   150
..................................................   200
..................................................   250
..................................................   300
..................................................   350
.................................................x   400
..................................................   450
..................................................   500

Bootstrap results                               Number of obs      =        95
                                                Replications       =       499

      command:  biprobittreat (private = years loginc vote) (vote = year loginc logptax), robust
   [_eq4]ate:  r(ate)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
private      |
       years |  -.0082564   .0344714    -0.24   0.811     -.075819    .0593062
      loginc |   .2227815   .7495176     0.30   0.766    -1.246246    1.691809
        vote |   .5056949   1.331318     0.38   0.704     -2.10364    3.115029
       _cons |   -3.64063   7.297287    -0.50   0.618    -17.94305    10.66179
-------------+----------------------------------------------------------------
vote         |
       years |  -.0175658   .0257062    -0.68   0.494    -.0679491    .0328176
      loginc |   .9860015   .5231484     1.88   0.059    -.0393505    2.011354
     logptax |  -1.287585   .6197584    -2.08   0.038    -2.502289   -.0728808
       _cons |  -.4144151   5.665756    -0.07   0.942    -11.51909    10.69026
-------------+----------------------------------------------------------------
athrho       |
       _cons |  -.6420293   234.5964    -0.00   0.998    -460.4424    459.1584
-------------+----------------------------------------------------------------
_eq4         |
         ate |   .0949194   .3040172     0.31   0.755    -.5009434    .6907822
------------------------------------------------------------------------------
Note: one or more parameters could not be estimated in 1 bootstrap replicate;
      standard-error estimates include only complete replications.

. estat bootstrap, all

Bootstrap results                               Number of obs      =        95
                                                Replications       =       499

      command:  biprobittreat (private = years loginc vote) (vote = year loginc logptax), robust
   [_eq4]ate:  r(ate)

------------------------------------------------------------------------------
             |    Observed               Bootstrap
             |       Coef.       Bias    Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
private      |
       years |  -.00825642  -.0080788   .03447135    -.075819   .0593062   (N)
             |                                       -.087569   .0253931   (P)
             |                                       -.082893   .0285328  (BC)
      loginc |   .22278152   .1409217   .74951755   -1.246246   1.691809   (N)
             |                                      -.8413433    1.90044   (P)
             |                                      -.9458166   1.668028  (BC)
        vote |   .50569492  -.5871846   1.3313176    -2.10364   3.115029   (N)
             |                                      -2.308104   1.569678   (P)
             |                                      -2.232462   1.608637  (BC)
       _cons |  -3.6406301  -.9503618   7.2972871   -17.94305   10.66179   (N)
             |                                      -20.36213   6.880115   (P)
             |                                      -20.31908   7.248422  (BC)
-------------+----------------------------------------------------------------
vote         |
       years |  -.01756575  -.0058208   .02570624   -.0679491   .0328176   (N)
             |                                      -.0884118    .013394   (P)
             |                                      -.0596505   .0210808  (BC)
      loginc |   .98600148   .0904138    .5231484   -.0393505   2.011354   (N)
             |                                       .0779691   2.152244   (P)
             |                                      -.1330201   2.017503  (BC)
     logptax |   -1.287585  -.1660917   .61975842   -2.502289  -.0728808   (N)
             |                                      -2.795679   -.418406   (P)
             |                                      -2.349517  -.0533561  (BC)
       _cons |  -.41441506   .3055246   5.6657558   -11.51909   10.69026   (N)
             |                                      -11.65625   10.40091   (P)
             |                                      -12.30915   9.867423  (BC)
-------------+----------------------------------------------------------------
athrho       |
       _cons |  -.64202932  -9.220272   234.59636   -460.4424   459.1584   (N)
             |                                      -16.54541    18.6691   (P)
             |                                      -17.94267    17.2223  (BC)
-------------+----------------------------------------------------------------
_eq4         |
         ate |   .09491939  -.1364229    .3040172   -.5009434   .6907822   (N)
             |                                      -.5932178   .3518947   (P)
             |                                      -.5716441   .3718321  (BC)
------------------------------------------------------------------------------
(N)    normal confidence interval
(P)    percentile confidence interval
(BC)   bias-corrected confidence interval
Note: one or more parameters could not be estimated in 1 bootstrap replicate;
      standard-error estimates include only complete replications.
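
A small check that may help frame the second question about the regression of the ME on a constant. This is a minimal sketch, not part of the original session. It assumes the variable dp from "AME by Hand # 1" is still in memory and that the default carriage-return delimiter is in effect rather than the ; delimiter used in the log. The scalar names se_mean, tcrit, ci_lo and ci_hi are made up for illustration. It rebuilds the standard error and interval reported by reg dp from the summary statistics alone: .0151578/sqrt(95) is about .0015552, the value in the regression table.

```stata
* Sketch only: assumes dp (= p1 - p0) from "AME by Hand # 1" is in memory
* and that commands end at the line break (no #delimit ;).
quietly summarize dp

* Standard error of a sample mean: sd(dp)/sqrt(N)
scalar se_mean = r(sd)/sqrt(r(N))

* Two-sided 95% t critical value with N-1 degrees of freedom
scalar tcrit = invttail(r(N) - 1, 0.025)

* Interval endpoints, to compare with the _cons row of -reg dp-
scalar ci_lo = r(mean) - tcrit*se_mean
scalar ci_hi = r(mean) + tcrit*se_mean

display "se     = " se_mean
display "95% CI = (" ci_lo ", " ci_hi ")"
```

So the interval from reg dp (and reg pdx) describes how the predicted effect varies across the 95 observations, with dp computed at the point estimates in e(b) and then treated as data. Whether that is the appropriate notion of uncertainty for the AME is exactly the question posed above; the sketch only shows what the regression computes.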