----- Original Message -----
From: <[email protected]>
To: <[email protected]>
Sent: Friday, August 29, 2003 5:05 AM
Subject: st: Interpretation of OLS coeff after Heckman selection
> Hi everyone,
>
> My dependent variable, Y, is the log of expenditures and a set of dummies
> (X1, X2, ...) are the explanatory variables of main concern. I also have a
> bunch of controls.
>
> Since sample selection is a problem, I use the Heckman command. (Tobit does
> not work with these data.)
>
> Recently someone pointed out to me the following: One cannot interpret the
> OLS coefficients for X1, X2, ... in the consumption equation the usual way
> (here: as semilogarithmic coefficients that need the adjustment suggested
> by Halvorsen and Palmquist [1980]) WHEN X1, X2, ... also are included as
> explanatory variables in the (probit) selection equation (which they are in
> my case). In this case, the OLS coefficients in the consumption equation
> need to be adjusted according to some kind of formula....
>
> Is this true? If yes, has anyone seen such a formula? Finally, has anyone
> written a command or an ado/do file to perform this adjustment in Stata?
>
> Thanks for any help!
>
> Christer
>
Yes, it is true. When a regressor appears in both equations, its marginal
effect on Y (conditional on selection) combines its effect in the selection
equation with its effect in the outcome equation. (See Greene's Econometric
Analysis.)
I believe the correct procedure is as follows:
If the outcome coefficient is beta and the selection coefficient is alpha, then
dE[y | z* > 0]/dx = beta - (alpha*rho*sigma*delta(alpha))
where delta(alpha) = inverse Mills' ratio * (inverse Mills' ratio + linear
prediction from the selection equation)
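For reference, here is a sketch of where that expression comes from. It assumes the usual setup in which the selection-equation error has unit variance (probit); the symbols w (selection regressors), lambda, phi and Phi are my notation, not part of the heckman output.

```latex
\begin{align*}
E[y \mid z^* > 0] &= x'\beta + \rho\sigma\,\lambda(w'\alpha),
  \qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)}, \\
\frac{d\lambda(c)}{dc} &= \frac{-c\,\phi(c)\,\Phi(c) - \phi(c)^2}{\Phi(c)^2}
  = -\lambda(c)\,\bigl[\lambda(c) + c\bigr], \\
\frac{\partial E[y \mid z^* > 0]}{\partial x_k}
  &= \beta_k - \alpha_k\,\rho\sigma\,\delta,
  \qquad \delta = \lambda(w'\alpha)\,\bigl[\lambda(w'\alpha) + w'\alpha\bigr].
\end{align*}
```

Because delta lies between 0 and 1, the adjustment shrinks the outcome coefficient toward zero when alpha_k and rho have the same sign, as they do for age in the example that follows.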
Example
. use http://www.stata-press.com/data/r8/womenwk.dta
. heckman wage educ age, select(married children educ age) mills(mills)
Iteration 0:   log likelihood = -5178.7009
Iteration 1:   log likelihood = -5178.3049
Iteration 2:   log likelihood = -5178.3045
Heckman selection model                         Number of obs      =      2000
(regression model with sample selection)        Censored obs       =       657
                                                Uncensored obs     =      1343
                                                Wald chi2(2)       =    508.44
Log likelihood = -5178.304                      Prob > chi2        =    0.0000
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
wage         |
   education |   .9899537   .0532565    18.59   0.000     .8855729    1.094334
         age |   .2131294   .0206031    10.34   0.000     .1727481    .2535108
       _cons |   .4857752   1.077037     0.45   0.652    -1.625179     2.59673
-------------+----------------------------------------------------------------
select       |
     married |   .4451721   .0673954     6.61   0.000     .3130794    .5772647
    children |   .4387068   .0277828    15.79   0.000     .3842534    .4931601
   education |   .0557318   .0107349     5.19   0.000     .0346917    .0767718
         age |   .0365098   .0041533     8.79   0.000     .0283694    .0446502
       _cons |  -2.491015   .1893402   -13.16   0.000    -2.862115   -2.119915
-------------+----------------------------------------------------------------
     /athrho |   .8742086   .1014225     8.62   0.000     .6754241    1.072993
    /lnsigma |   1.792559    .027598    64.95   0.000     1.738468     1.84665
-------------+----------------------------------------------------------------
         rho |   .7035061   .0512264                      .5885365    .7905862
       sigma |   6.004797   .1657202                       5.68862    6.338548
      lambda |   4.224412   .3992265                      3.441942    5.006881
------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0):   chi2(1) =    61.20   Prob > chi2 = 0.0000
------------------------------------------------------------------------------
. predict select_xb , xbs
. gen delta = mills*(mills + select_xb)
. gen b_age = [wage]_b[age] - ([select]_b[age]*e(rho)*e(sigma)*delta)
. ci b_age
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
       b_age |       2000    .1391227    .0006604        .1378276    .1404179
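One way to sanity-check the adjusted coefficient is to compare it with a finite-difference slope of the conditional mean that predict returns after heckman. The following is only a sketch: it assumes the heckman results and the b_age variable from the session are still in memory, and ycond0, ycond1, fd_age and the step size h are hypothetical names I made up for the illustration.

```stata
* Sketch only: run right after the commands above, with the heckman
* estimates still active and b_age already generated.
preserve

* Store age as double so that a small step is added without rounding
recast double age

* E[wage | wage observed] at the observed covariates
predict double ycond0, ycond

* Shift age by a small (hypothetical) step h and recompute the conditional mean
local h = 0.001
replace age = age + `h'
predict double ycond1, ycond

* Finite-difference slope; it should be close to the analytic b_age
generate double fd_age = (ycond1 - ycond0)/`h'

* Compare the two over the women whose wage is actually observed
summarize b_age fd_age if !missing(wage)

restore
```

The restore at the end discards the shifted age and the temporary variables, so the data go back to the state they were in before the check. Restricting the summary to observations with a nonmissing wage matches the conditioning in E[y | z* > 0].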
Hope this helps,
Scott