st: iv in heckman model
Dear Statalist users,
I want to study a continuous occupational status score (measured on the
CAMSIS scale). There is a potential selection problem, because we do not
observe occupational scores for people who don’t work. Analysing the
occupational status score without taking selection into account would
probably lead to upwardly biased estimates.
We are therefore trying to fit a Heckman two-step model, with CAMSIS
score as the outcome of the main equation and ‘working’ as the binary
outcome of the selection equation. Our instrumental variable is the
number of children in the household.
As it turns out, example 1 in the Stata manual under the ‘heckman’ entry
is very similar to our case. In that example, the number of children is
also used as the iv, but the main outcome of interest is wage whereas
ours is occupational status score.
However, if my understanding of the heckman method is right, one
condition is that the instrumental variable must have a significant
impact on the selection outcome, but NOT on the main outcome. This
condition is neither addressed nor tested in the example in the Stata
manual. I therefore downloaded the example dataset ‘womenwk’, ran
the Heckman model used in the manual, and got identical results (p. 556,
Reference Manual A-H, Release 10).
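For anyone who wants to reproduce this, here is a minimal sketch of the commands involved, not a transcript of my session. It assumes the example data can be loaded with `webuse womenwk` and that the variables are named as in the regression output further down. The indicator `working` is a name made up here for illustration and is not part of the dataset.

```stata
* Load the example dataset used in the manual's heckman entry
* (assumption: it is available through webuse)
webuse womenwk, clear

* Selection model as I read it from the manual: wage depends on
* education and age; selection depends on married, children,
* education and age
heckman wage education age, select(married children education age)

* Same specification with the two-step estimator instead of the
* default maximum likelihood
heckman wage education age, select(married children education age) twostep

* Check that children predicts selection on its own: wage is missing
* for women who do not work, so build a 0/1 indicator (hypothetical
* name) and fit the selection probit separately
generate byte working = !missing(wage)
probit working married children education age
```

The probit only speaks to the first half of the condition (children affecting selection); the second half is what the regression below is meant to probe.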
However, a simple regression of wage on the exogenous variables and the
iv ‘children’, to see if the iv affects wage, shows that in fact it does
(see below). So, while we would agree that the number of children is a
good theoretical choice, it does not seem to meet the empirical
requirements of the Heckman approach. Or am I wrong? Or am I testing
this in the wrong way?
The obvious reason for asking is that I have found the same problem in
our own analysis: the number of children is significant in the selection
model, but it is also significantly related to the occupational score in a
standard regression model.
Any help would be appreciated.
reg wage education age children married

      Source |       SS       df       MS              Number of obs =    1343
-------------+------------------------------           F(  4,  1338) =  128.55
       Model |  14812.5356     4   3703.1339           Prob > F      =  0.0000
    Residual |  38542.3591  1338  28.8059485           R-squared     =  0.2776
-------------+------------------------------           Adj R-squared =  0.2755
       Total |  53354.8946  1342  39.7577456           Root MSE      =  5.3671

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   education |   .8750694    .050243    17.42   0.000     .7765057     .973633
         age |   .1514818   .0192717     7.86   0.000     .1136757    .1892879
    children |  -.6862982   .1032256    -6.65   0.000    -.8887997   -.4837966
     married |  -.5395024   .3574519    -1.51   0.131     -1.24073    .1617247
       _cons |   7.934369   .9264515     8.56   0.000     6.116914    9.751825
------------------------------------------------------------------------------
--
****************************
Peteke Feijten
Research and user support for the Scottish Longitudinal Study (www.lscs.ac.uk)
University of St Andrews
School of Geography & Geosciences
North Street, St Andrews, KY16 9AL
Phone: 01334 463951
Email: [email protected]
The University of St Andrews is a charity registered in Scotland:
No SC013532