--- georg wernicke wrote:
> Verbeek (2000) argues that the selection equation should at least
> contain all the variables the structural equation contains. However,
> Linder and de Groot (2006) argue that the variables of the two parts
> can be different.
This answer would be a lot more informative if you included the
complete references.
--- Seema Bhatia wrote:
> Also, how does one verify that this 'identifying' variable that separates
> the two equations is valid in the sense that it determines whether that case
> is selected or not but does not determine the LHS in the second step?
--- georg wernicke wrote:
> the unique variable the selection process should contain is probably a
> dummy which is used as the selection identifier. let's say you have data
> for workers, some work, some are unemployed. then create a dummy for
> whether the worker has work or not and use this in the selection equation
> as the identifier.
"Identifying variables" means something different here: these are
variables that influence the probability of being selected but not the
outcome in the equation of interest; this assumption makes sure that the
model is identified. An identifying variable is not a variable that
indicates which observation is selected and which is not. The latter
variable is unnecessary when using -heckman- (the observations with a
missing value on the dependent variable are not selected, all others are).
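
As a minimal sketch of the distinction made above, the following uses
Stata's example dataset womenwk, which is an assumption for illustration
and not the data discussed in this thread. Here married and children play
the role of identifying variables: they appear only in select(), and
whether they really do not affect wages is an assumption, not something
the output can confirm.

```stata
* Sketch only: womenwk is Stata's shipped example dataset, used here
* purely for illustration.
webuse womenwk, clear

* wage is missing for women who do not work; -heckman- treats those
* observations as not selected, so no separate "has work" dummy is needed.
count if missing(wage)

* married and children are entered only in the selection equation.
* They are assumed (not tested) to influence the probability of working
* but not the wage itself; educ and age appear in both equations.
heckman wage educ age, select(married children educ age)
```

Note that nothing in the -heckman- output tells you whether married and
children are valid identifying variables; that has to come from theory.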
To answer Seema's original question: These types of models try to control
for things you have not observed. As a result you do not have all the
necessary information available in your dataset. The information you are
missing comes from assumptions/theory, in this case the assumption that
the identifying variable only influences the probability of being
selected. If you could empirically verify that your identifying variable
was good, you would not need -heckman-. This leads to a catch-22
situation: either you have to use -heckman-, but then you can't verify the
identifying variable; or you can verify the identifying variable, but then
you should not use -heckman-. So if you have to use -heckman-, an
important part of the information contained in the parameter estimates
does not come from your data, but from
your theory. As a consequence I see -heckman- as primarily a theoretical
exercise with a limited amount of empirical content, instead of an
empirical estimate.
Hope this helps,
Maarten
-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434
+31 20 5986715
http://home.fsw.vu.nl/m.buis/
-----------------------------------------