--- On Wed, 17/2/10, Maria Quattri wrote:
> 1) Both the coefficients for the Probit and those for the
> OLS seem to have no direct interpretation. Therefore,
> I would consider the significance of marginal effects only:
> Pr(y observed) for the Probit and E(y|y observed) for
> the OLS. Is that right?
No. In particular, E(y | y observed) is usually not the most
interesting outcome. More often you would want to look at
either E(y) (the dependent variable as one would observe
it, thus including the censored observations) or E(y*)
(the latent dependent variable). When trying to understand
your results, you want to look at all of them.
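
To make the difference between these quantities concrete, here is a
rough sketch in Python for a single observation. It assumes the
standard selection model with bivariate normal errors, and it counts
an unobserved y as 0 when computing E(y). The function name and the
input values (xb, zg, rho, sigma) are made up for illustration; they
are not output from any model. I believe these quantities correspond
to the xb, ycond, yexpected, and psel options of -predict- after
-heckman-, but check the help file to be sure.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def heckman_expectations(xb, zg, rho, sigma):
    """Expected outcomes for one observation (hypothetical helper).

    xb    : linear prediction of the outcome equation
    zg    : linear prediction of the selection equation
    rho   : correlation between the two error terms
    sigma : standard deviation of the outcome error
    """
    p_sel = norm_cdf(zg)                 # Pr(y observed)
    mills = norm_pdf(zg) / p_sel         # inverse Mills ratio
    e_ystar = xb                         # E(y*), the latent outcome
    e_ycond = xb + rho * sigma * mills   # E(y | y observed)
    e_y = p_sel * e_ycond                # E(y), unobserved y counted as 0
    return {"p_sel": p_sel, "e_ystar": e_ystar,
            "e_ycond": e_ycond, "e_y": e_y}


# Illustrative values only
result = heckman_expectations(xb=2.0, zg=0.0, rho=0.5, sigma=1.0)
for name, value in result.items():
    print(name, round(value, 4))
```

With rho = 0 the conditional and latent expectations coincide, which
is one quick way to see what the selection correction is doing.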
> 2) Is there any way to test the bivariate normality of the
> error terms for the maximum likelihood estimation in Stata?
No, where would that information come from? Think about the
selection equation: the empirical information about the
distribution of the "error term" is only there in the form
of the shape of the relationship between the probability
and the explanatory variables. That is just not enough to
build a reliable test.
> 3) While Stata twostep option automatically corrects
> standard errors after the inverse Mills ratio enters the
> regression as estimated parameter (i.e. bootstrapping is not
> necessary), the twostep does not allow robust estimation.
> This seems to suggest that running Heckman manually
> (Probit+OLS with robust s.e. and bootstrap, say 1000
> replications) could be better option for inference. Is it
> so?
No, the bootstrap won't use the information from the robust
standard errors, and the point estimates will be exactly the
same as in the models without robust standard errors. So this
procedure will not get you what you want. Moreover, robust
standard errors are not so great that they are worth any
special effort (with the danger of introducing bugs).
Some people think that robust standard errors are the
greatest thing since sliced bread, others think they are
evil (and most hold a position somewhere in between). See
for example:
Freedman, D.A. (2006) On the So-Called "Huber Sandwich
Estimator" and "Robust Standard Errors", The American
Statistician, 60(4): 299--303.
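
The point about the bootstrap can be illustrated with a small sketch
in Python. It uses a simple linear regression on simulated
heteroskedastic data rather than a selection model, so it is only an
analogy. The data, the seed, and the function names are all made up
for illustration, and the robust standard error is the plain HC0
sandwich formula. The slope is one number regardless of which
standard error you ask for, and the bootstrap only ever keeps that
slope from each resample.

```python
import random


def ols_simple(x, y):
    """Slope of y on x, with classical and robust (HC0) standard errors."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Classical: one common error variance for all observations
    se_classical = (sum(e * e for e in resid) / (n - 2) / sxx) ** 0.5
    # Robust (HC0): each squared residual weighted by its own leverage
    se_robust = sum(((xi - mx) * e) ** 2
                    for xi, e in zip(x, resid)) ** 0.5 / sxx
    return b, se_classical, se_robust


def bootstrap_se(x, y, reps, rng):
    """Bootstrap s.e. of the slope: resample cases, keep the slope only."""
    n = len(x)
    slopes = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        slopes.append(ols_simple(xs, ys)[0])  # the s.e.'s are discarded
    mean = sum(slopes) / reps
    return (sum((s - mean) ** 2 for s in slopes) / (reps - 1)) ** 0.5


# Simulated data: the error spread grows with x (illustration only)
rng = random.Random(12345)
x = [rng.uniform(0, 10) for _ in range(200)]
y = [1 + 2 * xi + 0.5 * xi * rng.gauss(0, 1) for xi in x]

b, se_classical, se_robust = ols_simple(x, y)
se_boot = bootstrap_se(x, y, reps=200, rng=rng)
print("slope:", b)
print("classical s.e.:", se_classical)
print("robust s.e.:", se_robust)
print("bootstrap s.e.:", se_boot)
```

Asking for robust standard errors inside each bootstrap replication
would change nothing here, because those standard errors are thrown
away before the bootstrap standard error is computed.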
> 4) The robust MLE is less general than the two-step, yet it
> seems to be preferred apart from when the estimated rho
> approaches 1. Which value for rho is "big enough" to suggest
> the use of the twostep procedure?
Again, don't get too carried away with those robust standard
errors. If you can use them without doing anything you don't
want to do, then they probably don't do too much harm.
Otherwise, just forget about them.
Regarding rho: it is a correlation, so when rho is close to
1 (or -1) there is very little information you can use to
distinguish between the two error terms (the one in the
selection equation and the one in the outcome equation). So
use what you know about correlations to make that judgement.
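
One way to use what you know about correlations is to look at
1 - rho^2. Under bivariate normality this is the share of one error
term's variance that cannot be linearly predicted from the other. A
tiny sketch in Python follows; the function name is made up, and this
is a rule of thumb for building intuition, not a formal cut-off.

```python
def unshared_variance_share(rho):
    """Share of one error's variance not explained by the other: 1 - rho^2."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie between -1 and 1")
    return 1.0 - rho * rho


for rho in (0.5, 0.9, 0.99):
    print(rho, round(unshared_variance_share(rho), 4))
```

At rho = 0.5 about 75% of the variance is still unshared, at 0.9 it
is about 19%, and at 0.99 it is only about 2%.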
Hope this helps,
Maarten
--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany
http://www.maartenbuis.nl
--------------------------