Dear Austin
Many thanks for your response. I will check out the model you suggest to see if it's still appropriate given that data for my sellers are continuous.
Jennifer
"Austin Nichols" <[email protected]>:
How about using -ivpois- from SSC instead? Non-sellers are those who
sell zero units; in a Poisson regression that fits the data well, they
would have very small predicted sales, so that the rounded predicted
value would be zero. -ivpois- implements a GMM version of Poisson
that allows endogenous regressors and excluded instruments. Such a
choice is similar to a choice among a two-part model, -zip-, or
-poisson-; see also http://www.nber.org/papers/t0228 or
Mullahy, John. 1998. "Much Ado About Two: Reconsidering
Retransformation And The Two-Part Model In Health Econometrics,"
Journal of Health Economics, 17(3): 247-281.
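
As a rough sketch of what this suggestion might look like in practice, one could fit a survey-weighted Poisson model to sellers and non-sellers together and then compare it with -ivpois-. All variable names below are hypothetical placeholders, and the -ivpois- option names are an assumption that should be checked against -help ivpois- after installing it. On the concern about continuous sales data: Poisson regression does not need an integer outcome for its coefficients to be consistent, provided the exponential conditional mean is correctly specified.

```stata
* Sketch only: all variable names here are hypothetical placeholders.
* y1 = amount sold (0 for non-sellers), x = exogenous regressor,
* w = possibly endogenous regressor, z = excluded instrument,
* psu = primary sampling unit, pw = sampling weight.
svyset psu [pweight = pw]

* Poisson fit on sellers and non-sellers together. Stata may note that
* the outcome is not a count, but it still fits the model.
svy: poisson y1 x w

* Predicted amount sold, exp(xb); if the model fits well, these should
* be small for non-sellers.
predict double y1hat, n
summarize y1hat if y1 == 0

* GMM Poisson with w treated as endogenous and z as the excluded
* instrument. Option names are assumed; verify with -help ivpois-.
ssc install ivpois
ivpois y1 x, endog(w) exog(z)
```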
On Fri, Mar 28, 2008 at 12:33 PM, Jennifer Leavy <[email protected]> wrote:
> Dear Statalisters
> I am trying to estimate a model of market participation (sellers, non-sellers: given that someone sells, how much are they selling?) addressing the following issues:
>
> i) complex survey design (PSUs and pweights only)
> ii) sample selection bias
> iii) potential reverse causality between regressors and dependent variable
>
> To be able to use instrumental variables, I think I will need to estimate the model in two steps ("by hand") rather than using the -heckman- command. However, because the inverse Mills ratio enters the outcome equation, I also need to adjust the covariance matrix of the outcome equation to get correct standard errors. I've looked through the Stata FAQs and Statalist and trawled the internet, and the closest I can find to what I want to do is set out below, minus the IV part of the estimation for now, for simplicity (I took the syntax from Vince Wiggins' FAQ post "Must I use all of my exogenous variables as instruments when estimating instrumental variables regression?").
>
> However, there is a problem: after -svy: regress-, Stata does not seem to return e(rmse), so the new Vmatrix ends up empty. Is there a way of recovering the estimated RMSE so I can plug it into the formula? Or is there a better way for me to do this? I have been grappling with this for some time, so any help (solutions, or encouragement to let this one go) would be very much appreciated.
Many thanks
Jennifer
The syntax:
/*selection equation*/
svy: probit y2 x w
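predict Z, xb /*linear index from the probit*/
gen mills=normalden(Z)/normal(Z) /*inverse Mills ratio; normden() and norm() are the older names for these functions*/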
/*Outcome equation*/
svy: regress y1 mills x if y2==1
set more off
rename Z y2hold
rename y2 Z
predict double res, residual
rename Z y2
rename y2hold Z
replace res=res^2
summ res
scalar realmse = r(mean)*r(N)/e(df_r)
matrix bmatrix = e(b)
matrix Vmatrix = e(V)
matrix Vmatrix = e(V) * realmse /e(rmse)^2 /*svy: regress does not return e(rmse), so e(rmse) is missing here and the rescaling fails*/
ereturn post bmatrix Vmatrix, noclear /*so Vmatrix ends up empty*/
ereturn display
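
One alternative that avoids rescaling e(V) altogether is to bootstrap both steps together, resampling whole PSUs, so that the standard errors reflect the fact that the Mills ratio is itself estimated. This is a sketch under assumptions, not a tested solution: psu and pw are hypothetical names for the PSU identifier and the sampling weight, zb_bs and mills_bs are scratch variables, and resampling PSUs with the weights held fixed is only one of several ways to bootstrap survey data. Note also that the e(rmse) rescaling in the FAQ was written for two-stage least squares, and it is not clear that it is the appropriate correction for a regression that includes an estimated Mills ratio.

```stata
* Sketch only: psu, pw, zb_bs and mills_bs are hypothetical names.
capture program drop heck2step
program define heck2step, eclass
    * Drop scratch variables left over from a previous replication.
    capture drop zb_bs
    capture drop mills_bs
    * Step 1: weighted probit for selling versus not selling.
    probit y2 x w [pweight = pw]
    predict double zb_bs, xb
    generate double mills_bs = normalden(zb_bs) / normal(zb_bs)
    * Step 2: weighted outcome regression on sellers only.
    regress y1 mills_bs x [pweight = pw] if y2 == 1
end

* Resample PSUs with replacement and rerun both steps each time.
* nodrop keeps non-sellers in the resampled data; without it, bootstrap
* would keep only e(sample) from the last command, i.e. the sellers.
bootstrap _b, reps(500) seed(12345) cluster(psu) nodrop: heck2step
```

After it runs, -estat bootstrap, percentile- reports percentile confidence intervals. The IV step could be added inside the same program, so that all stages are re-estimated in every replication.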