Title: Bivariate probit with partial observability and a single dependent variable
Author: Vince Wiggins and Brian Poi, StataCorp
I’m trying to estimate a bivariate probit with partial observability following Abowd and Farber (1982), Maddala (1983), and Poirier (1980). The problem is that we have only one dependent variable (the product of the two latent dependent variables), and the biprobit command in Stata requires two different dependent variables!
The bivariate probit (biprobit) model has two binary dependent variables that we assume are correlated. Partial observability occurs when we observe a positive outcome only if both dependent variables are positive. For example, assume y1 and y2 are our two dependent variables, and we have the following cross-tabulation of the outcomes:
    . tabulate y1 y2

               |          y2
            y1 |         0          1 |     Total
    -----------+----------------------+----------
             0 |        26         26 |        52
             1 |         8         14 |        22
    -----------+----------------------+----------
         Total |        34         40 |        74
With partial observability, we know only that 14 outcomes are positive for both y1 and y2; for the remaining 60, we cannot tell which of the two variables was 0. We could think of this as a single dependent variable, say y, that is the product of y1 and y2.
The user who raises this question says he does not have two dependent variables; his single dependent variable already reflects the partially observed data. He has a single dependent variable y with 14 positive outcomes and 60 zero outcomes.
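For comparison, if the complete data were available, the product variable could be built directly. This is only a sketch: it assumes y1 and y2 are in memory and that no variable named y exists yet. By the cross-tabulation above, it should show 14 ones and 60 zeros.

```stata
. * assumes complete data; y is 1 only when both y1 and y2 are 1
. generate byte y = y1*y2
. tabulate y
```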
The syntax for biprobit is designed so that we can fit a partial observability model whether we have complete data, such as y1 and y2 above, or the product of the two, such as y above. The partial observability model uses only the information from the product of the two dependent variables. So, if we already have that product, we can use any pair of dependent variables that, when multiplied together, produce the same set of positive outcomes observed in the product dependent variable, y.
Many pairs of variables will do this, and any pair whose product reproduces the pattern in y implies the same partial observability model. biprobit will not, however, let us specify a dependent variable that is always 1, so we cannot simply pair y with a constant. The easiest way to produce two binary variables whose product has the same pattern of 0s and 1s as y is to duplicate y.
Taking the easy way and assuming the single product dependent variable is y, we can type
    . generate y_copy = y
    . biprobit (y x1 x2 x3) (y_copy x1 x2 x4), partial
to estimate a bivariate probit model with partial observability. Here x1, x2, and x3 are the covariates for the first dependent variable, y1, and x1, x2, and x4 are the covariates for the second dependent variable, y2.
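After fitting, one way to inspect the result is to predict the joint probability that both latent outcomes are positive, which under partial observability is the probability that y equals 1. The lines below are a sketch that assumes the biprobit command above was the last estimation command; phat is a hypothetical variable name.

```stata
. * phat is a hypothetical name; p11 requests Pr(y1 = 1, y2 = 1)
. predict phat, p11
. summarize phat
```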
We use the syntax for a seemingly unrelated bivariate probit model, so we can specify different regressors for the equations for y1 and y2. With the partially observable variant of the model, we observe only the product of y1 and y2. The partially observable model is particularly difficult to estimate when the same set of regressors is used for both equations, and the parameters may not even be identified. Poirier (1980) discusses identification for this model in detail.
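To see why identification is delicate, it may help to write the model out. The notation here is an assumption made for illustration and does not come from the FAQ: z_1 and z_2 are the covariate vectors of the two equations, beta_1 and beta_2 are their coefficients, rho is the correlation of the standard bivariate normal errors, and Phi_2 is the bivariate normal distribution function.

```latex
\begin{align*}
y_1^* &= z_1'\beta_1 + \varepsilon_1, \qquad
y_2^* = z_2'\beta_2 + \varepsilon_2, \\
y &= \mathbf{1}(y_1^* > 0)\,\mathbf{1}(y_2^* > 0), \\
\Pr(y = 1) &= \Phi_2(z_1'\beta_1,\; z_2'\beta_2;\; \rho), \\
\ln L &= \sum_i \Bigl[\, y_i \ln \Phi_2(z_{1i}'\beta_1,\, z_{2i}'\beta_2;\, \rho)
       + (1 - y_i) \ln\bigl\{1 - \Phi_2(z_{1i}'\beta_1,\, z_{2i}'\beta_2;\, \rho)\bigr\} \Bigr].
\end{align*}
```

Because Phi_2(a, b; rho) = Phi_2(b, a; rho), swapping beta_1 and beta_2 leaves the likelihood unchanged when z_1 and z_2 contain the same regressors, so the data cannot tell the two equations apart. Giving each equation at least one regressor of its own, as with x3 and x4 above, breaks that symmetry.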