I am using Stata 8.
I would like to know if anyone has advice on how to approach the following
model. I'm looking at school board elections, where I have data on all school
board elections in one state. I would like to test whether certain variables
improve a challenger's share of the vote relative to an incumbent. There are
two stages of selection involved. First, the incumbent decides whether or not
to run for re-election. Second, if the incumbent runs, she may or may not face
a challenger. For races in which the incumbent runs *and* faces a challenger,
we observe the challenger's vote share and the incumbent's share. For what it's
worth, the incumbent's vote share is 100% whenever she runs and there is no
challenger; the challenger's vote share is 100% whenever there is no incumbent.
But the interesting case is when there is both an incumbent and a challenger.
I'd like to model the challenger's vote share as a function of other attributes
of the district. My question is how to estimate the selection model in Stata.
An additional wrinkle is that the vote share data are at the precinct level,
whereas the decision of the candidates to run is made at the district level. In
other words, there are J districts with P_j precincts in district j (P_j is not
the same for all districts). When a candidate runs, she runs in every precinct
in the district. So the two stages of selection occur at the district level (i.e., are
identical for every precinct within a district), but the vote share varies by
precinct.
So I could first estimate a probit model for incumbent running or not. Then
generate the inverse Mills ratio (IMR) and include that in a second probit model
estimating whether or not the race is contested (restricting estimation to cases
where there is an incumbent running). Both of these equations would be
estimated at the district level. I take the IMR from the second equation and
include it in the final vote share equation, which would be estimated at the
precinct level (the IMR, which would have been estimated at the district level
in step 2, would then be identical across all precincts in a district in the
vote share model). Estimation of the final equation would be restricted to
cases where both an incumbent and a challenger are running. My main question is
how to calculate appropriate standard errors in each of these equations. If
anyone has ideas about how to do this better, I'd be much obliged.
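To make the procedure concrete, here is a rough sketch of the three steps in
Stata 8 syntax. It is only an illustration under stated assumptions: a
precinct-level dataset is in memory, district-level variables are constant
within district, and every variable name (district, inc_runs, contested,
chal_share, z1-z3, x1, x2) is a hypothetical placeholder. The cluster() option
in the last step is just one guess at handling within-district correlation; it
does not correct for the IMRs being estimated regressors.

```stata
* Assumes precinct-level data in memory; all variable names are hypothetical.
* Flag one observation per district for the district-level equations.
bysort district: gen byte dtag = (_n == 1)

* Step 1: does the incumbent run? (district level)
probit inc_runs z1 z2 if dtag
predict double xb1, xb
* IMR = phi(xb)/Phi(xb); Stata 8 names (normalden()/normal() in Stata 9+)
gen double imr1 = normden(xb1)/norm(xb1)

* Step 2: is the race contested, given that the incumbent runs?
probit contested z1 z3 imr1 if dtag & inc_runs == 1
predict double xb2, xb
gen double imr2 = normden(xb2)/norm(xb2)

* Step 3: challenger vote share at the precinct level, contested races only.
* imr2 is identical across precincts within a district.
regress chal_share x1 x2 imr2 if inc_runs == 1 & contested == 1, cluster(district)
```

Because predict fills in xb1 and xb2 for every observation, the district-level
IMRs carry over to all precincts without a merge.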
Thanks,
Chris