Statalist The Stata Listserver


st: selection bias models for incomplete datasets


From   "David Quinn" <[email protected]>
To   <[email protected]>
Subject   st: selection bias models for incomplete datasets
Date   Wed, 15 Feb 2006 14:52:22 -0500

Hello... 

My name is David Quinn, and I am a grad student in the Department of
Government and Politics at the University of Maryland, College Park.  I
am writing a dissertation based on Minorities at Risk (MAR) data.  I
recently came across a 2003 article in Political Analysis (Vol. 11, Iss.
3) by Simon Hug, in which the author argues that the MAR dataset
inherently suffers from selection bias because it is incomplete:
researcher-imposed criteria determine which cases are included.  In the
case of MAR, a communal group must be mobilized and discriminated
against to be included in the dataset, which yields a total of 318
groups; Fearon and Laitin have identified over 800 communal groups, so
MAR is clearly selecting cases.  Hence, the sample of cases in the
dataset is non-random by nature.  Hug says that selection bias models
should be run for MAR and other such datasets to account for any bias
that may have been introduced by selecting only certain cases for
inclusion.  However, because there is information only on the cases IN
the dataset, the properties of the unobserved population have to be
estimated through the selection equation itself, so you cannot run a
Heckman or tobit model to account for this type of selection bias.
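
To make the concern concrete, here is a minimal Python sketch of the
general problem.  It is not Hug's algorithm and it does not use MAR
data: the population of 800 groups, the true slope of 1.0, the error
correlation rho, and the inclusion cutoff are all assumptions chosen
for illustration.  Groups enter the "dataset" only when a latent score
(x plus an unobserved term u) clears the cutoff, and the outcome error
is correlated with u, so OLS on the included groups alone is biased.

```python
import math
import random
import statistics


def ols_slope(xs, ys):
    """Slope from a bivariate OLS regression of ys on xs."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def one_replication(rng, n_groups=800, true_slope=1.0, rho=0.8, cutoff=0.37):
    """Simulate one population and fit OLS on all groups and on included groups.

    All parameter values are illustrative assumptions, not estimates from MAR.
    """
    full_x, full_y, sel_x, sel_y = [], [], [], []
    for _ in range(n_groups):
        x = rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)  # unobserved driver of inclusion
        # Outcome error correlated with u (correlation = rho).
        e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        y = true_slope * x + e
        full_x.append(x)
        full_y.append(y)
        if x + u > cutoff:  # researcher-style inclusion rule
            sel_x.append(x)
            sel_y.append(y)
    return ols_slope(full_x, full_y), ols_slope(sel_x, sel_y), len(sel_x)


def monte_carlo(reps=200, seed=2006):
    """Average the two slopes and the included-sample size over many draws."""
    rng = random.Random(seed)
    results = [one_replication(rng) for _ in range(reps)]
    mean_full = statistics.fmean(r[0] for r in results)
    mean_selected = statistics.fmean(r[1] for r in results)
    mean_n = statistics.fmean(r[2] for r in results)
    return mean_full, mean_selected, mean_n


if __name__ == "__main__":
    full, selected, n_included = monte_carlo()
    print(f"mean slope, all groups:       {full:.3f}")
    print(f"mean slope, included groups:  {selected:.3f}")
    print(f"mean number of groups kept:   {n_included:.0f}")
```

Under these assumptions the slope estimated on all groups should sit
close to the true value, while the slope estimated on the included
groups alone should come out noticeably lower.  In the real setting only
the included groups are observed, which is exactly why the full-sample
comparison is unavailable.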

Anyhow...in his article, Hug describes an algorithm that he uses to
estimate a selection and outcome equation to account for the above-noted
selection bias.  He runs the models in LIMDEP and Gauss, neither of
which I am familiar with.  It looks like he uses truncated regression
and OLS, as well as Monte Carlo simulations.  But it also looks like he
programs in a bunch of additional parameters, which I'm not sure how to
do.  FYI: The models in my paper will be binary and multinomial logit
models.
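
For readers unfamiliar with truncated-sample estimation, the following
Python sketch shows the simplest case I can think of: estimating the
mean of a normal variable when only values above a cutoff are observed.
It is an intercept-only illustration with a known cutoff and a known
standard deviation, both of which are simplifying assumptions; it is
not a reproduction of the LIMDEP/Gauss routines in the article.  The
maximum likelihood estimate solves ybar = mu + sigma * lambda((c - mu) /
sigma), where lambda is the inverse Mills ratio, and the left-hand side
is increasing in mu, so bisection is enough.

```python
import math
import random
import statistics


def inverse_mills(a):
    """phi(a) / (1 - Phi(a)) for the standard normal distribution."""
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * math.erfc(a / math.sqrt(2.0))
    return pdf / upper_tail


def truncated_mean_mle(ys, cutoff, sigma=1.0, tol=1e-10):
    """MLE of mu when only y > cutoff is observed and sigma is known.

    Solves ybar = mu + sigma * inverse_mills((cutoff - mu) / sigma)
    by bisection. The lower bracket (ybar - 10 * sigma) is an assumption
    that is wide enough for the illustration below.
    """
    ybar = statistics.fmean(ys)
    lo, hi = ybar - 10.0 * sigma, ybar
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        implied_mean = mid + sigma * inverse_mills((cutoff - mid) / sigma)
        if implied_mean < ybar:
            lo = mid  # candidate mu is too small
        else:
            hi = mid  # candidate mu is too large
    return 0.5 * (lo + hi)


def demo(n=20000, mu=0.0, cutoff=0.5, seed=2006):
    """Draw n values, keep those above the cutoff, compare two estimates."""
    rng = random.Random(seed)
    kept = [y for y in (rng.gauss(mu, 1.0) for _ in range(n)) if y > cutoff]
    return statistics.fmean(kept), truncated_mean_mle(kept, cutoff)


if __name__ == "__main__":
    naive, mle = demo()
    print(f"naive mean of truncated sample: {naive:.3f}")
    print(f"truncated-normal MLE of mu:     {mle:.3f}")
```

The naive mean of the kept values overstates mu, while the truncated
estimate should land near the true value of 0.  The situation I am
asking about is harder than this: the cutoff is not known and inclusion
depends on variables that are not observed for excluded groups.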

Thus, my questions to you are: 1. Do you know of any Stata commands
that easily estimate such selection models?  2. Do you know of any good
sources describing the Stata commands that would be involved in such
selection models?

Any assistance or insight that you can give would be greatly
appreciated.  And thanks so much for taking the time to read through and
think about the contents of this message to begin with.

Sincerely,

David Quinn
Department of Government and Politics
University of Maryland, College Park
