Re: st: imputing continuous values when respondents select categories, e.g., income category
At 10:23 PM 4/24/2009, Alan Acock wrote:
> Richard Williams asked if I want to impute missing values or to plug
> in values within each interval, as opposed to assigning everybody
> the midpoint of the interval they select.
> The latter is what I want to do and it appears that the intreg
> command with the ystar(a,b) option in the post estimation commands
> is exactly what I should use. This treats income as the dependent
> variable, but once we estimate the value we can use that as an
> independent variable in other models. At least this is my understanding.

Actually, I am not sure whether that is the optimal strategy. At a
minimum, it seems there should be some sort of penalty for using
estimated income rather than real income. You'll also have
multicollinearity problems if all the variables used to compute
estimated income are also in your other models.
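
For concreteness, here is a minimal sketch of the intreg-then-predict approach discussed above, run on simulated data. Every variable name (educ, age, lninc, lnlo, lnhi, lninc_hat), the bracket cutpoints, and the data-generating model are assumptions made up for illustration, not anything from the original question. The sketch uses predict's e(a,b) option, which returns E(y | a < y < b) and so always falls inside the reported bracket; ystar(a,b), the option mentioned in the thread, instead returns the expected value of y censored at a and b. Which one suits a given analysis is a judgment call.

```stata
* Hypothetical example: all names, cutpoints, and coefficients are invented.
clear
set seed 2009
set obs 1000
generate educ  = rnormal(13, 3)
generate age   = 25 + 40*runiform()
* "True" log income; in real survey data this would be unobserved.
generate lninc = 9 + 0.08*educ + 0.01*age + rnormal(0, 0.5)

* Log bracket bounds as a respondent would report them.
* Missing lower bound = open below; missing upper bound = open above.
generate lnlo = .
generate lnhi = .
replace lnhi = ln(20000)  if lninc <  ln(20000)
replace lnlo = ln(20000)  if lninc >= ln(20000) & lninc < ln(50000)
replace lnhi = ln(50000)  if lninc >= ln(20000) & lninc < ln(50000)
replace lnlo = ln(50000)  if lninc >= ln(50000) & lninc < ln(100000)
replace lnhi = ln(100000) if lninc >= ln(50000) & lninc < ln(100000)
replace lnlo = ln(100000) if lninc >= ln(100000)

* Interval regression using only the bracket bounds and the covariates.
intreg lnlo lnhi educ age

* Expected log income conditional on lying within each person's own bracket.
predict lninc_hat, e(lnlo, lnhi)

* With simulated data we can compare against the "true" values.
summarize lninc lninc_hat
correlate lninc lninc_hat
```

Note that lninc_hat is built from educ and age, so using it alongside those same variables in a later model runs straight into the overlap concern raised above, and the later model's standard errors do not reflect that lninc_hat is itself an estimate.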
Maarten Buis did touch on these issues at the summer 2008 NASUG (but
I don't remember what he concluded!). See the first paper listed at
http://www.stata.com/meeting/snasug08/abstracts.html
Also, Powers & Xie discuss this sort of thing in section 6.2 of their
book ("Statistical Methods for Categorical Data Analysis"). They
propose a "normal score transformation" which in turn comes from
Clogg & Shihadeh 1994 ("Statistical Models for Ordinal
Variables"). There is no discussion of how much better it actually
works though.
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: [email protected]
WWW: http://www.nd.edu/~rwilliam