Re: AW: st: imputing continuous values when respondents select categories, e.g., income category
At 03:29 PM 4/24/2009, Alan Acock wrote:
Thanks for recommending intreg. I was not familiar with this
command. When I run the example in the reference manual, the
estimated wage is not always within the limits of the interval the
respondent selected. Using their example and predicting est_wage,
here are the first 10 observations:
     +---------------------------+
     | wage1   wage2    est_wage |
     |---------------------------|
  1. |     .       5    4.266946 |
  2. |     5      10    7.502522 |
  3. |     5      10    10.19239 |
  4. |    10      15    8.924339 |
  5. |     .       5    8.116896 |
     |---------------------------|
  6. |     .       5     10.2202 |
  7. |     .       5    10.35355 |
  8. |     5      10   -2.894233 |
  9. |     5      10    13.32243 |
 10. |     5      10    9.792462 |
     +---------------------------+
Some estimated values are well outside of the interval. Is it
recommended to simply replace these with the interval boundaries?
Nothing says the estimated wages have to lie within the
interval. est_wage is just the linear prediction, E(Y | X); it
conditions on the covariates, not on the interval the respondent
reported. Just as in ordinary regression, the estimates will sometimes
be higher than the observed value and sometimes lower. My guess is that
replacing them with the interval boundaries would be a terrible idea.
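To make the point concrete, here is a minimal Python sketch (not Stata, and not the manual's code). It takes the ten listed observations, counts how many linear predictions fall outside the reported interval, and then, for contrast, computes the mean of a normal distribution truncated to that interval, which by construction cannot leave it. The value of SIGMA and all function names are assumptions made for illustration; the actual residual standard deviation would come from the intreg output.

```python
import math

# (lower, upper, est_wage) for the ten listed observations.
# None stands for a missing (open-ended) bound, shown as "." in the listing.
ROWS = [
    (None, 5, 4.266946), (5, 10, 7.502522), (5, 10, 10.19239),
    (10, 15, 8.924339), (None, 5, 8.116896), (None, 5, 10.2202),
    (None, 5, 10.35355), (5, 10, -2.894233), (5, 10, 13.32243),
    (5, 10, 9.792462),
]

SIGMA = 7.0  # hypothetical residual SD, NOT taken from the manual's output


def norm_pdf(z):
    """Standard normal density; evaluates to 0.0 at +/- infinity."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def inside(lower, upper, value):
    """True if value lies within [lower, upper]; None means unbounded."""
    lo = -math.inf if lower is None else lower
    hi = math.inf if upper is None else upper
    return lo <= value <= hi


def interval_mean(mu, sigma, lower, upper):
    """E(Y | lower < Y < upper) for Y ~ Normal(mu, sigma^2)."""
    a = -math.inf if lower is None else (lower - mu) / sigma
    b = math.inf if upper is None else (upper - mu) / sigma
    mass = norm_cdf(b) - norm_cdf(a)  # probability of landing in the interval
    return mu + sigma * (norm_pdf(a) - norm_pdf(b)) / mass


# Linear predictions that fall outside the interval the respondent chose.
outside = [row for row in ROWS if not inside(*row)]

# Conditional means given the interval, one per observation.
cond = [interval_mean(xb, SIGMA, lo, hi) for lo, hi, xb in ROWS]

print(len(outside), "of", len(ROWS), "linear predictions lie outside the interval")
for (lo, hi, xb), c in zip(ROWS, cond):
    print(lo, hi, xb, round(c, 3))
```

Whether such a conditional mean is a sensible imputation is a separate question; the sketch only shows why E(Y | X) and the reported interval need not agree.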
The intreg approach might be good if income were your dependent
variable, but from the way you describe the problem I'm guessing you
really want it as an independent variable.
Sometimes people just use the midpoint of the interval; other times
they will just break income up into dummy variables. I believe there
are other more advanced methods but I am not that familiar with them.
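The two simple codings mentioned above can be sketched as follows, again in Python for illustration only. The category boundaries are taken from the listing; the function names and the choice of the lowest category as the reference group are assumptions. Note that an open-ended category has no midpoint, so the sketch returns None there rather than inventing a value.

```python
# Income categories as (lower, upper); None marks an open-ended bound.
CATEGORIES = [(None, 5), (5, 10), (10, 15)]


def midpoint(lower, upper):
    """Midpoint of a closed interval, or None if either bound is open."""
    if lower is None or upper is None:
        return None
    return (lower + upper) / 2


def dummies(category, categories=CATEGORIES):
    """Indicator variables for every category except the first (reference)."""
    return [int(category == c) for c in categories[1:]]


for cat in CATEGORIES:
    print(cat, midpoint(*cat), dummies(cat))
```

The midpoint coding treats income as continuous and needs a separate decision for open-ended categories; the dummy coding avoids that decision at the cost of one extra parameter per category.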
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: [email protected]
WWW: http://www.nd.edu/~rwilliam