From | Tinna Asgeirsdottir <statalist.tla@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Imputation using ML for a lognormal ordered income variable |
Date | Mon, 19 Nov 2012 15:34:50 +0000 |
Thanks for the helpful reply, Stas.

I don't think the recommendation referred to interval regression or multiple imputation. I think it referred to imputing the probable average or median of each category, but without the obviously false assumption of a uniform distribution within each category that the midpoint would suggest. If I do an ML fit of a lognormal distribution using the lognfit command, I can get the parameters of the distribution. I guess I should be able to work this out by hand from there, but figured that there might be an easier way.

Best
Tinna

2012/11/17 Stas Kolenikov <skolenik@gmail.com>:
> Lognormal distribution will likely underestimate how heavy the top
> tail is (although if you are interested in Iceland, you may have a
> very egalitarian income distribution, so the shape of that tail may
> not be that terrible). Lognormal distribution is a very cute model to
> play with and very dangerous in real work. In my work on Russian data,
> changing the assumptions about the top tail moved our Gini index from
> 0.48 to 0.60... and that's a little bit of a difference, let's put it
> this way.
>
> The recommendation you have heard probably concerns -intreg-, which
> you can read the help on.
>
> Imputing the mean income over a group will lead to a multitude of
> problems due to artificially compressed variability and values that
> are simply too low for the top group. If you desperately need to
> impute, you would want to go with multiple imputations (-help mi-),
> although you would want to read the MI manual and a paper
> (http://www.citeulike.org/user/ctacmo/article/8525275) or two
> (http://www.jstor.org/stable/2291635) if you are not familiar with the
> technique. What I have done in one of my projects recently was to
> generate the plausible values of the variable of interest a bunch of
> times (say, 50... the original suggestion to use 5 imputations dates
> back to late 1970s... and your smartphone now has more computing power
> than a then-Cray supercomputer) and make Stata believe they were
> imputed in Stata mi wide format.
>
> --
> -- Stas Kolenikov, PhD, PStat (SSC) :: http://stas.kolenikov.name
> -- Senior Survey Statistician, Abt SRBI :: work email kolenikovs at
> srbi dot com
> -- Opinions stated in this email are mine only, and do not reflect the
> position of my employer
>
>
> On Sat, Nov 17, 2012 at 6:12 AM, Tinna Asgeirsdottir
> <statalist.tla@gmail.com> wrote:
>> Dear Stata users,
>>
>> In my data I have income in 13 groups. The top group is open ended. I
>> am trying to impute sensible values and would like to use this as a
>> continuous variable. I am especially concerned about the top category.
>> It has been suggested to me that I should use Stata's ML command
>> instead of using each category's midpoint. I am having trouble finding
>> what I need on the internet. Thus I wonder if anyone can tell me how
>> to fit a lognormal distribution to the variable and subsequently infer
>> the average income in the top bracket. If you know how to do this in
>> general for all the categories, that is great as well, as the
>> distribution over the other brackets is surely not uniform. However,
>> I think finding a good solution for my top category is the most
>> important thing.
>>
>> Best regards,
>> Tinna
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
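
The "work this out by hand" step mentioned in the reply above (going from fitted lognormal parameters to the average or median income inside a bracket) can be sketched roughly as follows. This is an illustration in Python using only the standard library, not Stata code and not anything posted in the thread. The function names bracket_mean and bracket_median, and the parameter values in the example, are hypothetical. It assumes mu and sigma are the mean and standard deviation of log income; check how your fitting command labels and scales its parameters before plugging them in. The mean uses the standard truncated-lognormal result E[X | a < X < b] = exp(mu + sigma^2/2) * [Phi(z_b - sigma) - Phi(z_a - sigma)] / [Phi(z_b) - Phi(z_a)], with z_x = (ln x - mu) / sigma.

```python
from math import exp, inf, log
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def _z(x, mu, sigma, shift=0.0):
    """Standardised log of a bracket bound; 0 maps to -inf, inf stays inf."""
    if x <= 0:
        return -inf
    if x == inf:
        return inf
    return (log(x) - mu) / sigma - shift


def _cdf(z):
    """Standard normal CDF with explicit handling of infinite arguments."""
    if z == -inf:
        return 0.0
    if z == inf:
        return 1.0
    return _STD_NORMAL.cdf(z)


def bracket_mean(mu, sigma, lower, upper=inf):
    """Mean of a lognormal(mu, sigma) variable given lower < X < upper."""
    mass = _cdf(_z(upper, mu, sigma)) - _cdf(_z(lower, mu, sigma))
    if mass <= 0:
        raise ValueError("bracket has no probability mass under this fit")
    shifted = (_cdf(_z(upper, mu, sigma, sigma))
               - _cdf(_z(lower, mu, sigma, sigma)))
    return exp(mu + 0.5 * sigma ** 2) * shifted / mass


def bracket_median(mu, sigma, lower, upper=inf):
    """Median of a lognormal(mu, sigma) variable given lower < X < upper."""
    p_lo = _cdf(_z(lower, mu, sigma))
    p_hi = _cdf(_z(upper, mu, sigma))
    if p_hi <= p_lo:
        raise ValueError("bracket has no probability mass under this fit")
    # The conditional median sits halfway between the two CDF values.
    return exp(mu + sigma * _STD_NORMAL.inv_cdf(0.5 * (p_lo + p_hi)))


if __name__ == "__main__":
    # Hypothetical fitted parameters and top-bracket lower bound.
    mu_hat, sigma_hat, top_lower = 12.5, 0.6, 1_000_000
    print("top bracket mean:  ", bracket_mean(mu_hat, sigma_hat, top_lower))
    print("top bracket median:", bracket_median(mu_hat, sigma_hat, top_lower))
```

Leaving upper at its default treats the bracket as open ended, which is the top-category case. For bounds very far into the tail the CDF differences lose precision in floating point, and, as Stas notes, the result is only as good as the lognormal assumption about the top tail.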