Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Imputation using ML for a lognormal ordered income variable
From: Tinna Asgeirsdottir <[email protected]>
To: [email protected]
Subject: Re: st: Imputation using ML for a lognormal ordered income variable
Date: Mon, 19 Nov 2012 15:34:50 +0000
Thanks for the helpful reply, Stas.

I don't think the recommendation referred to interval regression or
multiple imputation. I think it referred to imputing the likely mean
or median of each category, but without the obviously false assumption
of a uniform distribution within each category that using the midpoint
implies.
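
For reference, the quantities in question have closed forms if one assumes (as this thread does, and as Stas cautions may be too thin-tailed) that income X is lognormal with ln X ~ N(mu, sigma^2). The bracket limits a and b and the shorthand alpha and beta below are notation introduced here, not taken from the thread.

```latex
% Bracket (a, b], with standardised log limits
%   \alpha = (\ln a - \mu)/\sigma, \qquad \beta = (\ln b - \mu)/\sigma .
\[
  \mathrm{E}[X \mid a < X \le b]
    = e^{\mu + \sigma^2/2}\,
      \frac{\Phi(\beta - \sigma) - \Phi(\alpha - \sigma)}
           {\Phi(\beta) - \Phi(\alpha)}
\]
\[
  \mathrm{median}[X \mid a < X \le b]
    = \exp\!\Bigl(\mu + \sigma\,\Phi^{-1}\bigl(\tfrac{1}{2}[\Phi(\alpha) + \Phi(\beta)]\bigr)\Bigr)
\]
% Open-ended top bracket: b = \infty, so \Phi(\beta) = \Phi(\beta - \sigma) = 1.
% Bottom bracket starting at zero: a = 0, so \Phi(\alpha) = \Phi(\alpha - \sigma) = 0.
```

By contrast, a uniform distribution within the bracket gives the midpoint (a + b)/2 for both the mean and the median, and has no answer at all for the open-ended top bracket.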

If I fit a lognormal distribution by ML using the -lognfit- command,
I can get the parameters of the distribution. I guess I should be able
to work out the category means by hand from there, but I figured there
might be an easier way.
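
The "by hand" step is short once the two lognormal parameters are known. The sketch below is in Python rather than Stata, purely to make the arithmetic concrete. The function name, the parameter values and the bracket limits are all hypothetical and would be replaced by the fitted estimates and the survey's own cut-points. It uses the standard result for the mean of a truncated lognormal, and it inherits whatever is wrong with the lognormal assumption in the top tail.

```python
from math import exp, inf, log
from statistics import NormalDist


def _phi(x, mu, sigma, shift):
    """Standard normal CDF at (ln x - mu - shift) / sigma.

    The bracket limits 0 and infinity are handled explicitly.
    """
    if x <= 0:
        return 0.0
    if x == inf:
        return 1.0
    return NormalDist().cdf((log(x) - mu - shift) / sigma)


def lognormal_bracket_mean(mu, sigma, lower, upper=inf):
    """Mean of X given lower < X <= upper, where ln X ~ N(mu, sigma^2)."""
    # Probability mass of the bracket under the fitted distribution.
    mass = _phi(upper, mu, sigma, 0.0) - _phi(lower, mu, sigma, 0.0)
    # Same difference with the argument shifted by sigma^2; multiplied by
    # exp(mu + sigma^2 / 2) this is the partial expectation E[X; lower < X <= upper].
    partial = (_phi(upper, mu, sigma, sigma ** 2)
               - _phi(lower, mu, sigma, sigma ** 2))
    return exp(mu + sigma ** 2 / 2) * partial / mass


# Hypothetical fitted parameters and bracket limits, for illustration only.
mu_hat, sigma_hat = 12.5, 0.6
print("closed bracket:", lognormal_bracket_mean(mu_hat, sigma_hat, 300_000, 400_000))
print("open top bracket:", lognormal_bracket_mean(mu_hat, sigma_hat, 1_000_000))
```

For brackets very far into the tail the bracket mass can underflow to zero, so the division would need guarding in real use.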
Best
Tinna
2012/11/17 Stas Kolenikov <[email protected]>:
> Lognormal distribution will likely underestimate how heavy the top
> tail is (although if you are interested in Iceland, you may have a
> very egalitarian income distribution, so the shape of that tail may
> not be that terrible). Lognormal distribution is a very cute model to
> play with and very dangerous in real work. In my work on Russian data,
> changing the assumptions about the top tail moved our Gini index from
> 0.48 to 0.60... and that's a little bit of a difference, let's put it
> this way.
>
> The recommendation you have heard probably concerns -intreg-, which
> you can read the help on.
>
> Imputing the mean income over a group will lead to a multitude of
> problems due to artificially compressed variability and values that
> are simply too low for the top group. If you desperately need to
> impute, you would want to go with multiple imputations (-help mi-),
> although you would want to read the MI manual and a paper
> (http://www.citeulike.org/user/ctacmo/article/8525275) or two
> (http://www.jstor.org/stable/2291635) if you are not familiar with the
> technique. What I have done in one of my projects recently was to
> generate the plausible values of the variable of interest a bunch of
> times (say, 50... the original suggestion to use 5 imputations dates
> back to the late 1970s... and your smartphone now has more computing power
> than a then-Cray supercomputer) and make Stata believe they were
> imputed in Stata mi wide format.
>
> --
> -- Stas Kolenikov, PhD, PStat (SSC) :: http://stas.kolenikov.name
> -- Senior Survey Statistician, Abt SRBI :: work email kolenikovs at
> srbi dot com
> -- Opinions stated in this email are mine only, and do not reflect the
> position of my employer
>
>
> On Sat, Nov 17, 2012 at 6:12 AM, Tinna Asgeirsdottir
> <[email protected]> wrote:
>> Dear Stata users,
>>
>> In my data I have income in 13 groups. The top group is open-ended. I
>> am trying to impute sensible values and would like to use this as a
>> continuous variable. I am especially concerned about the top category.
>> It has been suggested to me that I should use Stata's ML command
>> instead of using each category's midpoint. I am having trouble finding
>> what I need on the internet. Thus I wonder if anyone can tell me how
>> to fit a lognormal distribution to the variable and subsequently infer
>> the average income in the top bracket. If you know how to do this in
>> general for all the categories, that would be great as well, as the
>> distribution within the other brackets is surely not uniform. However,
>> I think finding a good solution for my top category is the most
>> important thing.
>>
>> Best regards,
>> Tinna
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/