Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Nick Cox <njcoxstata@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Imputing for missing proportions
Date: Fri, 12 Apr 2013 11:35:35 +0100

I haven't looked at whether it mixes with -mi-, but -glm- with -link(logit)- is a standard way to handle continuous proportions.

Nick
njcoxstata@gmail.com

On 12 April 2013 11:08, Geomina Turlea <geomina@yahoo.fr> wrote:
> Maarten,
> Thank you very much for your answer.
> The problem with -mi impute- is that it does not really have an option for regressing proportions. I can't really use truncated regression, and my dependent variable is not binary or categorical, but a continuous variable between 0 and 1.
> I am considering simulating the multiple imputation with a beta regression to estimate the missing values.
> I would be very grateful for a yes/no opinion on this,
> Geomina
>
> --- On Thu, 4/11/13, Maarten Buis <maartenlbuis@gmail.com> wrote:
>
>> From: Maarten Buis <maartenlbuis@gmail.com>
>>
>> Geomina Turlea wrote:
>> > I have been struggling for a while to estimate missing data for the share of ICT professionals in total employment, in 59 industries, 27 EU countries, and 14 years.
>> > These data exist in the European Labour Force Survey, but the dataset is incomplete.
>> >
>> > 1. Can I use mi impute with proportions?
>> > 2. I used betafit to fit a distribution with values between 0 and 1. Then I imputed the missing values from the estimated beta distribution. Is this method superior/inferior to using mi impute?
>> > 3. I tried to use the Kolmogorov-Smirnov test, but I don't know what I got wrong. Below is a sequence where I created a variable with a beta distribution and then tested the hypothesis with the K-S test. The test rejects the null hypothesis that the data have the distribution I used to create them. How could that be?
>> >
>> > . gen x=rbeta(0.05, 1.77)
>> > . ksmirnov x=rbeta(0.05, 1.77)
>>
>> My first step would be to look at the industries with missing values. Sometimes missing just means 0 or negligible, and looking at the industries would give you a fair guess of whether that is the case. If that is the case your imputation problem reduces to just a recoding problem.
>>
>> For questions 2 and 3: If you have an imputation problem, then you should use -mi- and not -betafit- (available from SSC), because that is what -mi- was designed for.
>>
>> For question 3: -rbeta()- gives you random numbers from a beta distribution, so that is definitely not something you want to feed into -ksmirnov-. I would just use either -margdistfit- or -hangroot- (also available from SSC) after -betafit- to check the fit.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
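
Two points from the thread can be sketched in Stata. This is an illustration under stated assumptions, not code posted by anyone in the thread. First, the one-sample form of -ksmirnov- expects the hypothesized cumulative distribution function evaluated at the variable on the right-hand side, so a second call to -rbeta()- there just supplies fresh random draws; the sketch uses the built-in cumulative beta function ibeta(a,b,x) instead. Second, the -glm- with -link(logit)- suggestion is shown in its common "fractional logit" form; family(binomial) and vce(robust) are my assumptions about the usual companion options, and the variables share and x1 are hypothetical simulated data.

```stata
* Sketch only: simulated data; share and x1 are hypothetical names.
clear
set seed 12345
set obs 1000

* Draws from a beta(0.05, 1.77) distribution, as in the question.
* Stored as double because many draws lie extremely close to 0.
generate double x = rbeta(0.05, 1.77)

* One-sample K-S test: the right-hand side is the hypothesized CDF
* evaluated at x, not another set of random draws.
ksmirnov x = ibeta(0.05, 1.77, x)

* A continuous proportion in (0,1) and one covariate, both simulated.
generate double x1    = rnormal()
generate double share = rbeta(2, 5)

* -glm- with a logit link for a continuous proportion.
* family(binomial) and vce(robust) are assumptions, not from the thread.
glm share x1, family(binomial) link(logit) vce(robust)
```

Whether such a model can be used inside -mi impute- is left open in the thread, and this sketch does not address it.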