Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Imputing for missing proportions
From
Nick Cox <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: st: Imputing for missing proportions
Date
Fri, 12 Apr 2013 15:51:36 +0100
Good point. My comment was an easy shot, perhaps a cheap one.
But I don't think I ever imply that ignoring missing data is an
ideal solution; I know that it just ignores the problem.
Nick
[email protected]
On 12 April 2013 15:44, Alan Acock <[email protected]> wrote:
> Nick is right that missing at random is a tough assumption, but it is easier to satisfy than missing completely at random, which is what listwise (casewise) deletion requires.
> Alan Acock
>
> On Apr 12, 2013, at 3:49 AM, Nick Cox <[email protected]> wrote:
>
>> Well, imputation of missing values is vastly oversold anyway. Missing
>> at random? I don't (usually) believe it. (Highly unofficial opinion.)
>> Nick
>> [email protected]
>>
>>
>> On 12 April 2013 11:44, Geomina Turlea <[email protected]> wrote:
>>> I know, but -mi impute- does not support -glm- either.
>>>
>>> _________________________________________
>>> Geomina Turlea
>>> Everyone who dreams becomes an artist
>>>
>>>
>>> --- On Fri, 4/12/13, Nick Cox <[email protected]> wrote:
>>>
>>>> From: Nick Cox <[email protected]>
>>>> Subject: Re: st: Imputing for missing proportions
>>>> To: "[email protected]" <[email protected]>
>>>> Date: Friday, April 12, 2013, 1:35 PM
>>>> I haven't looked at whether it mixes with -mi-, but -glm- with
>>>> -link(logit)- is a standard way to handle continuous proportions.
>>>>
>>>> Nick
>>>> [email protected]
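A minimal sketch of the -glm- approach mentioned above, run on simulated data. The message only names -link(logit)-; the family(binomial) and vce(robust) options, and the variable names share and x, are assumptions added here for illustration.

```stata
* Simulated data: a continuous proportion strictly between 0 and 1
* (share and x are hypothetical names, not from the thread)
clear
set seed 12345
set obs 500
generate double x = rnormal()
generate double share = invlogit(-2 + 0.5*x + rnormal(0, 0.5))

* Logit link keeps fitted values inside (0, 1); Stata notes that the
* response is noninteger and carries on. Robust standard errors are an
* assumption here, commonly paired with this setup.
glm share x, family(binomial) link(logit) vce(robust)

* Predicted proportions on the original scale
predict double share_hat, mu
summarize share share_hat
```

The point of the logit link is only that predictions cannot fall outside the unit interval; whether this model suits the ICT share data is a separate question.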
>>>>
>>>>
>>>> On 12 April 2013 11:08, Geomina Turlea <[email protected]> wrote:
>>>>> Maarten,
>>>>> Thank you very much for your answer.
>>>>> The problem with -mi impute- is that it does not really have an
>>>>> option for regressing proportions. I can't really use truncated
>>>>> regression, and my dependent variable is not binary or categorical,
>>>>> but a continuous variable between 0 and 1.
>>>>> I am considering simulating the multiple imputation with a beta
>>>>> regression to estimate the missing values.
>>>>> I would be very grateful for a yes/no opinion on this,
>>>>> Geomina
>>>>>
>>>>>
>>>>> --- On Thu, 4/11/13, Maarten Buis <[email protected]> wrote:
>>>>>
>>>>>> From: Maarten Buis <[email protected]>
>>>>
>>>> Geomina Turlea wrote:
>>>>
>>>>>>> I have been struggling for a while to estimate missing data for
>>>>>>> the share of ICT professionals in total employment, in 59
>>>>>>> industries, 27 EU countries, and 14 years.
>>>>>>> These data exist in the European Labour Force Survey, but the
>>>>>>> dataset is incomplete.
>>>>>>>
>>>>>>> 1. Can I use mi impute with proportions?
>>>>>>> 2. I used betafit to fit a distribution with values between 0
>>>>>>> and 1. Then I imputed the missing values from the estimated beta
>>>>>>> distribution. Is this method superior/inferior to using mi impute?
>>>>>>> 3. I tried to use the Kolmogorov-Smirnov test, but I don't know
>>>>>>> what I got wrong. Below is a sequence where I created a variable
>>>>>>> with a beta distribution and then tested the hypothesis with the
>>>>>>> K-S test. The test rejects the null hypothesis that the data have
>>>>>>> the distribution I used to create them. How could that be?
>>>>>>>
>>>>>>> . gen x=rbeta(0.05, 1.77)
>>>>>>> . ksmirnov x=rbeta(0.05, 1.77)
>>>>
>>>>>> My first step would be to look at the industries with missing
>>>>>> values. Sometimes missing just means 0 or negligible, and looking
>>>>>> at the industries would give you a fair guess of whether that is
>>>>>> the case. If that is the case, your imputation problem reduces to
>>>>>> just a recoding problem.
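A minimal sketch of that first step, on a made-up toy dataset. The variable names (industry, share, share_rc), the values, and the decision about which industries count as "really zero" are all hypothetical; the real judgement has to come from knowing the industries.

```stata
* Toy data, invented for illustration only
clear
input str12 industry double share
"Software"   0.31
"Telecoms"   0.18
"Fishing"    .
"Forestry"   .
"Banking"    0.07
end

* Which industries have missing shares?
tabulate industry if missing(share)

* Hypothetical judgement: in these industries missing means zero,
* so recode into a new variable and leave the original untouched
generate double share_rc = share
replace share_rc = 0 if missing(share) & inlist(industry, "Fishing", "Forestry")

list industry share share_rc, noobs
```

Keeping the recoded values in a separate variable makes it easy to see later which zeros were observed and which were assigned.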
>>>>>>
>>>>>> For questions 2 and 3: If you have an imputation problem, then you
>>>>>> should use -mi- and not -betafit- (available from SSC), because
>>>>>> that is what -mi- was designed for.
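One way -mi- could be applied to a proportion is sketched below, on simulated data. This is not a method proposed in the thread: imputing on the logit scale with -mi impute regress- and back-transforming is an assumption made here for illustration, as are the variable names (share, lshare, share_imp, x) and the choice of 5 imputations. It only works as written when the observed shares are strictly between 0 and 1, because logit() is missing at exactly 0 or 1.

```stata
* Simulated proportion with roughly 20% of values set to missing
clear
set seed 2013
set obs 400
generate double x = rnormal()
generate double share = invlogit(-2 + 0.6*x + rnormal(0, 0.4))
replace share = . if runiform() < 0.2

* Work on the logit scale so imputed values map back into (0, 1)
generate double lshare = logit(share)

mi set wide
mi register imputed lshare
mi register regular x
mi impute regress lshare x, add(5) rseed(2013)

* Back-transform each imputation to the proportion scale
mi passive: generate double share_imp = invlogit(lshare)
mi describe
```

Analyses would then be run with -mi estimate- on the imputed data; whether a normal model on the logit scale is adequate for a given dataset needs checking.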
>>>>>>
>>>>>> For question 3: -rbeta()- gives you random numbers from a beta
>>>>>> distribution, so that is definitely not something you want to feed
>>>>>> into -ksmirnov-. I would just use either -margdistfit- or
>>>>>> -hangroot- (also available from SSC) after -betafit- to check the
>>>>>> fit.
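To make the point about -ksmirnov- concrete: the one-sample form expects the hypothesized cumulative distribution function evaluated at the data on the right-hand side, not a second set of random draws. A minimal sketch, assuming the built-in ibeta(a,b,x) function as the beta CDF; the seed, the sample size, and storing x as double (so that very small draws are not rounded) are choices made here, not taken from the thread.

```stata
* Draw from Beta(0.05, 1.77), then test against that same distribution
clear
set seed 20130412
set obs 1000
generate double x = rbeta(0.05, 1.77)

* Right-hand side is the theoretical CDF evaluated at x
ksmirnov x = ibeta(0.05, 1.77, x)

* Stored results: combined K-S statistic and its p-value
display "D = " r(D) "   p = " r(p)
```

With rbeta() on the right-hand side, the data are compared with fresh random numbers rather than with a distribution function, which explains a rejection even when x was generated from the intended beta distribution.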
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> * http://www.ats.ucla.edu/stat/stata/