Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: Missing Observations. Do I need multiple Imputations?

From	Abekah Nkrumah <[email protected]>
To	[email protected]
Subject	Re: st: Missing Observations. Do I need multiple Imputations?
Date	Wed, 22 Aug 2012 08:32:38 +0100

Dear Antonis,

Thank you very much for your reply. I want to understand your first
line: were you saying that my aggregate variable is missing entirely? In
my message I said that the composite index (A), which you referred to as
the aggregate variable, is there but loses a substantial number of
observations. So it is not entirely missing.
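
A minimal sketch of this situation, written in Python rather than Stata. It assumes A is the unweighted geometric mean of the five sub-indices (the thread only says a geometric aggregation method was used); the function name and the toy rows are hypothetical. The aggregate is computed only for rows where every sub-index is observed, so A is present for some observations and missing for the rest.

```python
import math
from typing import Optional, Sequence


def geometric_index(components: Sequence[Optional[float]]) -> Optional[float]:
    """Unweighted geometric mean of the sub-indices (an assumed form of A).

    Returns None (missing) as soon as any component is missing, which is
    how a product-based aggregate behaves when a row is incomplete.
    """
    if any(c is None for c in components):
        return None
    return math.prod(components) ** (1.0 / len(components))


# Hypothetical rows of (B, C, D, E, F); None marks a missing sub-index.
rows = [
    (0.50, 0.70, 0.80, 0.50, 0.70),
    (0.40, None, 0.90, 0.60, 0.80),
    (0.60, 0.65, 0.75, None, None),
    (0.55, 0.60, 0.70, 0.50, 0.65),
]

index_a = [geometric_index(r) for r in rows]
n_observed = sum(a is not None for a in index_a)
n_missing = len(index_a) - n_observed

print(f"A observed for {n_observed} of {len(rows)} rows, missing for {n_missing}")
```

Only the two complete rows get a value of A, even though most individual cells are observed.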

Thanks very much

Regards

On Wed, Aug 22, 2012 at 7:44 AM, A Loumiotis
<[email protected]> wrote:
> Hi Gordon,
>
> Since your aggregate variable is missing whenever at least one component
> is missing, I believe you would first need to multiply impute the
> missing observations in your dataset and then compute your aggregate
> variable.  I don't see a problem with multiply imputing variables such
> as age or number of wives.  In addition, your results might change if
> your data are missing (conditionally) at random, even if your
> non-missing sample is large.
>
> Best,
> Antonis
>
>
>
> On Tue, Aug 21, 2012 at 7:18 PM, Abekah Nkrumah <[email protected]> wrote:
>> Dear Statalist,
>>
>>
>> I would like some advice on this rather long question. Variable A in
>> the table below is a composite index derived by aggregating
>> variables B, C, D, E and F, which are themselves sub-indices. A
>> geometric aggregation method was used. From the table I realise that
>> the number of observations on the composite index (A) drops
>> significantly.
>>
>>
>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>> -------------+--------------------------------------------------------
>>            A |     69623    .4898275    .1575975   .0498657   .8980919
>>            B |    187524     .524507    .2669241   1.80e-08          1
>>            C |    221089    .6625131    .3732415   2.18e-08          1
>>            D |    234680    .7486263    .3494941  -1.29e-08          1
>>            E |    108437    .5253285    .0648927  -2.61e-08          1
>> -------------+--------------------------------------------------------
>>            F |    119261    .6829314    .2270192  -1.62e-08          1
>>
>>
>> I then decided to do a missing data check for all the indices, and the
>> results are below.
>>
>>     Variable |     Missing          Total     Percent Missing
>> -------------+-----------------------------------------------
>>            A |     166,075        235,698          70.46
>>            B |      48,174        235,698          20.44
>>            C |      14,609        235,698           6.20
>>            D |       1,018        235,698           0.43
>>            E |     127,261        235,698          53.99
>>            F |     116,437        235,698          49.40
>> -------------+-----------------------------------------------
>>
>>
>> I then checked the percentage missing for all the individual variables
>> used in computing the sub-indices, especially B, C, E and F. The
>> results are below.
>>
>>
>>     Variable |     Missing          Total     Percent Missing
>> -------------+-----------------------------------------------
>>           B1 |      46,317        235,698          19.65
>>           B2 |      46,967        235,698          19.93
>>           B3 |      46,815        235,698          19.86
>>           B4 |      47,005        235,698          19.94
>>           C1 |       5,128        235,698           2.18
>>           C2 |       5,164        235,698           2.19
>>           C3 |       6,180        235,698           2.62
>>           C4 |       9,730        235,698           4.13
>>           C5 |       5,608        235,698           2.38
>>           D1 |         444        235,698           0.19
>>           D2 |         483        235,698           0.20
>>           D3 |         657        235,698           0.28
>>           E1 |      82,112        235,698          34.84
>>           E2 |      58,504        235,698          24.82
>>           E3 |      65,469        235,698          27.78
>>           E4 |      81,349        235,698          34.51
>>           F1 |         214        235,698           0.09
>>           F2 |      63,503        235,698          26.94
>>           F3 |      86,512        235,698          36.70
>>           F4 |         674        235,698           0.29
>> -------------+-----------------------------------------------
>>
>> The results above suggest that the drop in the number of observations
>> for the composite empowerment variable is due to the high level of
>> missing values in the four sub-indices (B, C, E and F) as also
>> supported by the high level of missing values in the variables used in
>> computing those indices.
>>
>> I was therefore wondering whether an explanation like this in the
>> appendix of my work will be fine, or whether I will need to use
>> multiple imputation to replace the missing data.
>>
>> I have thought this through, and the question I am asking myself is
>> this: if I have to do multiple imputation, the variables for the
>> imputation exercise will be the B variables (these are decision-making
>> variables), then the E variables (these are number of wives, age at
>> first marriage, woman's age, partner's age) and then F3 and F4 (which
>> are partner's education and whether a woman earns cash).
>>
>> My worry is whether it is sensible to impute variables such as
>> age and number of wives. Secondly, considering that I still have a
>> large sample to work with, my guess is that the results from the
>> remaining sample will not change that much. Thus I am wondering whether
>> it will still be necessary to impute the missing data.
>>
>> I would appreciate hearing from you on this so that I will know which
>> way to go. Thank you very much.
>>
>> Regards
>>
>> Gordon
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
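
The order of operations Antonis suggests (impute the missing components first, then compute the aggregate in each completed dataset) can be sketched as follows. This is a Python illustration rather than Stata code, and it rests on several assumptions. The imputation step is a naive random draw from the observed values of the same column, used only as a stand-in for a proper imputation model. The aggregate is assumed to be an unweighted geometric mean. The data, the seed, and the number of imputations `M` are made up.

```python
import math
import random
from statistics import mean

M = 5  # number of imputed datasets (hypothetical choice)

# Hypothetical rows of sub-indices (B, C, D, E, F); None marks a missing value.
data = [
    [0.50, 0.70, 0.80, 0.50, 0.70],
    [0.40, None, 0.90, 0.60, 0.80],
    [0.60, 0.65, 0.75, None, None],
    [0.55, 0.60, 0.70, 0.50, 0.65],
    [None, 0.80, 0.85, 0.55, 0.60],
]


def impute_once(rows, rng):
    """Fill each missing cell with a random draw from that column's observed values.

    A deliberately naive stand-in for a real imputation model.
    """
    n_cols = len(rows[0])
    observed = [[r[j] for r in rows if r[j] is not None] for j in range(n_cols)]
    return [
        [v if v is not None else rng.choice(observed[j]) for j, v in enumerate(r)]
        for r in rows
    ]


def geometric_index(components):
    """Unweighted geometric mean (assumed form of the composite index A)."""
    return math.prod(components) ** (1.0 / len(components))


rng = random.Random(2012)  # fixed seed so the sketch is reproducible

# Step 1: impute the components. Step 2: compute A in each completed dataset.
mean_a_per_imputation = []
for _ in range(M):
    completed = impute_once(data, rng)
    index_a = [geometric_index(r) for r in completed]
    mean_a_per_imputation.append(mean(index_a))

# Step 3: pool the point estimates by averaging across the imputed datasets.
pooled_mean_a = mean(mean_a_per_imputation)
print(f"pooled mean of A over {M} imputations: {pooled_mean_a:.4f}")
```

Because A is computed after imputation, every row contributes a value of A in every completed dataset, instead of only the rows that were complete to begin with.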



