Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Imputation of missing data in an unbalanced panel using ICE
From
Richard Williams <[email protected]>
To
[email protected], [email protected]
Subject
Re: st: Imputation of missing data in an unbalanced panel using ICE
Date
Fri, 25 Oct 2013 12:04:15 -0500
At 09:09 AM 10/25/2013, James Bernard wrote:
Thanks Antonis,
How about taking the average of the imputations for an observation?
Let's say we have 7 imputations (m=7). Then for a particular
observation, we could take the average of the 7 imputed values?
Does this work?
When there is no clear cut statistical solution I personally am open
to improvisation. There are plenty of things where you don't need
accuracy to 12 decimal places. You just need to be in the ballpark.
So, you might try one imputation, a few imputations or all the
imputations. You might report, say, that the R^2 statistics or the
BIC statistics or whatever ranged between this and that. Another
possibility is to run a diagnostic test on different imputations
and see whether it always leads to the same conclusion. If you get
conflicting results or borderline results you have to worry more, but
if it is a clear-cut decision no matter what you do then don't worry
about it too much.
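
The suggestion above (fit the same model on each imputation and report the range of a statistic) might be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the variable names y, x and z and the scalars r2_lo and r2_hi are hypothetical, and official -mi impute- is swapped in for -ice- so that the example runs on its own.

```stata
* Simulated toy data (hypothetical variables, not from the thread)
clear
set seed 12345
set obs 200
generate z = rnormal()
generate x = 0.5*z + rnormal()
generate y = 1 + 2*x + z + rnormal()
replace x = . if runiform() < 0.3      // make about 30% of x missing

* Official -mi- stands in for -ice- here; flong keeps one full copy of
* the data per imputation, indexed by _mi_m (0 = original data)
mi set flong
mi register imputed x
mi impute regress x y z, add(7) rseed(2013)

* Fit the model on each imputation and track the range of R-squared
scalar r2_lo = 1
scalar r2_hi = 0
forvalues j = 1/7 {
    quietly regress y x z if _mi_m == `j'
    scalar r2_lo = min(r2_lo, e(r2))
    scalar r2_hi = max(r2_hi, e(r2))
}
display "R-squared across imputations: " %5.3f r2_lo " to " %5.3f r2_hi
```

If the range is narrow, the choice of imputation matters little for that statistic; a wide range is the "worry more" case.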
Thanks
James
On Fri, Oct 25, 2013 at 9:41 PM, A Loumiotis
<[email protected]> wrote:
> I would first create a dummy that will be used to tell -ice- which
> values to impute:
>
> *****
> clear
> input str1 Firm Year X
> "A" 2000 .
> "A" 2001 10
> "A" 2002 6
> "A" 2003 .
>
> "B" 1998 3
> "B" 1999 .
> "B" 2000 .
> "B" 2001 4
> "B" 2002 6
> "B" 2003 2
> end
>
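> * Mark originally missing values as .a so they differ from the
> * system missing (.) that -reshape wide- creates for absent years
> replace X=.a if X==.
> reshape wide X, i(Firm) j(Year)
> foreach v of varlist X* {
> * Indicator: 1 if the firm-year belongs to the panel, 0 otherwise
> gen c`v'=`v'!=.
> * Placeholder for absent years, which should not be imputed
> replace `v'=0 if c`v'==0
> }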
> ******
>
> I would then run -ice- using the -conditional()- option (you should
> fill in the remaining parts of the -ice- command):
> ice ..., conditional(X1998:cX1998==1, ...)
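
Once the indicators exist, they can also be used to get back to long form without the firm-years that were never part of the panel. The following is a minimal sketch of that round trip on the toy data above; the -ice- call itself is left out, so no values are actually imputed here. With stacked imputations, i() would also need to include whatever variable identifies the imputation.

```stata
* Toy panel from the example above
clear
input str1 Firm Year X
"A" 2000 .
"A" 2001 10
"A" 2002 6
"A" 2003 .
"B" 1998 3
"B" 1999 .
"B" 2000 .
"B" 2001 4
"B" 2002 6
"B" 2003 2
end

* .a marks values to impute; plain . will mark absent firm-years
replace X = .a if X == .
reshape wide X, i(Firm) j(Year)
foreach v of varlist X* {
    generate byte c`v' = `v' != .
    replace `v' = 0 if c`v' == 0
}

* (The -ice- call would go here; it is omitted in this sketch.)

* Back to long form, keeping only firm-years that belong to the panel
reshape long X cX, i(Firm) j(Year)
drop if cX == 0
drop cX
list, sepby(Firm)
```

Firm A ends up with rows for 2000-2003 only, so nothing is carried along for 1998 and 1999.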
>
> I don't think it is a good idea to use only the results from the first
> imputation because your estimates will underestimate the true
> variance.
>
> Antonis
>
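
The point about a single imputation underestimating the variance can be seen from Rubin's combining rules. The notation below is the usual textbook one and is not taken from the thread: \hat{Q}_j is the estimate from imputation j and U_j is its estimated variance.

```latex
\bar{Q} = \frac{1}{m}\sum_{j=1}^{m}\hat{Q}_j, \qquad
W = \frac{1}{m}\sum_{j=1}^{m} U_j, \qquad
B = \frac{1}{m-1}\sum_{j=1}^{m}\bigl(\hat{Q}_j-\bar{Q}\bigr)^2, \qquad
T = W + \Bigl(1+\frac{1}{m}\Bigr)B
```

With m = 1 the between-imputation variance B cannot be estimated, so reporting U_1 alone leaves out the (1 + 1/m)B term that reflects uncertainty about the missing values. Averaging the 7 imputed values for each observation and then analyzing one data set runs into the same problem, since it also yields a single within-data variance.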
> On Fri, Oct 25, 2013 at 2:46 PM, James Bernard
> <[email protected]> wrote:
>> Hi all,
>>
>> I have been using imputation techniques. Stata offers a wide range of
>> commands to conduct imputation.
>>
>> I have an unbalanced panel data set. Several variables have missing values.
>> To benefit from the fact that the available observation of a variable
>> at certain times can help estimate the missing values at other times,
>> I changed the format of my data from long to wide and used ICE using
>> the instruction from this site:
>> http://www.ats.ucla.edu/stat/stata/faq/mi_longitudinal.htm
>>
>> These instructions work for a balanced panel data set where all firms
>> are supposed to have values in all years.
>>
>> But, imagine that one firm has to have values from 2000-2003, and
>> another from 1998-2003. And, suppose we have a variable (X) for which
>> some observations across these two firms are missing:
>>
>> Firm Year X
>> --------- --------- -------
>> A 2000 .
>> A 2001 10
>> A 2002 6
>> A 2003 .
>>
>> B 1998 3
>> B 1999 .
>> B 2000 .
>> B 2001 4
>> B 2002 6
>> B 2003 2
>>
>> Reshaping the data from long to wide would create 6 new
>> variables named "X1998", "X1999", ..., "X2003", and the values of X1998
>> and X1999 would be missing for firm A.
>>
>> Running ICE would then predict values for X1998 and X1999 for
>> both firm A and firm B.
>>
>> The next step is to get the data into long form and run the -mi-
>> commands for estimation, which use Rubin's rules to combine
>> the results from the m imputations.
>>
>> One may argue that I can let the ICE predict the values of X1998 and
>> X1999 for firm A. Reshape the data into long format and remove the
>> values of X from firm A in 1998 and in 1999, because firm A is not
>> supposed to have values in 1998 and 1999.
>>
>> My question is: Does asking ICE to predict values of X1998 and X1999
>> for firm A affect the way it predicts the value of X2000 (which is the
>> main observation we have to impute)?
>>
>> Does the technique I used make sense?
>>
>> Also, how wrong is it to use only the first imputation (M=1) to run the
>> model, instead of using all the imputations?
>>
>> Thanks,
>> James
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: [email protected]
WWW: http://www.nd.edu/~rwilliam