Re: st: A correlation matrix after multiple imputation
From: [email protected] (Wesley D. Eddings, StataCorp)
To: [email protected]
Subject: Re: st: A correlation matrix after multiple imputation
Date: Mon, 26 Jul 2010 17:14:05 -0500
On Friday 23 July 2010, Alan Acock <[email protected]> asked if there is an easy
way to obtain a multiple-imputation estimate of a correlation matrix in Stata:
> Is there an easy way to obtain the 20 correlation matrices, one for each of
> the 20 imputed datasets and then somehow pulling these?
There is no automatic way of doing this, but with some programming effort,
Alan can use -mi estimate- to obtain an MI estimate of the correlation matrix;
see the code at the end of this post.
However, from a statistical standpoint, it is not clear whether averaging
completed-data sample estimates of the correlation matrix across the imputed
datasets is the best way to account for missing data when computing a
correlation matrix.
One alternative is to consider reporting an EM estimate of the covariance (or
correlation) matrix adjusted for missing data. Such an estimate can be
obtained from -mi impute mvn-, because -mi impute mvn- uses the EM algorithm
to get starting values of the parameters for the MCMC procedure. The EM
estimates of the coefficients and the variance-covariance matrix are saved
after -mi impute mvn- in the -r(Beta_em)- and -r(Sigma_em)- matrices,
respectively. To obtain EM estimates only, without producing imputations,
specify the -emonly- option with -mi impute mvn-; see the example below.
-- Wes -- Yulia
[email protected] [email protected]
=================== EXAMPLES ================================================
Here is how you can obtain an EM estimate of the correlation matrix accounting
for missing data:
/****************** begin do file ******************/
sysuse auto, clear
set seed 12345
replace mpg = . if runiform()>0.9
mi set wide
mi register imputed mpg weight
mi impute mvn mpg weight, emonly
mat Sigma = r(Sigma_em) /* save EM estimate of the variance-covariance (VC) matrix */
_getcovcorr Sigma, corr shape(full) /* convert VC to a correlation matrix */
mat C = r(C)
matlist C
/*************** end do file *****************/
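The conversion step above relies on -_getcovcorr-, which is an undocumented
helper. As a sketch of an alternative, and assuming the same setup as in the
do-file above, the documented matrix function -corr()- should give the same
correlation matrix from the EM covariance estimate. The matrix name C2 is
introduced here only for illustration.

```stata
sysuse auto, clear
set seed 12345
replace mpg = . if runiform()>0.9
mi set wide
mi register imputed mpg weight
mi impute mvn mpg weight, emonly
* save the EM estimate of the variance-covariance matrix
matrix Sigma = r(Sigma_em)
* corr() rescales a variance matrix to a correlation matrix
matrix C2 = corr(Sigma)
matlist C2
```

Each off-diagonal element of C2 is the EM covariance divided by the product of
the two EM standard deviations, so the diagonal should be 1.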
Here is how you can obtain an MI estimate of the correlation matrix:
/***** begin MI correlation ******************/
cap program drop ecorr
program ecorr, eclass
    version 11
    syntax [varlist] [if] [in] [aw fw] [, * ]
    /* build the bracketed weight expression only if weights were specified */
    if (`"`weight'"'!="") {
        local wgt [`weight'`exp']
    }
    marksample touse
    correlate `varlist' `if' `in' `wgt', `options'
    tempname b V
    /* stack the lower triangle of r(C) into a row vector */
    mata: st_matrix("`b'", vech(st_matrix("r(C)"))')
    local p = colsof(`b')
    /* zero matrix as a placeholder VCE; only the point estimates are pooled */
    mat `V' = J(`p',`p',0)
    local cols: colnames `b'
    mat rownames `V' = `cols'
    eret post `b' `V' `wgt', obs(`=r(N)') esample(`touse')
    eret local cmd ecorr
    eret local title "Lower-diagonal correlation matrix"
    eret local vars "`varlist'"
end
cap program drop micorr
program micorr, rclass
    /* preserve any estimation results currently in memory */
    tempname esthold
    _estimates hold `esthold', nullok restore
    qui mi estimate, cmdok: ecorr `0'
    /* rebuild the full symmetric matrix from the pooled lower triangle */
    tempname C_mi
    mata: st_matrix("`C_mi'", invvech(st_matrix("e(b_mi)")'))
    mat colnames `C_mi' = `e(vars)'
    mat rownames `C_mi' = `e(vars)'
    di
    di as txt "Multiple-imputation estimate of the correlation matrix"
    di as txt "(obs=" string(e(N_mi),"%9.0g") ")"
    matlist `C_mi'
    return clear
    ret matrix C_mi = `C_mi'
end
sysuse auto, clear
set seed 12345
replace mpg = . if runiform()>0.9
mi set wide
mi register imputed mpg weight
mi impute mvn mpg weight, add(20)
micorr mpg weight
/***** end MI correlation ********************/
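As a rough cross-check of what -micorr- reports, the MI point estimate is the
average of the completed-data estimates, so the off-diagonal element should
match the mean of the 20 completed-data correlations. The sketch below
assumes the wide style used above, where the m-th imputation of mpg is stored
in _m_mpg; the scalar name rsum is introduced here only for illustration.

```stata
sysuse auto, clear
set seed 12345
replace mpg = . if runiform()>0.9
mi set wide
mi register imputed mpg weight
mi impute mvn mpg weight, add(20)

* weight has no missing values, so the original variable is used directly
scalar rsum = 0
forvalues m = 1/20 {
    quietly correlate _`m'_mpg weight
    scalar rsum = rsum + r(rho)
}
display as txt "Average of the 20 completed-data correlations: " as res rsum/20
```

With more than two variables the same loop would need to accumulate the full
r(C) matrix instead of the single r(rho) value.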