Re: st: A correlation matrix after multiple imputation

From	[email protected] (Wesley D. Eddings, StataCorp)
To	[email protected]
Subject	Re: st: A correlation matrix after multiple imputation
Date	Mon, 26 Jul 2010 17:14:05 -0500

On Friday 23 July 2010, Alan Acock <[email protected]> asked if there is an easy
way to obtain a multiple-imputation estimate of a correlation matrix in Stata:

> Is there an easy way to obtain the 20 correlation matrices, one for each of
> the 20 imputed datasets and then somehow pooling these?

There is no automatic way of doing this, but with some programming effort,
Alan can use -mi estimate- to obtain an MI estimate of the correlation matrix;
see the code at the end of this post.

However, from a statistical standpoint, it is not clear that averaging the
completed-data sample correlation matrices across the imputed datasets is the
best way to account for missing data when estimating a correlation matrix.

One alternative is to consider reporting an EM estimate of the covariance (or
correlation) matrix adjusted for missing data.  Such an estimate can be
obtained from -mi impute mvn-, because -mi impute mvn- uses the EM algorithm
to get starting values of the parameters for the MCMC procedure.  The EM
estimates of the coefficients and the variance-covariance matrix are saved
after -mi impute mvn- in the -r(Beta_em)- and -r(Sigma_em)- matrices,
respectively.  To obtain EM estimates only, without producing imputations,
specify the -emonly- option with -mi impute mvn-; see the example below.


-- Wes                  -- Yulia
[email protected]      [email protected]


=================== EXAMPLES ================================================

Here is how you can obtain an EM estimate of the correlation matrix accounting
for missing data:

/****************** begin do file ******************/
sysuse auto, clear
set seed 12345
replace mpg = . if runiform()>0.9
mi set wide
mi register imputed mpg weight
mi impute mvn mpg weight, emonly
mat Sigma = r(Sigma_em) /* save EM estimate of the variance-covariance (VC) matrix */
_getcovcorr Sigma, corr shape(full)   /* convert VC to a correlation matrix */
mat C = r(C)
matlist C
/*************** end do file *****************/
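
The conversion from a variance-covariance matrix to a correlation matrix
divides each element by the square root of the product of the two
corresponding diagonal elements.  As a small check of that step, here is a
sketch that uses the documented matrix function corr() in place of
-_getcovcorr-.  The 2x2 matrix is made up for illustration and merely stands
in for r(Sigma_em); the names x and y are placeholders too.

```stata
* Hypothetical VC matrix standing in for r(Sigma_em) (values are made up)
matrix Sigma = (4, 3 \ 3, 9)
matrix rownames Sigma = x y
matrix colnames Sigma = x y

* corr() rescales a variance matrix to a correlation matrix:
* C[i,j] = Sigma[i,j] / sqrt(Sigma[i,i]*Sigma[j,j])
matrix C = corr(Sigma)
matlist C
```

Here the off-diagonal element is 3/sqrt(4*9) = 0.5.  With real data, replace
the first line with -mat Sigma = r(Sigma_em)- as in the do-file above.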

Here is how you can obtain an MI estimate of the correlation matrix:

/***** begin MI correlation ******************/
cap program drop ecorr
program ecorr, eclass
	version 11
	syntax [varlist] [if] [in] [aw fw] [, * ]
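	if (`"`weight'"'!="") {
		/* weights must be enclosed in brackets when passed on */
		local wgt "[`weight'`exp']"
	}
	marksample touse
	correlate `varlist' `if' `in' `wgt', `options'
	tempname b V
	/* stack the lower triangle of r(C) into a row vector */
	mata: st_matrix("`b'", vech(st_matrix("r(C)"))')
	local p = colsof(`b')
	/* zero matrix as a placeholder variance; only point estimates are used */
	mat `V' = J(`p',`p',0)
	local cols: colnames `b'
	mat rownames `V' = `cols'
	eret post `b' `V' `wgt', obs(`=r(N)') esample(`touse')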
	eret local cmd ecorr
	eret local title "Lower-diagonal correlation matrix"
	eret local vars "`varlist'"

end

cap program drop micorr
program micorr, rclass
	tempname esthold
	_estimates hold `esthold', nullok restore
	qui mi estimate, cmdok: ecorr `0'
	tempname C_mi
	mata: st_matrix("`C_mi'", invvech(st_matrix("e(b_mi)")'))
	mat colnames `C_mi' = `e(vars)'
	mat rownames `C_mi' = `e(vars)'
	di
	di as txt "Multiple-imputation estimate of the correlation matrix"	
	di as txt "(obs=" string(e(N_mi),"%9.0g") ")"
	matlist `C_mi'
	return clear
	ret matrix C_mi = `C_mi'
end

sysuse auto, clear
set seed 12345
replace mpg = . if runiform()>0.9
mi set wide
mi register imputed mpg weight
mi impute mvn mpg weight, add(20)
micorr mpg weight
/***** end MI correlation ********************/
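
For reference, the point estimate that -micorr- displays is the usual MI
combination of the completed-data estimates, that is, their simple average.
In the notation below (mine, not Stata's), r_jk^(m) is the sample correlation
between variables j and k in the m-th imputed dataset, and M is the number of
imputations (20 in the example above):

```latex
\bar{r}_{jk} \;=\; \frac{1}{M} \sum_{m=1}^{M} r_{jk}^{(m)},
\qquad M = 20 .
```

Note that -ecorr- posts a matrix of zeros as the completed-data variance, so
only the pooled point estimates in e(b_mi) are meaningful; any standard errors
that -mi estimate- computes from these results should not be interpreted.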