
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: Interactions and multiple-imputation


From   "Nic" <[email protected]>
To   <[email protected]>
Subject   st: Interactions and multiple-imputation
Date   Wed, 23 Mar 2011 20:34:31 -0400

Hello all,

In Alan C. Acock’s “A Gentle Introduction to Stata” (2010:367), the recommendation is to create interaction terms in the original dataset before the multiple-imputation stage. That’s how I’ve proceeded thus far, but I’m not sure whether I should in fact be doing so. I'll explain why below.

My survey dataset contains multiple measures of the same construct. For example, 5 questions are used to measure the extent of childhood physical abuse. In my non-multiply-imputed dataset I have created a single "physical abuse" scale that is the average of the 5 component variables. I have a small number of cases in which all 5 component variables are missing. I have other cases in which the respondent answered some but not all of the 5 component questions. For these cases it seems as though I should be imputing the missing values for the component variables and *then* creating the final scale by averaging the complete sets of 5 questions. Otherwise, I will end up with some cases in which the scale has a value but is based on averaging fewer than the 5 component questions, so the unanswered questions in those cases will not receive the benefit of imputation.
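
To make the partial-response problem concrete, here is a minimal sketch in plain Python (chosen only for illustration; it is not Stata code). The item values and the function names `available_item_mean` and `complete_item_mean` are hypothetical and not taken from the actual dataset. It contrasts a scale that averages whatever items were answered with one that is defined only when all 5 items are present.

```python
def available_item_mean(items):
    """Average the answered items; return None if no item was answered."""
    answered = [x for x in items if x is not None]
    return sum(answered) / len(answered) if answered else None


def complete_item_mean(items):
    """Average only when every item was answered; otherwise return None."""
    if any(x is None for x in items):
        return None
    return sum(items) / len(items)


# Hypothetical responses to the 5 physical-abuse items (None = missing).
respondents = {
    "all_answered": [1, 2, 2, 3, 2],
    "partly_answered": [4, None, 5, None, 3],
    "none_answered": [None, None, None, None, None],
}

for label, items in respondents.items():
    n_answered = sum(x is not None for x in items)
    print(label, n_answered, available_item_mean(items), complete_item_mean(items))
```

In the "partly_answered" case the first function returns a value even though only 3 of the 5 items contribute to it. A scale built this way before imputation looks complete, so the two unanswered items never reach the imputation step.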

However, my interaction terms are the products of these types of scales (like "physical abuse"). And as I mentioned at the beginning of this email, the best advice according to Acock is to create interaction terms in the original dataset and then impute the missing interaction terms.

So I cannot do it both ways. I can either:
1. Create my interaction terms in the original dataset from scales whose component variables may themselves have missing values, and then impute the missing interaction terms.
or
2. Impute missing values in the original dataset with no scales or interaction terms created. Then, with the multiply-imputed dataset, create scales and then create interaction terms.
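
The difference in the order of operations can be sketched as follows, again in plain Python for illustration only. Everything here is an assumption made for the example: the second scale ("neglect"), the helper names `scale` and `interaction`, and the values in `imputed_copies`, which are made up to stand in for the output of an imputation step. No imputation is actually performed, and the sketch does not show which option is statistically preferable.

```python
def scale(items):
    """Mean of the answered items; None if no item was answered."""
    answered = [x for x in items if x is not None]
    return sum(answered) / len(answered) if answered else None


def interaction(a, b):
    """Product of two scales; missing if either scale is missing."""
    return None if a is None or b is None else a * b


# One hypothetical respondent in the original data (None = missing).
original = {
    "abuse": [4, None, 5, None, 3],
    "neglect": [None, None, None, None, None],
}

# Option 1: build scales and the product first. The product is missing
# here and would have to be imputed directly as its own variable.
option1 = interaction(scale(original["abuse"]), scale(original["neglect"]))

# Option 2: start from completed copies of the component items
# (made-up values standing in for imputation output), then build
# the scales and the product separately within each copy.
imputed_copies = [
    {"abuse": [4, 3, 5, 4, 3], "neglect": [2, 1, 2, 2, 3]},
    {"abuse": [4, 5, 5, 3, 3], "neglect": [1, 2, 2, 1, 1]},
]
option2 = [
    interaction(scale(copy["abuse"]), scale(copy["neglect"]))
    for copy in imputed_copies
]

print("option 1 interaction:", option1)
print("option 2 interactions per imputed copy:", option2)
```

Under option 2 every scale in every imputed copy is built from all 5 items, and the interaction term is always the exact product of the two scales in that copy. Under option 1 the imputed interaction term is not constrained to equal that product.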

Option 2 seems to make more sense to me, but I thought it was a good idea to post here before I defy the advice found in Acock's book. I also suspect that the proper solution may be more complex than I realise.

With thanks,
Nic

