st: Interactions and multiple-imputation
From: "Nic" <[email protected]>
To: <[email protected]>
Subject: st: Interactions and multiple-imputation
Date: Wed, 23 Mar 2011 20:34:31 -0400
Hello all,
In Alan C. Acock’s “A Gentle Introduction to Stata” (2010:367), it is
recommended to create interaction terms in the original dataset before
doing the multiple-imputation stage. That’s how I’ve proceeded thus far, but
I’m curious if I should in fact be doing so. I'll explain why below.
My survey dataset contains multiple measures of the same construct. For
example, 5 questions are used to measure the extent of childhood physical
abuse. In my non-multiply-imputed dataset I have created a single "physical
abuse" scale that is the average of the 5 component variables. I have a
small number of cases in which all 5 component variables are missing. I have
other cases in which the respondent answered some but not all of the 5
component questions. For these cases it seems as though I should be imputing
the missing values for the component variables and *then* creating the final
scale by averaging the complete sets of 5 questions. Otherwise, I will end
up with some cases in which the scale has a value but is based on averaging
fewer than 5 component questions, and those cases will not receive the
benefit of imputation.
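
To make the concern concrete, here is a minimal sketch in Python (not Stata) of the two ways a 5-item scale can be scored before imputation. The function names and the item values are invented for illustration; they are not taken from my dataset or from Acock's book.

```python
from statistics import mean

def scale_available(items):
    """Average whatever items were answered; None if none were answered."""
    answered = [x for x in items if x is not None]
    return mean(answered) if answered else None

def scale_complete(items):
    """Average only when all items were answered; otherwise None."""
    return mean(items) if all(x is not None for x in items) else None

# Hypothetical respondents; None marks an unanswered question.
respondents = {
    "all_answered": [1, 2, 3, 4, 5],
    "some_missing": [1, None, 3, None, 5],
    "all_missing": [None] * 5,
}

for label, items in respondents.items():
    print(label, scale_available(items), scale_complete(items))
```

Under the first rule the partially answered case gets a scale value and would never be touched by imputation; under the second it is left missing, so the scale itself (rather than its items) would have to be imputed.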
However, my interaction terms are the products of these types of scales
(like "physical abuse"). And as I mentioned at the beginning of this email,
the best advice according to Acock is to create interaction terms in the
original dataset and then impute the missing interaction terms.
So I cannot do it both ways. I can either:
1. Create my interaction terms in the original dataset from scales whose
component variables may themselves contain missing values, and then
impute the missing interaction terms.
or
2. Impute missing values in the original dataset with no scales or
interaction terms created. Then, with the multiply-imputed dataset, create
scales and then create interaction terms.
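
As a rough sketch of what option 2 would involve, the Python below (again, not Stata) takes a set of already-completed datasets and derives the scales and the interaction term separately within each one. It does not perform any imputation itself. The second scale ("neglect"), the variable names, and every value are hypothetical, and this only illustrates the order of operations, not which option is statistically preferable.

```python
from statistics import mean

# Hypothetical completed data for one respondent across m = 2 imputations.
# In the second imputation, one "abuse" item received a different imputed value.
imputations = [
    {"abuse": [1, 2, 3, 4, 5], "neglect": [2, 4]},
    {"abuse": [1, 3, 3, 4, 5], "neglect": [2, 4]},
]

def derive(row):
    """Build both scales from the complete items, then their product."""
    abuse = mean(row["abuse"])
    neglect = mean(row["neglect"])
    return {
        "abuse": abuse,
        "neglect": neglect,
        "abuse_x_neglect": abuse * neglect,
    }

# The derived variables are recomputed inside every imputed dataset.
derived = [derive(row) for row in imputations]
for d in derived:
    print(d)
```

The point of the sketch is that the scale and the interaction term differ across imputations because they are rebuilt from the imputed items each time, whereas under option 1 the interaction term would itself be one of the imputed variables.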
Option 2 seems to make more sense to me, but I thought it was a good idea to
post here before I defy the advice found in Acock's book. I also suspect
that the proper solution may be more complex than I realise.
With thanks,
Nic