Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: generating splines in variable with missing data and multiple imputation
From: Maarten Buis <[email protected]>
To: [email protected]
Subject: Re: st: generating splines in variable with missing data and multiple imputation
Date: Tue, 26 Feb 2013 09:41:27 +0100
The problem is not the standard error but your effect: you have
imputed the values for pkyr assuming a linear effect, so the effect
you get out of your final model will be biased towards a linear
effect. Passive imputation is a technique that sounds plausible but is
controversial. The alternative that is often proposed is to create
your splines (and interactions, and polynomials, and ...) before
imputing the data and to treat them as just another variable to be
imputed. Some say that this actually performs better than passive
imputation (Graham 2009, von Hippel 2009), and it has the additional
advantage of being easy to implement.
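
A minimal sketch of that alternative might look like the code below.
It assumes the outcome is stored in a variable called cancer and that
there is one complete covariate called age; both names are made up for
illustration, as are the seed and the choice of -mi impute mvn- as the
imputation method. With 3 knots, -mkspline- creates two spline terms
(pkyr1 and pkyr2), and both are missing wherever pack_years is missing.

```stata
* Sketch only: cancer and age are hypothetical variable names
* Create the spline terms BEFORE imputing; they are missing
* whenever pack_years is missing
mkspline pkyr = pack_years, cubic nknots(3)

* Declare the mi data and register the spline terms as
* ordinary variables to be imputed
mi set wide
mi register imputed pkyr1 pkyr2
mi register regular cancer age

* Impute both spline terms as "just another variable";
* the seed is arbitrary, the outcome is part of the imputation model
mi impute mvn pkyr1 pkyr2 = cancer age, add(20) rseed(12345)

* Fit the analysis model on the imputed data and combine the results
mi estimate: logit cancer pkyr1 pkyr2 age
```

Note that in this approach the imputed pkyr2 is not forced to be the
exact spline transformation of the imputed pkyr1; that is by design.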
Hope this helps,
Maarten
John Graham (2009) "Missing Data Analysis: Making it Work in the Real
World", Annual Review of Psychology, 60:549-576.
Paul von Hippel (2009) "How to impute interactions, squares, and other
transformed variables", Sociological Methodology, 39:265-291.
On Mon, Feb 25, 2013 at 9:52 PM, Deppen, Steve
<[email protected]> wrote:
> I'm using Stata v12 and I have a small (492) dataset with missing data. One of the variables, pack-years, has a non-linear relationship to the outcome of cancer. Pack-years is best modeled, given my limited degrees of freedom for other variables of interest, as a restricted cubic spline with 3 knots. I'm missing data within pack-years. I can run:
>
> mkspline pkyr = pack_years, cubic nknots(3)
>
> after I generate my 20 imputed datasets. However, I believe that my confidence interval may be incorrect. I know that in R, variance inflation due to imputing the nonlinear variable is maintained using aregImpute and subsequent fit.mult.impute. I am afraid my standard errors are too small since I estimated the splines outside the imputation. Is there a way to generate splines as a passive variable within the multiple imputation?
>
> Thank you,
>
>
> Stephen Deppen MA MS
> Department of Thoracic Surgery
> Institute for Medicine and Public Health
> Vanderbilt University Medical Center
> (ph) 615-343-6284
> (fax) 615 936-3007
>
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
--
---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany
http://www.maartenbuis.nl
---------------------------------