Nick Cox wrote:
>This was asked a while back. I can't remember
>the answer but it should be in the archives.
>
Thanks (and to Richard Williams and Kit Baum) - I found the mail and now
understand the problem.
>A quite different comment is to ask how much
>better your imputation is based on 30+ predictors
>rather than 30.
>
The situation is that I have 17 biochemical variables, each measured twice (call
them A1, A2, ..., Q1, Q2). Each pair needs to be transformed into Xsum=X1+X2
and Xdiff=abs(X1-X2); these will then be used in a score test.
Sometimes the biological assays fail, so this transformation doubles my
missing-values problem (if X1 is missing but X2 is not, both Xsum and
Xdiff will be missing). So I wanted to -impute- the missing values for each
Xsum and Xdiff, conditional on all of A1-Q2.
Quite possibly the imputation would not be much different if I used 15 rather
than 17 predictor variables; I will investigate using -regress-. But as all
these biochemical measures relate to the same biochemical pathway, I am not
clear how to select "the best" 15 out of 17.
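
One way the -regress- check might look, for a single target and purely as an
illustration: it assumes Asum already exists, that the raw variables are
stored in the order A1, A2, ..., Q1, Q2, and that A1 is the assay that failed.
Dropping P and Q is arbitrary here, not a suggestion that they are the right
ones to leave out.

```stata
* Sketch only: compare the fit for Asum with and without two assays.
* Full model: the surviving A measurement plus all other assays.
regress Asum A2 B1-Q2
scalar rmse_full = e(rmse)

* Reduced model on the same observations, leaving out P1, P2, Q1, Q2.
regress Asum A2 B1-O2 if e(sample)
scalar rmse_reduced = e(rmse)

display "RMSE, all other assays: " rmse_full
display "RMSE, P and Q dropped:  " rmse_reduced
```

If the two root mean squared errors are close, the two dropped assays add
little to the prediction of Asum in the complete cases.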
C.
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/