stepwise regression is needed. Say we have n = 200 and a pool of 50
potential predictors, and say that each of these 50 predictors has 1 or 2
missing values, not necessarily at random. Using the Stata stepwise procedure, we
may well end up with a final model with some 5 variables, but this model
was derived using only around 75% of the sample (any observation missing
a value on any of the 50 candidates is dropped), and most likely not a
random subset. Would it not be wiser to use all available observations at
each step? Intuitively I feel that the resulting model might be less biased
because it does not throw away as much information (about 1% vs 25%),
although I believe this would be quite difficult to prove mathematically.
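
To put rough numbers on the 1% vs 25% comparison, here is a minimal sketch in Python (not Stata). The function name `simulate_missingness` is made up for illustration, and the missing values are placed independently at random, which is an assumption; the scenario above says they are not necessarily random. It counts how many of the 200 observations survive casewise deletion across all 50 candidates, and how many remain for any single predictor.

```python
import random

def simulate_missingness(n_obs=200, n_predictors=50, seed=0):
    """Return (complete_cases, smallest_per_predictor_sample).

    Assumption: each predictor loses 1 or 2 values in randomly chosen rows.
    """
    rng = random.Random(seed)
    rows_with_any_missing = set()
    smallest_available = n_obs
    for _ in range(n_predictors):
        n_missing = rng.choice([1, 2])
        missing_rows = rng.sample(range(n_obs), n_missing)
        # Casewise deletion drops a row if ANY candidate is missing there.
        rows_with_any_missing.update(missing_rows)
        # Using the predictor alone only loses its own missing rows.
        smallest_available = min(smallest_available, n_obs - n_missing)
    complete_cases = n_obs - len(rows_with_any_missing)
    return complete_cases, smallest_available

complete, per_predictor = simulate_missingness()
print(f"complete cases: {complete} of 200 ({complete / 200:.0%})")
print(f"smallest single-predictor sample: {per_predictor} of 200")
```

In the worst case, where no two missing values fall in the same row, up to 100 observations are lost to casewise deletion, while any one predictor on its own never loses more than 2.
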
One of the concerns with stepwise is that a different sample could
easily lead to different variables being selected. That concern
would seem to be even greater with a small sample, where the
estimates are going to be less precise, i.e. two different samples of
200 could easily lead to two different sets of variables being
selected, especially if many of the variables have similar
correlations with the outcome.
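
A small simulation can illustrate this instability. The sketch below is in Python and is not Stata's stepwise procedure. It mimics only the first step of a forward selection, picking the predictor with the largest absolute correlation with the outcome. The setup is assumed for illustration: 5 independent predictors with identical true effects, an effect size of 0.3, and the hypothetical helper names `pearson` and `first_pick`. It draws 100 fresh samples of 200 and tallies which predictor wins each time.

```python
import random
from collections import Counter

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def first_pick(rng, n_obs=200, n_predictors=5):
    """Index of the predictor most correlated with y in one new sample."""
    xs = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_predictors)]
    # Assumption: every predictor has the same true effect (0.3) on y.
    y = [0.3 * sum(col[i] for col in xs) + rng.gauss(0, 1) for i in range(n_obs)]
    corrs = [abs(pearson(col, y)) for col in xs]
    return corrs.index(max(corrs))

rng = random.Random(0)
winners = Counter(first_pick(rng) for _ in range(100))
print(winners)  # how often each predictor index was selected first
```

When the true correlations are this close, the tally spreads across several predictors rather than settling on one, which is the point being made about different samples of 200.
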