You put it very well. I don't know the reason
for the original limit of 31, but I suspect it no
longer bites with modern hardware and software.
If you have enough data points and can handle the matrix
inversion, presumably you can push it much higher,
at least from the point of view of getting some
numbers to put in the gaps.
The real issues to me are nevertheless -- and
this is a rant aimed at no-one in particular,
and certainly not Renzo --
* The performance as the number of predictors
increases. What happens not just to R-sq and
kin but to standard errors etc.? (See the sketch
after this list.) If you can't do something
worthwhile with 31 (carefully chosen) predictors,
it is difficult to believe that 63 or 127 really
would produce a much better solution. (If they
do, you chose the wrong 31.)
* The more general pluses and minuses for what
you are doing, including reproducibility,
avoidance of circular argument and even
avoidance of self-delusion.
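To make the first point concrete, here is a minimal
sketch (invented data and old-style syntax; nothing here
comes from Renzo's setup): regress an outcome on
pure-noise predictors and watch what the summary
statistics do.

clear
set obs 100
set seed 20040101
gen y = invnorm(uniform())
forvalues j = 1/31 {
    gen x`j' = invnorm(uniform())
}
* with 5 noise predictors, R-sq is small, as it should be
regress y x1-x5
* with all 31, R-sq rises mechanically, but adjusted R-sq
* and the coefficient standard errors do not improve
regress y x1-x31

Nothing here is specific to -impute-; the point is only
that R-sq rewards you for adding predictors whether or
not they carry any information.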
Nick
[email protected]
Renzo Comolli
>
> I know this behavior is strictly "at my own risk". Anyway I
> (copied with a different name and) removed the limitation to
> 31 variables in impute.ado.
> It works with no waiting time at all even with 52 variables.
> I wonder whether StataCorp has been too risk-averse when
> they updated it from version 3.1 to version 8 of the ado.
>
> Has anybody had similar experiences of removing the limitation?
> From the explanation in the manual of what -impute- does,
> it is possible that I could get away with so many variables
> because almost all of them were dummies and therefore easy
> to order. (Counting the categorical variables before the
> dummy expansion, I am way below 15.)
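For anyone tempted to reproduce what Renzo describes, the
pattern is roughly the following sketch. The file copy is
real Stata; the check shown in the comments is an assumption
about what the hard-coded limit might look like, not a
quotation from impute.ado, whose internals are
version-specific, and -myimpute- is a made-up name.

findfile impute.ado
copy "`r(fn)'" myimpute.ado
* then, in myimpute.ado:
*   - rename -program define impute- to -program define myimpute-
*   - relax the cap, which will be something of the form
*         if `k' > 31 {
*             error 103    /* "too many variables specified" */
*         }
*     raising 31 to whatever you are prepared to defend

All of Renzo's caveats apply: strictly at your own risk.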