Sorry listers,
This was supposed to be a private email, and was sent to
the list by mistake. Sorry!
Weihua Guan
--- Weihua Guan <[email protected]> wrote:
> Hi Jeff,
>
> How's it going? I happened to read this post. It seems
> the method -impute- uses is out of date, and may give
> invalid inferences from the imputed data. Does Stata
> have a plan to implement multiple imputation?
>
> Weihua
>
> --- "Jeff Pitblado, StataCorp LP"
> <[email protected]> wrote:
> >
> > Renzo Comolli <[email protected]> asks about the limit on the number of
> > variables allowed by -impute-:
> >
> > > I know this behavior is strictly "at my own risk". Anyway I (copied with a
> > > different name and) removed the limitation to 31 variables in the impute.ado
> > > It works with no waiting time at all even with 52 variables.
> > > I wonder whether StataCorp has been too risk averse when they now updated it
> > > from version 3.1 to version 8 of the ado.
> >
> > > Anybody had similar experiences of removing the limitation?
> > > From the explanation in the manual of what -impute- does, it is possible
> > > that I could get away with so many variables because almost all of them
> > > were dummies and therefore easy to order. (counting the categorical
> > > variables before the dummy expansion I am way below 15)
> >
> > The -impute- command runs regressions by best-subset regression, looking at
> > the pattern of missing values in the predictors. It is conceivable that
> > -impute- must run a regression for each combination of the predictor
> > variables, depending upon the pattern of missingness.
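
To make the idea of "one regression per pattern of missingness" concrete, here is a minimal Python sketch. It is not how -impute- is written; the toy data, the variable names, and the helper `observed_pattern` are all hypothetical. It only counts how many distinct sets of observed predictors occur, since each distinct set would call for its own regression.

```python
from collections import Counter

# Hypothetical toy data: each row holds values for predictors x1, x2, x3;
# None marks a missing value.
rows = [
    (1.0, 2.0, 3.0),
    (1.5, None, 2.5),
    (None, None, 4.0),
    (2.0, 1.0, None),
    (0.5, None, 1.5),
]
names = ("x1", "x2", "x3")


def observed_pattern(row):
    """Return the names of the predictors that are non-missing in this row."""
    return tuple(n for n, v in zip(names, row) if v is not None)


# Each distinct pattern would need its own regression on the observed predictors.
patterns = Counter(observed_pattern(r) for r in rows)
for pattern, count in patterns.items():
    print(pattern, count)
```

With k predictors there can be at most 2^k such patterns, which is why the number of regressions can grow so quickly.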
> >
> > In order to enumerate all best-subset combinations, -impute- looks at the 0's
> > and 1's in the binary representation of a long integer. In Stata, a long
> > integer contains 32 bits--one of which is used for the sign. Thus each of the
> > remaining bits is used to identify whether to include a predictor variable in
> > a given regression, and increasing this limit beyond 31 will not have a
> > desirable result (even though the modified -impute- will not exit with an
> > error).
> >
> > To illustrate how -impute- determines which variables to include in a
> > regression, suppose there are 3 predictors and that the pattern of missing
> > values among them requires a regression for each combination. In this
> > worst-case scenario, there are 2^3 = 8 regressions to run. We can determine
> > which predictors to include in a regression by looking at the binary
> > representation of the regression index (starting from 0):
> >
> >     integer (base 10)    integer (binary)
> >             0                  000
> >             1                  001
> >             2                  010
> >             3                  011
> >             4                  100
> >             5                  101
> >             6                  110
> >             7                  111
> >
> > If the names of the predictor variables are x1 x2 and x3, we can interpret the
> > binary number like this
> >
> >       x3       x2       x1
> >     -------------------------
> >     <digit>  <digit>  <digit>
> >
> > Thus 001 means include x1, 011 means include x1 and x2, ...
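
The decoding described above can be sketched in a few lines of Python. This illustrates the bit-reading idea only, not the code in impute.ado; the function name `predictors_for` is made up for the example. The rightmost bit corresponds to x1, the next to x2, and so on.

```python
def predictors_for(index, names):
    """Decode a regression index: bit i set means names[i] is included."""
    return [name for i, name in enumerate(names) if (index >> i) & 1]


names = ["x1", "x2", "x3"]

# Walk through all 2^3 = 8 regression indices, starting from 0.
for index in range(2 ** len(names)):
    bits = format(index, "03b")  # binary representation, zero-padded to 3 digits
    print(index, bits, predictors_for(index, names))
```

Index 0 selects no predictors and index 7 selects all three, matching the table of binary values above.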
> >
> > Given this implementation, there has to be a limit on how many predictors are
> > allowed by the -impute- command before the generated -long integer- variable
> > becomes automatically -recast- to a -float- or -double-, thus breaking the
> > implementation.
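
One reason a -float- cannot stand in for the long integer is precision. A 4-byte single-precision float carries only 24 significant bits, so it cannot distinguish every 31-bit pattern. The Python sketch below shows that general property by round-tripping integers through a 4-byte float. Whether this is the exact way the modified -impute- goes wrong is an assumption, and `as_float32` is a helper written for this example.

```python
import struct


def as_float32(n):
    """Round-trip an integer through a 4-byte IEEE 754 single-precision float."""
    return int(struct.unpack("<f", struct.pack("<f", float(n)))[0])


small = 2 ** 24        # fits in 24 significant bits, so it survives the round trip
beyond = 2 ** 24 + 1   # needs 25 significant bits, so the lowest bit is lost

print(as_float32(small) == small)
print(as_float32(beyond) == beyond)
```

The second comparison fails because the lowest bit is rounded away. In the scheme above, that bit is the flag for one of the predictors.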
> >
> > By increasing the limit, all variables beyond the
> > first 31 (possibly fewer)
> > will not be used in any of the regressions.
> >
> > One way to get around this limit would be to add an option to -impute-, say
> > -nomissings()-, that will take a varlist. These variables will be assumed
> > missing-value-free so that they could be present in all regressions.
> >
> > We will look into adding this as a future update.
> >
> > --Jeff
> > [email protected]
> > *
> > * For searches and help try:
> > * http://www.stata.com/support/faqs/res/findit.html
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/