Nick
Your imputation is correct. Thanks for the clarification.
Bill
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Thursday, April 15, 2004 3:03 PM
To: [email protected]
Subject: st: RE: Imputing values for categorical data
impute the missing reference here. In this case, it
happens to be a book I know about.
(In other cases, in other postings, just giving
author surnames and dates makes the reference search
difficult: list members please note.)
Statistical Analysis With Missing Data, Second Edition
Roderick J. A. Little, Donald B. Rubin
ISBN: 0-471-18386-5 Wiley
September 2002
Nick
[email protected]
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Dupont,
> William
> Sent: 15 April 2004 20:47
> To: [email protected]
> Subject: st: RE: Imputing values for categorical data
>
>
> Jennifer
>
> In my opinion, imputation makes the most sense when we wish to adjust
> for confounding variables. Suppose that I am primarily interested in
> the relationship between y and x, and I have complete data on these
> two variables from my data set. I feel, however, that I should adjust
> my analysis for a number of other confounding covariates and I know
> that missing values are scattered throughout these covariates. If I
> just regress y against x and these other covariates I get a complete
> case analysis: any record that is missing any value of these
> covariates is dropped from the analysis. This can lead to a
> substantial loss of power and has the potential to induce bias if
> having complete data is related to the response of interest. Suppose
> that one of my confounding
> variables is gender. If I have a number of records where y and x are
> known but gender is not, it does not seem sensible to throw out this
> information just because I would like to adjust my estimates for
> gender. If, however, I impute gender I can avoid losing these data.
> As long as
> gender is only in the model as a confounder, I don't see that it does
> much harm to have an imputed value of say .2 for some patient, which
> means that, based on her other covariates, she is 4 times more likely
> to be of one gender than the other.
>
> A tricky problem with imputation is that we often lack assurance that
> the missing values are missing at random. However, even in this
> situation, it is unclear that the complete case analysis is superior
> to an imputed analysis for the situation described above. Imputation
> becomes much more problematic when some variables of primary interest
> have missing values.
>
> The imputation gurus do not like the single conditional imputation
> provided by Stata (see for example Little and Rubin 2002). This is
> because this technique underestimates the standard error of the
> regression coefficient for covariates with imputed values and
> overestimates the degrees of freedom. Multiple imputation methods get
> around this problem and are fine as long as you are confident that the
> missing values are missing at random. If you are only using
> imputation for confounding variables I'm not convinced that it makes
> much difference how you do the imputation. However, multiple
> imputation is always theoretically preferable and can avoid hassles in
> the event that you come up against a referee who objects to all use
> of single conditional imputation.
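
For list members who want to see the two approaches side by side, here is a minimal sketch in Python, not Stata. It contrasts a complete case analysis (any record with a missing covariate is dropped) with a single conditional imputation of a 0/1 gender indicator. Everything in it is an assumption made for illustration: the toy data, the "group" covariate, and the function names. It also imputes with a simple within-group mean rather than a regression on several covariates, so it is a simplified stand-in, not what any Stata command does.

```python
# Toy records, all values made up for illustration. y and x are always
# observed; gender (coded 0/1) is sometimes missing (None). "group" is a
# hypothetical, fully observed covariate used to predict gender.
records = [
    {"y": 5.1, "x": 1.0, "group": "a", "gender": 0},
    {"y": 4.8, "x": 0.9, "group": "a", "gender": 0},
    {"y": 5.5, "x": 1.2, "group": "a", "gender": 1},
    {"y": 5.0, "x": 1.1, "group": "a", "gender": 0},
    {"y": 4.9, "x": 1.0, "group": "a", "gender": 0},
    {"y": 5.3, "x": 1.3, "group": "a", "gender": None},
    {"y": 6.2, "x": 2.0, "group": "b", "gender": 1},
    {"y": 6.0, "x": 1.9, "group": "b", "gender": 1},
    {"y": 6.4, "x": 2.2, "group": "b", "gender": 0},
    {"y": 6.1, "x": 2.1, "group": "b", "gender": None},
]


def complete_cases(rows, fields):
    """Complete case analysis: keep only rows with no missing value in fields."""
    return [r for r in rows if all(r[f] is not None for f in fields)]


def impute_group_mean(rows, target, by):
    """Single conditional imputation (simplified): fill a missing target
    with the mean of the observed target values that share the same `by` value."""
    totals, counts = {}, {}
    for r in rows:
        if r[target] is not None:
            totals[r[by]] = totals.get(r[by], 0.0) + r[target]
            counts[r[by]] = counts.get(r[by], 0) + 1
    filled = []
    for r in rows:
        r = dict(r)  # copy so the input rows are left unchanged
        if r[target] is None:
            r[target] = totals[r[by]] / counts[r[by]]
        filled.append(r)
    return filled


kept = complete_cases(records, ["y", "x", "gender"])
imputed = impute_group_mean(records, "gender", "group")

# The imputed value for the group "a" record is a fraction, read here as
# the estimated probability that gender == 1 given the other covariate.
p = imputed[5]["gender"]
odds = (1 - p) / p  # how many times more likely gender == 0 is than gender == 1

print(len(records), len(kept), len(imputed))
print(round(p, 3), round(odds, 3))
```

The complete case analysis keeps fewer records than the imputed data set, even though y and x are known for every record. Note that the imputed data set treats the filled-in values as if they had been observed, which is the source of the understated standard errors discussed above; multiple imputation addresses this by repeating the fill-in step with random draws and combining the results.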
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/