Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: running tab and ttests with multimple imputation
From: [email protected] (Wesley D. Eddings, StataCorp)
To: [email protected]
Subject: Re: st: running tab and ttests with multimple imputation
Date: Thu, 04 Nov 2010 17:10:30 -0500
Fernando Andrade <[email protected]> asked about computing Pearson
chi-square statistics and t-tests from multiply imputed data:
> i looked at the estimation options and seems
> there is not an option to run a chi2 for a contingency table.
> is there another option or do i have to compute it by hand? is there
> also an option top run t-test for multiple imputed data?
Fernando is correct that the -mi estimate- command does not support -tabulate-
or -ttest-: multiple imputation is better suited to statistical models than to
statistical tests per se. One alternative, as Fernando suggested, is to report
the model F statistic after fitting a model with the -mi estimate- prefix and
either -logit-, -mlogit-, or -regress-. Fernando may use the i. factor-variable
operator to include indicator variables for his categorical predictors.
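
A minimal sketch of this approach for a two-way table, assuming a hypothetical
dataset that has already been -mi set- and imputed, with a categorical row
variable rowvar and a categorical column variable colvar (both names are
illustrative and are not from Fernando's data). The model F test reported by
-mi estimate- tests whether the colvar indicators are jointly zero, which
plays a role analogous to the chi-square test of independence, although it is
not numerically the same statistic.

. * rowvar and colvar are hypothetical variable names
. mi estimate: mlogit rowvar i.colvar
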
Suppose we'd performed the following -ttest-:
. sysuse auto, clear
(1978 Automobile Data)
.
. ttest mpg, by(foreign)
Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
Domestic |      52    19.82692     .657777    4.743297    18.50638    21.14747
 Foreign |      22    24.77273     1.40951    6.611187    21.84149    27.70396
---------+--------------------------------------------------------------------
combined |      74     21.2973    .6725511    5.785503     19.9569    22.63769
---------+--------------------------------------------------------------------
    diff |           -4.945804    1.362162               -7.661225   -2.230384
------------------------------------------------------------------------------
    diff = mean(Domestic) - mean(Foreign)                         t =  -3.6308
Ho: diff = 0                                     degrees of freedom =       72
    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0003         Pr(|T| > |t|) = 0.0005          Pr(T > t) = 0.9997
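
A quick check of why a regression can stand in for -ttest- here, assuming the
complete auto data are still in memory: the equal-variance two-sample t test
is equivalent to a linear regression of mpg on a single group indicator. The
coefficient on 0.foreign should match the difference reported above
(-4.945804), with the same standard error, t statistic, and 72 residual
degrees of freedom.

. * i0.foreign is an indicator for foreign == 0 (Domestic)
. regress mpg i0.foreign
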
Now we'll create some missing values, declare our -mi- settings, impute the
missing values, and analyze the data with -mi estimate: regress-. (The
imputation model is naive and is used only for illustration.)
. set seed 12345
.
. replace mpg = . in 1/10
(10 real changes made, 10 to missing)
.
. mi set wide
.
. mi register imputed mpg
.
. mi impute regress mpg foreign weight length price, add(20)
Univariate imputation                       Imputations =       20
Linear regression                                 added =       20
Imputed: m=1 through m=20                       updated =        0
               |              Observations per m
               |----------------------------------------------
      Variable |   complete   incomplete   imputed |     total
---------------+-----------------------------------+----------
           mpg |         64           10        10 |        74
--------------------------------------------------------------
(complete + incomplete = total; imputed is the minimum across m
 of the number of filled in observations.)
.
. mi estimate, dftable: regress mpg i0.foreign
Multiple-imputation estimates                     Imputations     =         20
Linear regression                                 Number of obs   =         74
                                                  Average RVI     =     0.0497
                                                  Complete DF     =         72
DF adjustment:   Small sample                     DF:     min     =      67.87
                                                          avg     =      68.97
                                                          max     =      70.08
Model F test:       Equal FMI                     F(   1,   67.9) =      11.06
Within VCE type:          OLS                     Prob > F        =     0.0014
------------------------------------------------------------------------------
             |                                                     % Increase
         mpg |      Coef.   Std. Err.      t    P>|t|         DF    Std. Err.
-------------+----------------------------------------------------------------
   0.foreign |  -4.811563   1.447082    -3.33   0.001       67.9         1.47
       _cons |   24.77273   1.195503    20.72   0.000       70.1         0.00
------------------------------------------------------------------------------
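
With one numerator degree of freedom, the model F statistic is the square of
the t statistic for 0.foreign, so the model F test is the multiple-imputation
counterpart of the two-sided t test. A rough check using the values displayed
in the output above (they are rounded, so small discrepancies are expected):

. * t statistic squared; should be close to the reported F of 11.06
. display (-4.811563/1.447082)^2
. * upper-tail F probability using the small-sample DF of 67.87;
. * should be close to the reported Prob > F of 0.0014
. display Ftail(1, 67.87, 11.06)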