Maren Kandulla wrote (excerpted):
I do have one request regarding your remark:
> With 75 people and four groups, you will have unequal representations among
> groups. I think that MANOVA does best with equal representation among
> groups, just like factorial ANOVA does. At the least, you might need to
> adjust the one-quarter to some group-size-weighted fraction in order to get
> the -lincom- estimate to match that by -summarize-.
I have a very unbalanced design. To be more precise than in my previous
email, where I "combined information", I have the following n-distribution:
Cohort 1 with 4 groups: 21, 14, 35, 17; total 87, and Cohort 2 with 3 groups:
35, 21, 19; total 75. The cohorts are analysed separately.
In the Stata manual I found the following information:
> manova fits multivariate analysis-of-variance (MANOVA) and multivariate
> analysis-of-covariance (MANCOVA) models for balanced and unbalanced
> designs...
I therefore decided not to do any adjustment. Please, correct me if this was
wrong!
--------------------------------------------------------------------------------
The remark was based on an analogy to weighted and unweighted means analysis
for unbalanced factorial ANOVA. There is a controversy over the use of
so-called SAS Type I and Type III SS. Among the reasons for preferring the
former, it is said to have higher power than the latter does in the absence
of an interaction, although I recall that the difference was not major in
simulations with -anova , sequential- and -anova , partial- done some time
ago.
A re-reading this morning of the relevant passages in R. A. Johnson & D. W.
Wichern, _Applied Multivariate Statistical Analysis_ 4th Ed., (Englewood
Cliffs, New Jersey: Prentice-Hall, 1998) doesn't suggest that unequal group
size is a problem with MANOVA if you have only a single grouping factor.
If you're ever in doubt, you can always try -manovatest- with and without
adjustment to see whether it makes any practical difference in the test for
change across time. The simulation below is of a repeated-measures design
that replicates group sizes of your Cohort 1. I've given it a between-group
difference of one-half standard deviation in one group, and a linear trend
across time of one-half standard deviation for all groups, i.e., parallel
time-course (additive, no interaction). The covariance structure is
intended to mimic something that you might see with a rather long duration
between observations, probably more extreme than what would be encountered
in the typical short-term experimental study in the fields of biology,
psychology, or medicine.
There is a difference in power as expected. When you do not adjust the
elements of the test matrix for -manovatest- according to the respective
group sizes, the power is 66%. When you do the adjustment, the power rises
modestly to 73%. There is analogously a slight rise in power for main
effects of time in repeated-measures ANOVA (using the Huynh-Feldt epsilon).
The important difference, however, is between MANOVA and repeated-measures
ANOVA with the Huynh-Feldt degrees of freedom adjustment. The latter is
more powerful with your sample sizes, at least for main effects. I've
copied the results into a table below, because the simulation takes a while
to run, and the do-file might be wrapped during e-mail processing.
--------------------------------------
                          Percent
                          null hypothesis
Term                      rejection
--------------------------------------
Main effects of group
   MANOVA                    29
   ANOVA                     60
Main effects of time
   MANOVA (with adj.)        73
   MANOVA (no adj.)          66
   ANOVA (sequential)        87
   ANOVA (partial)           84
Group X Time
   MANOVA                     4.7
   ANOVA                      5.5
--------------------------------------
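To make the adjustment concrete, here is a minimal sketch of how the two sets of weights for the -manovatest- test matrix could be built for Cohort 1. The matrix names (sizes, w_adj, w_eq, T_adj, T_eq) are made up for illustration; the leading 1 for the constant follows the layout of the test matrices in the do-file.

```stata
* Group sizes for Cohort 1, as given in the post
matrix sizes = (21, 14, 35, 17)

* Size-weighted fractions (the "adjusted" test) and equal fractions
matrix w_adj = sizes / 87
matrix w_eq  = J(1, 4, 0.25)

* Prepend the 1 for the constant, as the do-file's test matrices do
matrix T_adj = J(1, 1, 1), w_adj
matrix T_eq  = J(1, 1, 1), w_eq

matrix list T_adj
matrix list T_eq
```

The adjusted weights sum to one, just as the four quarters do; they only shift the emphasis toward the larger groups.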
You can re-run this do-file substituting zeroes for the mean vector
for -drawnorm- to see how well the Type I error rate is controlled by the
Huynh-Feldt adjustment for the main effects of time with this covariance
structure. The null hypothesis was true for the interaction of group and
time in the do-file below, so the values in the table above give the Type I
error rates for that term.
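As a rough sketch of that substitution, the data-generating step under the null for time might look like the following. The seed and the matrix names A and Mu are arbitrary choices for illustration; the correlation between occasions i and j is 0.5^|i - j|, which is what the do-file builds. This covers only the draw itself, not the group shift or the tests.

```stata
clear
set seed 12345

* Correlation between occasions i and j is 0.5^|i - j|, as in the do-file
matrix A = J(6, 6, 1)
forvalues i = 1/6 {
    forvalues j = 1/6 {
        matrix A[`i', `j'] = 0.5^abs(`i' - `j')
    }
}

* Zero mean vector: no trend across time, so the null for time is true
matrix Mu = J(1, 6, 0)
drawnorm response1-response6, means(Mu) corr(A) n(87) clear
summarize response1-response6
```

With this draw in place of the original one, the rejection rates for the main effects of time estimate the Type I error rate rather than the power.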
Joseph Coveney
clear
set more off
set seed `=date("2005-11-18", "ymd")'
set obs 6
forvalues i = 1/6 {
    generate float a`i' = (`i' == _n) * 1 + ///
        (`i' != _n) * 0.5^abs(`i' - _n)
    local responses `responses' response`i'
}
mkmat a*, matrix(A)
*
capture program drop runem
program define runem, rclass
    syntax , responses(namelist) corr(name)
    tempname S M sequential partial
    * Draw six correlated responses with a linear trend across time
    drawnorm `responses', means(0.0 0.1 0.2 0.3 0.4 0.5) ///
        corr(`corr') n(87) clear
    * Assign treatment groups 0-3 with sizes 21, 14, 35 and 17
    generate byte treatment = 0
    foreach size in 21 14 35 {
        local group = `group' + `size'
        replace treatment = treatment + 1 if _n > `group'
    }
    macro drop _group
    * Shift one group by one-half standard deviation at every time point
    forvalues i = 1/6 {
        replace response`i' = response`i' + 0.5 ///
            if treatment == 1
    }
    generate byte row = _n
    * MANOVA: main effects of group (p-value is in column 5)
    manova `responses' = treatment
    matrix `S' = e(stat_m)
    return scalar mmain = `S'[1,5]
    * Successive differences between adjacent time points
    matrix `M' = (1, -1, 0, 0, 0, 0 \ ///
        0, 1, -1, 0, 0, 0 \ ///
        0, 0, 1, -1, 0, 0 \ ///
        0, 0, 0, 1, -1, 0 \ ///
        0, 0, 0, 0, 1, -1)
    * Test matrices: group-size-weighted versus equal weights
    matrix `sequential' = (1, `=21/87', `=14/87', `=35/87', `=17/87')
    matrix `partial' = (1, 0.25, 0.25, 0.25, 0.25)
    * MANOVA: main effects of time, with and without the adjustment
    manovatest , test(`sequential') ytransform(`M')
    matrix `S' = r(stat)
    return scalar mstime = `S'[1,5]
    manovatest , test(`partial') ytransform(`M')
    matrix `S' = r(stat)
    return scalar mptime = `S'[1,5]
    * MANOVA: group-by-time interaction
    manovatest treatment, ytransform(`M')
    matrix `S' = r(stat)
    return scalar minter = `S'[1,5]
    * Repeated-measures ANOVA with Huynh-Feldt adjusted degrees of freedom
    quietly reshape long response, i(row) j(time)
    anova response treatment / row | treatment time treatment*time, ///
        category(treatment time row) repeated(time) sequential
    return scalar amain = Ftail(e(df_1), e(dfdenom_1), e(F_1))
    return scalar astime = Ftail(`=e(hf1) * e(df_3)', ///
        `=e(hf1) * e(df_r)', e(F_3))
    return scalar ainter = Ftail(`=e(hf1) * e(df_4)', ///
        `=e(hf1) * e(df_r)', e(F_4))
    anova response treatment / row | treatment time treatment*time, ///
        category(treatment time row) repeated(time)
    return scalar aptime = Ftail(`=e(hf1) * e(df_3)', ///
        `=e(hf1) * e(df_r)', e(F_3))
end
*
simulate mmain = r(mmain) mstime = r(mstime) mptime = r(mptime) ///
minter = r(minter) amain = r(amain) astime = r(astime) ///
aptime = r(aptime) ainter = r(ainter), ///
reps(3000) nodots: runem , responses(`responses') corr(A)
foreach var of varlist _all {
    generate byte p_`var' = `var' < 0.05
}
summarize p_mmain p_amain
summarize p_mstime p_mptime p_astime p_aptime
summarize p_minter p_ainter
exit