Stata The Stata listserver
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

st: RE: variance components from random-effects ANOVA?


From   "Yulia Marchenko" <[email protected]>
To   <[email protected]>
Subject   st: RE: variance components from random-effects ANOVA?
Date   Mon, 26 Jul 2004 17:44:51 -0500

On Saturday, Alex wrote:

>I'm looking for a way to get estimates of variance components out of a
random effects ANOVA. While I
>understand that the -loneway- command includes variance component estimates
in its output for one-way random-
>effects ANOVA, I'd like to know whether these estimates can also be
obtained for two-way and higher-order ANOVA
>with random effects? I have looked all over the place, the manual, net
resources incl. the statalist archives,
>but to no avail. Is it the case that I have to compute the variance
component estimates manually from the Mean
>Squares?

Here are a couple of examples of how to calculate variance components in
using different types of designs. Unfortunately, we do have to compute the
variance components ourselves.

Before proceeding with the examples, note that in the 18may2004 update all
sum of squares and corresponding to them degrees of freedom are in the saved
results after -anova-. Therefore, there is no need to run -test-
after -anova- to get corresponding mean squared errors as we had to do
previously.

The examples come from the following two books:
http://www.stata.com/bookstore/doe.html
http://www.stata.com/bookstore/sped.html


1)	Random effect with one-way factorial design:
You can use STATA command -loneway- to get variance components estimates
(example in [R] loneway p.338) for both balanced and unbalanced cases.

2)	Random nested factors.
Data is from Kuehl, "Design of Experiments", 2nd edition, p.243, example 7.3

/**********random nested factors******/
clear
input day run glu
1 1 42.5
1 1 43.3
1 1 42.9
1 2 42.2
1 2 41.4
1 2 41.8
2 3 48.0
2 3 44.6
2 3 43.7
2 4 42.0
2 4 42.8
2 4 42.8
3 5 41.7
3 5 43.4
3 5 42.5
3 6 40.6
3 6 41.8
3 6 41.8
end

anova glu day / run|day/
di "variance components:"
quie{
local r 3
local b_run 2
scalar reps=  e(rss)/ e(df_r)
scalar runs=  (e(ss_2)/e(df_2)- e(rss)/e(df_r))/`r'
scalar days=  (e(ss_1)/e(df_1)- e(ss_2)/e(df_2))/(`r'*`b_run')
}
di "sigma^2 reps = " reps
di "sigma^2 run = "  runs
di "sigma^2 day = "  days
scalar total = reps + runs + days
di "sigma^2 total = "  total

/*****************************using matrices******************************/
quie{
mat b = (e(ss_1)/e(df_1) , e(ss_2)/e(df_2), e(rss)/e(df_r))
mat C = (1,`r',`r'*`b_run' \1,`r',0\1,0,0)
mat v = inv(C)*b'
mat v = v \ trace(diag(v))
matrix colnames v = sigma^2
matrix rownames v = rep/run run/day day total
}
mat list v
exit

Estimates of variance components are obtained by equating the expected
values of the mean squares to their expectations with replacement of
parameters by their estimators. This results in system of linear equations.
We can solve this system of equations rather than relying on formulas. That
is what I have done above in the part named "using matrices".

3)	Two-way random effects model (balanced case).
Data is from Kuehl, "Design of Experiments", 2nd edition, p.232, example 7.1

/***********two-way random balanced*********************/
clear
input day machine trigly
1	1	142.3
1	1	144
1	2	148.6
1	2	146.9
1	3	142.9
1	3	147.4
1	4	133.8
1	4	133.2
2	1	134.9
2	1	146.3
2	2	145.2
2	2	146.3
2	3	125.9
2	3	127.6
2	4	108.9
2	4	107.5
3	1	148.6
3	1	156.5
3	2	148.6
3	2	153.1
3	3	135.5
3	3	138.9
3	4	132.1
3	4	149.7
4	1	152
4	1	151.4
4	2	149.7
4	2	152
4	3	142.9
4	3	142.3
4	4	141.7
4	4	141.2
end
anova trigly day machine day*machine

di "variance components:"
quie{
local a_day 4
local b_mach 4
local r 2
local reps e(rss)/e(df_r)
local dayXmach (e(ss_3)/e(df_3) - `reps')/`r'
local mach  (e(ss_2)/e(df_2) - e(ss_3)/e(df_3))/(`r'*`a_day')
local day  (e(ss_1)/e(df_1) - e(ss_3)/e(df_3))/(`r'*`b_mach')
local total = `day'+`mach'+`dayXmach' +`reps'
}
di "sigma^2 reps = " `reps'
di "sigma^2 dayXmachine = "  `dayXmach'
di "sigma^2 machine = "  `mach'
di "sigma^2 day = "  `day'
di "sigma^2 total = "  `total'

/*****************************using matrices******************************/
quie{
mat b = (e(ss_1)/e(df_1) , e(ss_2)/e(df_2), e(ss_3)/e(df_3), e(rss)/e(df_r))
mat C = (1,`r',0,`r'*`b_mach'\1,`r',`r'*`a_day',0 \1,`r',0,0\1,0,0,0)
mat v = inv(C)*b
/****calculates total variation and adds to the last row of matrix v****/'
mat v = v \ trace(diag(v))
matrix colnames v = sigma^2
matrix rownames v = error dayXmachine machine day total
}
mat list v
exit


4)	Two-way random model without interaction (unbalanced case).
Formulas are obtained from Winer and others, "Statistical principles in
experimental design" 3rd edition, p.414; data used is from Kuehl, "Design of
Experiments", 2nd edition, p.232, example 7.1 with added observations to
make it unbalanced:

/************two-way random additive unbalanced*****************/
clear
input day machine trigly
1	1	142.3
1	1	144
1	1	141.3
1	1	142
1	2	148.6
1	2	146.9
1	3	142.9
1	3	147.4
1	4	133.8
1	4	133.2
2	1	134.9
2	1	146.3
2	2	145.2
2	2	146.3
2	3	125.9
2	3	127.6
2	4	108.9
2	4	107.5
3	1	148.6
3	1	156.5
3	2	148.6
3	2	153.1
3	3	135.5
3	3	138.9
3	4	132.1
3	4	149.7
4	1	152
4	1	151.4
4	2	149.7
4	2	152
4	3	142.9
4	3	142.3
4	3	132.3
4	4	141.7
4	4	141.2
end

capture program drop mysum
quietly program mysum, rclass
gen var_sum2=0
gen var_dblsum=0
bysort `1': replace var_sum2=_N^2 if _n==_N
quietly summ var_sum2
return scalar n2 = r(mean)*r(N)
bysort `1': replace var_sum2=_N
bysort `1' `2': replace var_dblsum=_N^2/var_sum2[_N] if _n==_N
quietly summ var_dblsum
drop var_*
return scalar n = r(mean)*r(N)
end

anova trigly day machine

quie {
if e(sample){
mysum day machine
local n2day_dot = r(n2)
local dblsum_day = r(n)

mysum machine day
local n2mach_dot = r(n2)
local dblsum_mach = r(n)

quie summ trigly
local N = r(N)

/****coefficients****/

local Caa = (`N'-`n2day_dot'/`N')/e(df_1)
local Cbb = (`N'-`n2mach_dot'/`N')/e(df_2)

local Cab = (`dblsum_day'-`n2mach_dot'/`N')/e(df_1)
local Cba = (`dblsum_mach'-`n2day_dot'/`N')/e(df_2)

local Ceb = (-`dblsum_day'+`n2mach_dot'/`N')/e(df_r)
local Cea = (-`dblsum_mach'+`n2day_dot'/`N')/e(df_r)

}


mat b = (e(ss_1)/e(df_1) , e(ss_2)/e(df_2), e(rss)/e(df_r))
mat C = (1,`Cab',`Caa'\1,`Cbb',`Cba'\1,`Cea',`Ceb')
mat v = inv(C)*b'
mat v = v \ trace(diag(v))
matrix colnames v = sigma^2
matrix rownames v = error machine day total
}

mat list v

exit


--Yulia
[email protected]



*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2025 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index