David Airey <[email protected]> asks:
> How do you know before running a model, what the matsize needs to be?
> Is there an exact size you can determine beforehand?
>
> I'm going to work with a complex ANOVA and want to figure out the
> matsize needed, to see if Stata 8/SE can handle it, if I need to
> simplify the design, or if I need to purchase additional RAM. My
> computer can have 1000 mb max.
>
> ANOVA model:
>
> between subject factors:
> A: 20 levels
> B: 2 levels
>
> random subject factor nested in A and B:
> S: 400 animals total, 20 per A level, 10 per B level
>
> within subject factors (all crossed):
> C: 3 levels
> D: 4 levels
> E: 3 levels
> F: 2 levels
>
> In the meantime, I'm working out EMSs and a fabricated data set to see
> what happens on my machine.
For the benefit of others (I know David has seen it already) --
look at the example at
http://www.stata.com/support/faqs/stat/anova2.html#expand911
which shows a complicated repeated measures ANOVA. In that
particular case, I mentioned (without supporting justification)
that I needed to set matsize to 449 to run that particular
-anova-. Where did that number come from?

    1  the constant
+   2  A with 2 levels
+   4  G|A with a total of 4 levels (2*2)
+   2  B with 2 levels
+   4  B*A (2*2)
+   8  B*G|A (2*2*2)
+  16  S|B*G|A (2*2*2*2)
+   3  C with 3 levels
+   6  C*A (3*2)
+  12  C*G|A (3*2*2)
+   6  C*B (3*2)
+  12  C*B*A (3*2*2)
+  24  C*B*G|A (3*2*2*2)
+  48  C*S|B*G|A (3*2*2*2*2)
+   3  D with 3 levels
+   6  D*A (3*2)
+  12  D*G|A (3*2*2)
+   6  D*B (3*2)
+  12  D*B*A (3*2*2)
+  24  D*B*G|A (3*2*2*2)
+  48  D*S|B*G|A (3*2*2*2*2)
+   9  D*C (3*3)
+  18  D*C*A (3*3*2)
+  36  D*C*G|A (3*3*2*2)
+  18  D*C*B (3*3*2)
+  36  D*C*B*A (3*3*2*2)
+  72  D*C*B*G|A (3*3*2*2*2)
-----
= 448

At this moment, I don't remember whether I actually needed 449 (as I
claimed in the FAQ) or 448 (as computed above). I think 448
should be large enough.
You can follow the same exercise for your example and possibly
add 1 for good measure. Write down your model, then for each term
multiply together the numbers of levels of the factors in that
term, and add up the products over all terms (plus 1 for the constant).
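
The tally can also be scripted. What follows is only a sketch in Python
(not Stata code) that reproduces the count for the FAQ example. The
names levels, terms, term_columns, and matsize_needed are made up for
illustration, and the level counts for G and S are read off the
products listed for that example.

```python
from math import prod

# Levels per factor in the FAQ example (G and S inferred from the products).
levels = {"A": 2, "G": 2, "B": 2, "S": 2, "C": 3, "D": 3}

# Model terms, written as in the tally for the FAQ example.
terms = [
    "A", "G|A", "B", "B*A", "B*G|A", "S|B*G|A",
    "C", "C*A", "C*G|A", "C*B", "C*B*A", "C*B*G|A", "C*S|B*G|A",
    "D", "D*A", "D*G|A", "D*B", "D*B*A", "D*B*G|A", "D*S|B*G|A",
    "D*C", "D*C*A", "D*C*G|A", "D*C*B", "D*C*B*A", "D*C*B*G|A",
]

def term_columns(term, levels):
    """Product of the level counts of every factor named in the term.

    Nesting (|) and crossing (*) are treated alike for this count.
    """
    factors = term.replace("|", "*").split("*")
    return prod(levels[f] for f in factors)

def matsize_needed(terms, levels):
    """One column for the constant plus the columns of every term."""
    return 1 + sum(term_columns(t, levels) for t in terms)

print(matsize_needed(terms, levels))  # 448
```

To try another design, replace levels and terms with your own factors
and model terms.
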
When you compare the numbers from doing this to the degrees of
freedom for each of the terms, it becomes clear real quickly why
they call it the "overparameterized ANOVA model".
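
To see the gap for a few terms, here is a small Python sketch (again
not Stata code) comparing the column count of a term with its degrees
of freedom. It covers only fully crossed terms with every cell present,
where the degrees of freedom are the product of (levels - 1). Nested
terms follow a different rule and are left out. The function names are
made up for illustration.

```python
from math import prod

def columns(level_counts):
    """Columns a term occupies in the overparameterized design matrix."""
    return prod(level_counts)

def crossed_df(level_counts):
    """Degrees of freedom of a fully crossed term (all cells present)."""
    return prod(k - 1 for k in level_counts)

# Level counts taken from the FAQ example: A and B have 2 levels, C and D have 3.
examples = [("B*A", [2, 2]), ("D*C", [3, 3]), ("D*C*B*A", [3, 3, 2, 2])]

for name, counts in examples:
    print(name, "columns:", columns(counts), "df:", crossed_df(counts))
# B*A columns: 4 df: 1
# D*C columns: 9 df: 4
# D*C*B*A columns: 36 df: 4
```
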
With your particular case, check whether you can get a
S|A*B term (I am assuming A is crossed with B). You say A has 20
levels and B has 2 and that there are 400 animals total. Since
20*2 = 40, that gives 400/40 = 10 animals per A*B combination,
so S|A*B can be estimated separately from the A*B term. If you
instead had only one animal per A*B combination, you could not
separate S|A*B from A*B, and you would have to drop the A*B term
(and assume that the A*B interaction is insignificant).
I commend the idea of creating an example dataset and doing a dry
run of your analysis before collecting the data. This is helpful
in complicated designs to help point out limitations or problems
you might run into. In some cases it might send you back to
rethink how you want to design your experiment.
Ken Higbee [email protected]
StataCorp 1-800-STATAPC