David L. Van Brunt asked about setting up a statistical model of cage-crossing data
from a study of the age- and sex-dependent response to amphetamine in four strains of
mice. In a study conducted by his colleague, cage crossing is measured after treatment
with each dose of amphetamine administered at each of two periods of postnatal
development for each of ten mice of each sex from each of four strains.
The do-file below illustrates (using artificial data) the -anova- command line that I
believe will work in this study design, if I haven't stumbled over the error terms to use
for the lower-level interactions. The statistical model of the data is as for a randomized
complete blocks design, acknowledging that the only factor that can be randomized in
this study is the sequence of the amphetamine treatment. The model does not include
a factor for sequence effects of prior exposure to drug.
I'm not sure whether David's colleague has scientific interest in all of the factors'
interactions, but I recommend including them to flesh out the statistical model of the
data with all of the interaction terms implied by the study's design. Some authorities
recommend leaving all of the scientifically cogent interaction terms in the statistical
model and estimating it just once, rather than dropping terms that don't attain statistical
significance and testing the hypothesis of interest in a reduced model.
The default option for Stata's -anova- is to use SAS Type III sums of squares, just like
for SPSS, I believe, which his colleague is trying to use now. But if the scientific
hypothesis that David's colleague is putting to the test warrants using SAS Type I or II
sums of squares, he or she should be reassured that Stata also has an option for this
just as SPSS does.
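
As a minimal sketch of that option, assuming the artificial data generated by the do-file
below are already in memory and using only the between-mouse factors for brevity, the
-sequential- option of -anova- requests sequential (SAS Type I) sums of squares in place
of the default partial ones. The reduced model here is for illustration only and is not
the model recommended for the study.

---------------------begin sequential.do------------------------------------
* Illustration only: between-mouse factors, sequential (Type I) sums of squares
anova cro stn sex stn*sex, sequential
-----------------------end sequential.do------------------------------------
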
Cage-crossings might not have homoscedastic errors, especially between levels of the
amphetamine treatment factor. The data might not even be adequately normally
distributed for ANOVA, since they seem to be counts recorded during a behavior
observation session. Stata has alternatives to -anova- that might be more appropriate
in light of the qualities of cage-crossing data, e.g., -xtgls-, -xtpois-, -xtnbreg- and
-xtgee-. If David's colleague is interested in testing main effects or lower-order
interaction terms in the presence of a nonnegligible-but-scientifically-uninteresting
higher-order interaction, Stata can provide the analogue of SAS Type III SS in Wald tests
in any of these regression commands by using -desmat- or -xi2-.
Joseph Coveney
---------------------begin illustration.do----------------------------------
clear
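* 4 strains x 2 sexes x 10 mice x 2 ages x 2 doses = 320 observations, so that
* each mouse ID falls within a single strain and sex
set obs 320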
set seed 20030116
set more off
*
* stn is strain (four strains, categorical)
* mid is mouse ID (ten per strain per sex)
* age is pre- or postpuberty (categorical)
* dos is dose (low/high, or vehicle/amphetamine, categorical)
* cro is cage crossings (counts, in a boxcar distribution)
*
egen byte stn=fill(1 2 3 4 1 2 3 4)
sort stn
generate byte sex=mod(_n, 2)
sort stn sex
generate byte age=mod(_n, 2)
egen byte mid=fill(1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4)
sort mid age
generate byte dos=mod(_n, 2)
generate byte cro=round(uniform()*100, 1)
*
anova cro stn sex stn*sex / mid | stn*sex age stn*age /*
*/ sex*age stn*sex*age / mid | stn*sex*age dos stn*dos /*
*/ sex*dos stn*sex*dos / mid | stn*sex*dos age*dos stn*age*dos /*
*/ sex*age*dos stn*sex*age*dos, repeated(age dos)
pause on
rvfplot, xlabel ylabel yline(0)
pause
foreach var of varlist stn sex age dos {
    rvpplot `var', xlabel ylabel
    pause
}
hettest
foreach var of varlist stn sex age dos {
    display in smcl as result "`var'"
    hettest `var'
}
ovtest
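*
* Hedged sketch, not part of the model above: one count-data alternative to -anova-,
* a population-averaged Poisson model fitted by -xtgee- with mice as the panels.
* Main effects only, for brevity. The exchangeable working correlation and the
* robust standard errors are illustrative assumptions, not recommendations for
* the real data.
*
xi: xtgee cro i.stn i.sex i.age i.dos, i(mid) family(poisson) link(log) /*
*/ corr(exchangeable) robust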
exit
-----------------------end illustration.do----------------------------------