Hello All ---
I am using Intercooled Stata 8.2.
I have a large dataset of 2,388 unique subjects, where each
subject has a measurement (or a missing value) at 8 distinct
time points and each subject is assigned to one of four drug
groups. My objective is to determine whether my outcome varies
with respect to drug type. Per the -anova- syntax requirement, I
reshaped the data to long form, which yields 19,104
observations (2,388 multiplied by 8); of these 19,104
records, 11,719 have a nonmissing value for my outcome
variable (in short: an unbalanced design). My question concerns
how to run the repeated-measures ANOVA and how to
determine the proper error term. If I use the following
command, Stata returns an error (also pasted):
-anova pdc pt cohort1 qtr, repeated(qtr)-
too many variables or values
where 'pt' is the subject identifier, 'cohort1' is the drug type
(one of 4), and 'qtr' is the time point (quarter of year: 1-8).
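
For context, a minimal sketch of the reshape and the record counts described above. The wide-form variable names pdc1-pdc8 are an assumption for illustration only (they are not given in this post); the other names are the ones used in my commands.

```stata
* Sketch only: pdc1-pdc8 are assumed names for the wide-form outcome
* variables; substitute the real ones.
reshape long pdc, i(pt) j(qtr)

* Total records, then records with a nonmissing outcome
count
count if !missing(pdc)

* Nonmissing records per quarter and drug group, to see the imbalance
tabulate qtr cohort1 if !missing(pdc)
```

On my data the two counts should correspond to the 19,104 and 11,719 quoted above.
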
If I drop 'pt' from the model and instead specify -bseunit- with
'pt' as the argument, the anova runs without error (I think):
-anova pdc cohort1 qtr, repeated(qtr) bseunit(pt)-
The ANOVA output suggests that the residual is a function of
'pt' (the residual has 11,708 df); herein lies my question: is
this logical? And if so, why did the first command not work,
even after I set matsize to its maximum? Note that the df for
'cohort1' and 'qtr' are as expected (3 and 7, respectively).
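
As a rough check on these numbers, here is the arithmetic using only the counts already quoted (nothing else about my data is assumed). The first line gives 19104. The second gives 11708, which is the residual df I would expect if only the constant, 'cohort1', and 'qtr' were using up degrees of freedom. That would suggest 'pt' is not entering as a model term in the second command, though I may be misreading the output.

```stata
* Total records after reshaping: 2,388 subjects times 8 quarters
display 2388*8

* Residual df if only the constant (1), cohort1 (3) and qtr (7) use up df
display 11719 - 1 - 3 - 7
```
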
Parenthetically, the above ANOVA indicates that my outcome does
indeed differ across drug type. I am relatively new to repeated
measures analysis so I may be overlooking an important
component/assumption of this technique and if so, can you please
elaborate or point me in the right direction?
Any help or suggestions are greatly appreciated!!
Many thanks, Clint Thompson