Ricardo Ovaldia <[email protected]> wrote,
> I was approached by an investigator with the following problem. He had two
> groups of experimental rats, 10 diabetic and 10 non-diabetic. Each of these
> rats had one litter of 10 pups (average). On each of the pups a series of
> biochemicals were measured. He wants me to compare the mean value of these
> biochemicals from the pups from diabetic moms to the pups from non-diabetic
> moms. He then suggested that I do a simple t-test comparing the means of the
> two pup groups. I pointed out that the observations are not independent
> because several pups come from the same litter and that the litter effect
> needs to be taken into account.
>
> How can I set this up in Stata?
and my colleague Ken Higbee <[email protected]> showed how to do the problem
using -anova-.
What follows is really a footnote. I want to compare the results obtained
by Ken with those that would have been obtained using -regress, cluster-.
Ken generated a phony dataset and got an F statistic of 6.8. With -regress,
cluster-, I got 6.5.
Ken thoughtfully included in his posting how he generated the phony data,
which allowed me to try a different approach. I started with Ken's data:
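clear
* two treatment groups
set obs 2
gen group = _n
* ten mothers per group; mother ids 1-10 are reused in group 2
expand 10
sort group
qui gen mom = _n in 1/10
qui replace mom = mom[_n-10] in 11/20
* litter size between 8 and 12 pups per mother
set seed 32981
gen z = 10 + round(uniform()*4-2,1)
expand z
drop z
bysort group mom : gen pup = _n
* outcome: uniform noise on [0, 8) plus a shift equal to group (1 or 2)
gen y = uniform()*8 + group
compress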
which produces 190 observations on the variables
    group    treatment group, 1 or 2
    mom      mother id, 1, 2, ..., 10
    pup      pup id, 1, 2, ..., 12
    y        outcome variable, continuous, [1.02, 9.77]
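One detail of the generating code is worth checking before going on: the mother ids 1 to 10 are reused in both groups, so the data hold 20 litters but only 10 distinct values of mom. A minimal sketch of how one might inspect this, assuming the dataset generated above is still in memory; momid is a hypothetical variable name that does not appear in the original posting:

```stata
* Assumes the dataset generated above is in memory.
* momid is a hypothetical name, not part of the original posting.
* Total number of pups:
count
* Litter sizes by mother id and treatment group:
tabulate mom group
* One id per group-by-mother combination:
egen momid = group(group mom)
* momid should run from 1 to 20:
summarize momid
```

Any command that identifies litters by mom alone will see 10 units rather than 20, which is something to keep in mind when reading the results.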
To obtain the ordinary t-test for the difference in means between
two groups, but using -regress-, one types
. regress y group
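As a quick check that this regression really is the ordinary equal-variance t-test, one could run both and compare the t statistics. This is only a sketch, assuming the dataset generated above is in memory; t_ttest is a hypothetical scalar name introduced for the comparison:

```stata
* Assumes the dataset generated above is in memory.
* t_ttest is a hypothetical scalar name used only for this comparison.
ttest y, by(group)
scalar t_ttest = r(t)
regress y group
* The two t statistics should agree in absolute value.
display "t from -ttest-:   " t_ttest
display "t from -regress-: " _b[group]/_se[group]
```

The signs differ because -ttest- reports the mean of group 1 minus the mean of group 2, whereas the coefficient on group is the mean of group 2 minus the mean of group 1.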
To obtain the test while relaxing the assumption that the observations
are independent within mother, one types
. regress y group, cluster(mom)
So here's the output:
==============================================================================
Regression with robust standard errors                 Number of obs =     190
                                                       F(  1,     9) =    6.44
                                                       Prob > F      =  0.0319
                                                       R-squared     =  0.0295
Number of clusters (mom) = 10                          Root MSE      =  2.3819
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |   .8258395   .3255219     2.54   0.032     .0894579    1.562221
       _cons |   5.209672   .1988641    26.20   0.000     4.759811    5.659534
------------------------------------------------------------------------------
==============================================================================
The t statistic for group is 2.54, so the corresponding F is (2.54)^2 = 6.5,
which compares well with the 6.8 reported by Ken.
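The arithmetic can be reproduced directly. With one numerator degree of freedom, the F statistic is the square of the t statistic, and the two-sided p-value from the t distribution with 9 degrees of freedom equals the upper tail of F(1, 9). A small sketch using only the coefficient and standard error printed above:

```stata
* Values copied from the -regress, cluster(mom)- output above.
* Squared t statistic; should be close to the reported F(1, 9) = 6.44:
display (.8258395/.3255219)^2
* Two-sided p-value from t with 9 degrees of freedom:
display 2*ttail(9, .8258395/.3255219)
* Same p-value from the F(1, 9) distribution:
display Ftail(1, 9, (.8258395/.3255219)^2)
```

Squaring the unrounded ratio gives the 6.44 shown in the header, while squaring the rounded 2.54 gives about 6.45; both round to the 6.5 quoted above.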
-- Bill
[email protected]