Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Re: Correlation of repeated baseline measures in sampsi
From
"Seed, Paul" <[email protected]>
To
"[email protected]" <[email protected]>, "[email protected]" <[email protected]>
Date
Thu, 24 Mar 2011 11:16:30 +0000
Helen Connolly wrote:
I'm using the sampsi command to calculate sample size for a two-sample
study with 12 monthly baseline measurements and 12 monthly follow-up
measurements. I have data from a previous study and can calculate mean
and standard deviations for both samples.
My question is, how do I calculate correlation of baseline measurements
(r0), correlation of follow-up measurements (r1), and correlation
between baseline and follow-up measurements (r01)? These require a
single measure (not a covariance matrix). In all the references I have
seen, there are statements like "correlation of baseline measurements
calculated from previous study", but I see no reference as to how this
is done. Can someone please help?
*********************************************
This portion of -sampsi- is derived from Frison L & Pocock SJ (1992).
The original paper assumed that only limited information was available from the
sample dataset: means, SDs, and correlations. There are more choices
when you have the full data set to try out.
I wrote this part of the command in 1997 as -sampsi2-, and it was
incorporated into official Stata shortly afterwards.
The correlations between repeated measurements can be found
by reshaping the data to wide format and correlating
the repeated measures with one another.
First, get the full correlation matrix.
Then divide it into three sets of correlations. Set R00 (for r0) contains only correlations
between two baseline measures. Set R01 (for r01) contains those between one baseline
& one follow-up measure, and set R11 (for r1) contains those between follow-up measures only.
If a set has only one member, use that value directly. If a set is empty,
do not use the corresponding option. Otherwise, Frison & Pocock advise using the average
(simple arithmetic mean) of the correlations in a set, assuming they are not too far apart.
An alternative approach is to average all the baseline measures, and likewise all the follow-up measures,
in your example data set, and use only the one correlation between the baseline average & the follow-up average.
Your standard deviation for the outcome measure should then also be adjusted: use the SD of the averaged follow-up measure.
References:
As in the Stata manual entry for -sampsi-.
**************** Example *************************
* Create a suitable dataset
use http://www.stata-press.com/data/r11/nlswork, clear
xtdes
tab year
keep if year <= 73
keep if ln_wage < .
* Keep only individuals observed in all 6 years
bys idcode: keep if _N == 6
tab year
keep ln_wage idcode year
reshape wide ln_wage, i(idcode) j(year)
* Obtain the full matrix
corr ln*
* Assuming 2 baseline & 4 follow-up measures,
* R00 = {0.6677}, R01 = {0.6320 0.8301 0.5863 0.7513 0.5517 0.6687}
* (six of the 2 x 4 = 8 baseline-by-follow-up correlations are listed)
* and R11 = {0.7981 0.7143 0.8117 0.6504 0.7622 0.8779}
di (0.6320 + 0.8301 + 0.5863 + 0.7513 + 0.5517 + 0.6687)/6
di (0.7981 + 0.7143 + 0.8117 + 0.6504 + 0.7622 + 0.8779)/6
* So, r00 = 0.6677, r01 is about 0.670, and r11 is about 0.769
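* The same averages can also be computed from the stored correlation matrix.
* This is only a sketch: it assumes the first npre variables listed are the
* baseline measures, and it averages every pair in each set (for R01, all
* 2 x 4 = 8 pairs), so its r01 may differ a little from a hand calculation
* over fewer entries.
quietly corr ln_wage68 ln_wage69 ln_wage70 ln_wage71 ln_wage72 ln_wage73
matrix C = r(C)
local npre = 2
local k = rowsof(C)
foreach s in 00 01 11 {
    scalar s`s' = 0
    scalar n`s' = 0
}
forvalues i = 2/`k' {
    local jmax = `i' - 1
    forvalues j = 1/`jmax' {
        * set 00: both baseline; 01: one of each; 11: both follow-up
        if `i' <= `npre' local s 00
        else if `j' <= `npre' local s 01
        else local s 11
        scalar s`s' = s`s' + C[`i', `j']
        scalar n`s' = n`s' + 1
    }
}
di "r0 = " s00/n00 "   r01 = " s01/n01 "   r1 = " s11/n11
* These might then be passed to -sampsi-. The means (1.60, 1.70) and the
* SD (0.40) below are invented placeholders, not estimates from these data.
sampsi 1.60 1.70, sd1(0.40) pre(2) post(4) r0(`=s00/n00') r1(`=s11/n11') r01(`=s01/n01') method(ancova)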
* Or, you can work with the averages of the individual values,
* so that you have only one correlation to find.
egen ln_wage_bl = rowmean(ln_wage68 ln_wage69)
egen ln_wage_fup = rowmean(ln_wage70 ln_wage71 ln_wage72 ln_wage73)
corr ln_wage_bl ln_wage_fup
* The SD of the averaged follow-up measure is the one to use
su ln_wage_fup
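* A possible next step, as a sketch only: pass the single correlation and the
* SD of the averaged follow-up measure to -sampsi-, treating the averages as
* one baseline and one follow-up measure. The means (1.60, 1.70) are invented
* placeholders; substitute the values from your own study.
quietly corr ln_wage_bl ln_wage_fup
local rho = r(rho)
quietly su ln_wage_fup
local sd = r(sd)
sampsi 1.60 1.70, sd1(`sd') pre(1) post(1) r01(`rho') method(ancova)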
exit
********************************************************