> -----Original Message-----
> From: Roger Harbord [mailto:[email protected]]
> Sent: Wednesday, July 10, 2002 3:44 PM
> To: [email protected]
> Subject: st: correlate by group and collapse
>
> Dear Statalisters,
>
> I want to collapse my dataset by a group variable and retain the
> correlation coefficient of two variables. In other words, I'd like
> to be able to do something like:
> . collapse (correlation) var1 var2, by(group)
> or maybe:
> . by group: egen corr12=corr(var1 var2)
> . collapse corr12, by(group)
>
> However, collapse doesn't have correlation among its stats (it only
> allows a selection of univariate statistics) and egen doesn't have a
> corr function.
> I know I can do:
> . by group: correlate var1 var2
> - but I want to save the results and do further analysis on them
> rather than just displaying them.
>
> The best I've come up with is (supposing I have 100 groups):
> . gen corr12=.
> . for num 1/100, noheader: qui correlate var1 var2 if group==X \
> qui replace corr12=r(rho) if group==X
> . collapse corr12, by(group)
>
> This seems kind of clumsy though, and it took me a while to work out
> that I needed _noheader_ and _quietly_ to stop my screen filling with
> output. It also becomes quite lengthy if I want several pairwise
> correlations. Is there a better way?
>
> I think I'd like egen to have a _corr_ and/or a _cov_ function - I
> would have thought it would be of wider interest than the calculation
> of U.S. marginal income tax rates, which is already implemented as
> egen function mtr! I've checked the extensions to egen in the STB
> package _egenodd_ and tried a couple of _findit_'s, but I didn't find
> anything suitable.
I've attached below a program to do this with egen. Save the whole
thing as "_gcorr.ado" (that is, DO NOT separate out the GenCorr part
into a separate file).

The syntax is:

  [by varlist:] egen newvar = corr(var1 var2) [if exp] [in range] [, covariance]

With the ", covariance" option it generates covariances; otherwise it
generates correlations.
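
For example, here is a minimal sketch of the collapse workflow from the original question, assuming _gcorr.ado has been saved somewhere on the adopath. The simulated data, the seed, and the number of groups are made up purely for illustration; only the names var1, var2, group and corr12 come from the question.

```stata
* illustrative data only (not from the original post): 4 groups of 50
clear
set obs 200
set seed 12345
gen group = mod(_n, 4) + 1
gen var1 = invnorm(uniform())
gen var2 = 0.5*var1 + invnorm(uniform())

* requires _gcorr.ado to be installed on the adopath
sort group
by group: egen corr12 = corr(var1 var2)
by group: egen cov12  = corr(var1 var2), covariance

* corr12 and cov12 are constant within group, so the mean just keeps them
collapse (mean) corr12 cov12, by(group)
list
```

For several pairwise correlations, repeat the egen line once per pair before the collapse.
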
Nick Winter
**************** BEGINNING OF _gcorr.ado ****************
*! NJGW 10jul2002
*! syntax: [by varlist:] egen newvar = corr(var1 var2) [if exp] [in range] [, covariance]
*! computes correlation (or covariance) between var1 and var2, optionally by: varlist,
*! and stores the result in newvar.
program define _gcorr
    version 7
    * strip the type, the new variable name, and the "=" that egen passes first
    gettoken type 0 : 0
    gettoken g 0 : 0
    gettoken eqs 0 : 0
    syntax varlist(min=2 max=2) [if] [in] [, BY(string) Covariance ]
    * turn egen's by() option into a by prefix for GenCorr
    if `"`by'"'!="" {
        local by `"by `by':"'
    }
    quietly {
        gen `type' `g' = .
        `by' GenCorr `varlist' `if' `in', thevar(`g') `covariance'
    }
    capture label var `g' "Correlation `varlist'"
end
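program define GenCorr, byable(recall)
    syntax varlist [if] [in] , thevar(string) [ covariance ]
    * under by:, marksample restricts touse to the current by-group
    marksample touse
    * pick the saved result to copy: r(rho) or, with covariance, r(cov_12)
    if "`covariance'"=="" {
        local stat "r(rho)"
    }
    else {
        local stat "r(cov_12)"
    }
    cap corr `varlist' if `touse' , `covariance'
    * fill in the result only if correlate ran without error
    if !_rc {
        qui replace `thevar'=``stat'' if `touse'
    }
end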
**************** END OF _gcorr.ado ****************
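
As an aside, if only the collapsed group-level dataset is wanted, statsby can collect r(rho) directly without going through egen. This is a sketch that assumes a Stata release with the prefix form of statsby (Stata 9 or later, so it will not run under version 7 as written), and it again uses made-up data for illustration.

```stata
* illustrative data only (not from the original post): 4 groups of 50
clear
set obs 200
set seed 12345
gen group = mod(_n, 4) + 1
gen var1 = invnorm(uniform())
gen var2 = 0.5*var1 + invnorm(uniform())

* replaces the data with one observation per group,
* holding the correlation saved by correlate in r(rho)
statsby corr12=r(rho), by(group) clear: correlate var1 var2
list
```

Unlike the egen route, this discards the original observations, so it only suits the case where the collapse was the next step anyway.
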