Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: Re: st: RE: Selecting correlations with highest absolute value
From
Joe Canner <[email protected]>
To
"[email protected]" <[email protected]>
Subject
RE: Re: st: RE: Selecting correlations with highest absolute value
Date
Wed, 9 Oct 2013 23:40:20 +0000
Dara,
Red Owl beat me to the answer I was going to give. If you have a good reason to use -pwcorr- instead of -corr-, then you might need something more complicated in which you loop over all your variables, accumulating pairwise correlations.
foreach x of varlist tbmale...etc {
    foreach y of varlist tbmale...etc {
        corr `x' `y'
        matrix corrvector=corrvector \ vec(r(C))
    }
}
matvsort corrvector sortedvector
matrix list sortedvector
I don't have the ability to test this at the moment and I don't have matrix syntax memorized, so this might need some tweaking, particularly the matrix command inside the loops. I'm also not sure whether you will need to initialize -corrvector- before starting the loops. Let us know if you have any problems and I'm sure someone can help.
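A minimal sketch of the same accumulate-then-sort idea, written with -postfile- instead of matrices so that nothing needs initializing and the variable names stay attached to each correlation. It is only an illustration under stated assumptions: the shipped auto.dta and the five variables in `vars' are stand-ins for the real data, and the names var1, var2, rho and absrho are made up here. Note that -use- replaces the data in memory, so save your work first.

```stata
* Sketch only: auto.dta and this variable list are stand-ins for your own data
sysuse auto, clear
local vars price mpg weight length turn

* One row per pair of variables: the two names plus their correlation
tempname results
tempfile pairs
postfile `results' str32 var1 str32 var2 double rho using `pairs'

local k : word count `vars'
forvalues i = 1/`=`k'-1' {
    local x : word `i' of `vars'
    forvalues j = `=`i'+1'/`k' {
        local y : word `j' of `vars'
        * two variables at a time, so missing values are dropped pair by pair
        quietly corr `x' `y'
        post `results' ("`x'") ("`y'") (r(rho))
    }
}
postclose `results'

* Sort by absolute correlation, largest first, and list the top pairs
use `pairs', clear
generate double absrho = abs(rho)
gsort -absrho
list var1 var2 rho in 1/5, noobs
```

Looping over j > i only visits each pair once, so the sorted list contains neither the 1s on the diagonal nor each correlation twice.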
Joe
________________________________________
From: [email protected] [[email protected]] on behalf of Dara Shifrer [[email protected]]
Sent: Wednesday, October 09, 2013 7:04 PM
To: [email protected]
Subject: Fwd: Re: st: RE: Selecting correlations with highest absolute value
Joe, thank you very much for your quick response to my quest to find the
most highly correlated pairs of variables. I think I understand what
your code does (finds correlations, linearly transforms the correlation
matrix into a column vector, sorts this matrix, and then lists the
sorted columns of correlations) but I'm not sure why it isn't working
for me (see code below). I haven't used Stata's matrix commands before
and may be missing something obvious. Thanks for any additional help
anyone can provide! Dara
pwcorr tbmale tdedc3 tbrace td9tchr td9slry tb9yrsh tb9yrsnh td10tchr td10slry ///
tb10yrsh tb10yrsnh td11tchr td11slry tb11yrsh tb11yrsnh ///
tp10pswm ta10a2w skd10size skd10blck skd10hisp skd10pvty skd10lep ///
skd10biesl skd10gt skd10sped skd11size skd11blck skd11hisp skd11pvty skd11lep ///
skd11biesl skd11gt skd11sped skd12size skd12blck skd12hisp skd12pvty skd12lep ///
skd12biesl skd12gt skd12sped ta11elgb5 ta11ctgr ta11grd ta11chrt ta11sclvl ///
ta11a1rg ta11a2rg ta11a2lrg ta11a2mrg ta11a2m9rg ta11a2m10rg ta11a2m11rg ///
ta11a2rrg ta11a2r9rg ta11a2r10rg ta11a2r11rg ta11a2srg ta11a2s10rg ta11a2s11rg ///
ta11a2ssrg ta11a2ss10rg ta11a2ss11rg ///
ta11a3rg ta11a3arg ta11a3arrg ta11a3amrg ta11a3aparg ta11a3aperg ta11a3brg ta11a3crg ///
trt12rtn tp12pswm tka12tme tka12tmebl tka12tms tka12tmsbl tka12tre tka12trebl tka12trs ///
tka12trsbl tka12talg1 tka12talg1bl tka12tbio tka12tbiobl tka12te1r ///
tka12te1rbl tka12te1w tka12te1wbl tka12twgeo tka12twgeobl ///
tka12smegn tka12smsgn tka12sregn tka12srsgn tka12slegn tka12slsgn ///
tka12ssegn tka12shegn tka12shsgn
.... lots of correlations excluded...
             | tka~megn tka~msgn tka~regn tka~rsgn tka~legn tka~lsgn tka~segn
-------------+---------------------------------------------------------------
  tka12smegn |   1.0000
  tka12smsgn |   0.1390   1.0000
  tka12sregn |   0.6082   0.1509   1.0000
  tka12srsgn |   0.1211   0.5660   0.1929   1.0000
  tka12slegn |   0.5454  -0.0638   0.5637   0.1009   1.0000
  tka12slsgn |   0.2572   0.5671   0.2427   0.5295   0.2006   1.0000
  tka12ssegn |   0.4479  -0.1376   0.3819  -0.1273   0.4028  -0.1095   1.0000
  tka12shegn |   0.4143   0.0340   0.4330  -0.2011   0.4543  -0.2584   0.5530
  tka12shsgn |   0.5705   0.4077   0.3127   0.6170   0.2309   0.4094   0.2407

             | tka~hegn tka~hsgn
-------------+------------------
  tka12shegn |   1.0000
  tka12shsgn |   0.0918   1.0000
. matrix corrvector=vec(r(C))
. matvsort corrvector sortedvector
. matrix list sortedvector
sortedvector[4,1]
c1
tka12shsgn:tka12shsgn 1
tka12shsgn:tka12shsgn 1
tka12shsgn:tka12shsgn 1
tka12shsgn:tka12shsgn 1
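One way to read the output above: vec() stacks the columns of a matrix, so a k x k correlation matrix becomes a vector with k^2 rows. A sorted vector with only four rows is therefore consistent with the r(C) that was picked up being 2 x 2 rather than the full pairwise matrix. A small check of the dimensions, assuming the shipped auto.dta and three of its variables as stand-ins (the matrix names C and v are arbitrary):

```stata
* Sketch only: auto.dta and these three variables are stand-ins
sysuse auto, clear
quietly corr price mpg weight

* Copy r(C) right away; a later r-class command would overwrite it
matrix C = r(C)
display rowsof(C) " x " colsof(C)    // 3 variables give a 3 x 3 matrix

* vec() stacks the columns, so the vector has 3*3 = 9 rows
matrix v = vec(C)
display rowsof(v)
```

Running the same two -display- lines immediately after the -pwcorr- command shows which matrix is actually left behind in r(C).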
Postdoctoral Fellow, Houston Education Research Consortium
Kinder Institute for Urban Research
Rice University
[email protected]
On 10/8/2013 1:39 PM, Joe Canner wrote:
> Dara,
>
> Here's one quick-n-dirty possibility. (It requires installing -matvsort- from SSC.)
>
> . corr varlist
> . matrix corrvector=vec(r(C))
> . matvsort corrvector sortedvector
> . matrix list sortedvector
>
> Regards,
> Joe Canner
> Johns Hopkins University School of Medicine
>
>
> -----Original Message-----
> From:[email protected] [mailto:[email protected]] On Behalf Of Dara Shifrer
> Sent: Tuesday, October 08, 2013 3:16 PM
> To:[email protected]
> Subject: st: Selecting correlations with highest absolute value
>
>
> In SAS, I was able to quickly determine which pairs of variables were
> most highly correlated using the 'best' option with the 'proc corr'
> command ("BEST=n: prints n correlation coefficients for
> each variable. Correlations are ordered from highest to lowest in
> absolute value.") After extensive searching, I have not been able to
> locate a Stata command that does something similar.
>
> If this is not possible in Stata, maybe Stata experts have suggestions
> for my ultimate purpose: constructing equations to facilitate a smoother
> and faster running of Stata's 'ice' command.
>
> Any help would be greatly appreciated,
> Dara Shifrer
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/