John Bunge <[email protected]> writes,
> I think I was misunderstood by Bill and want to make my problem more
> explicit by stylizing the dataset I have:
>
>
> list
> [...]
> [...]
> The correlation coefficients (cc's) for the decisions I want to compute are:
>
> between country 1 and 2, between country 1 and 3, ..., between country 1 and
> 200,... between country 199 and 200, respectively. The total number of cc's
> will be (200*199)/2 = 19,900.
>
> Now note that I need these coefficients for every single year, not over all
> decisions during the whole time period 1980 - 1999. So in the end, I will
> have the coefficients for the country-pair 1-2 (and for all other country
> pairs, too) for 1980, for 1981, ..., and for 1999. That is, in the end I
> will have 19,900*20 = 398,000 coefficients.
Okay. I omitted from the quote above the data that John supplied, but here
they are:
. list

     +-------------------------+
     | cid   deid   year   dec |
     |-------------------------|
  1. |   1      1   1980    -1 |
  2. |   1      2   1980     0 |
  3. |   1      3   1980     1 |
  4. |   1      4   1980     1 |
  5. |   1   4000   1999    -1 |
     |-------------------------|
  6. |   2      1   1980     0 |
  7. |   2      2   1980    -1 |
  8. |   2      3   1980     0 |
  9. |   2      4   1980     1 |
 10. |   2   4000   1999    -1 |
     |-------------------------|
 11. | 200      1   1980    -1 |
 12. | 200      2   1980     0 |
 13. | 200      3   1980     0 |
 14. | 200      4   1980    -1 |
 15. | 200   4000   1999     1 |
     +-------------------------+
John made clear that he wants correlations of dec between countries, not
between years, so this time I'm going to make the dataset wide across countries:
. reshape wide dec, i(deid year) j(cid)
(note: j = 1 2 200)

Data                               long   ->   wide
---------------------------------------------------------------------
Number of obs.                       15   ->       5
Number of variables                   4   ->       5
j variable (3 values)               cid   ->   (dropped)
xij variables:
                                    dec   ->   dec1 dec2 dec200
---------------------------------------------------------------------

. list

     +------------------------------------+
     | deid   year   dec1   dec2   dec200 |
     |------------------------------------|
  1. |    1   1980     -1      0       -1 |
  2. |    2   1980      0     -1        0 |
  3. |    3   1980      1      0        0 |
  4. |    4   1980      1      1       -1 |
  5. | 4000   1999     -1     -1        1 |
     +------------------------------------+
Now I can obtain the correlations, say for 1980, by typing
. correlate dec1-dec200 if year==1980
(obs=4)

             |     dec1     dec2   dec200
-------------+---------------------------
        dec1 |   1.0000
        dec2 |   0.4264   1.0000
      dec200 |   0.3015  -0.7071   1.0000
Obviously, if I had all the data, I'd have gotten a much larger correlation
matrix.
John's about to calculate a lot of correlations. As he said, for each year he
will have (200*199)/2 = 19,900 correlations, and for 20 years, he will
have a total of 398,000.
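Those counts are just the number of unordered pairs that can be formed from 200
countries, times 20 years. A quick way to check the arithmetic in Stata, using
the built-in comb() function (the first line should display 19900 and the
second 398000):

```stata
* number of distinct country pairs among 200 countries
display comb(200, 2)

* pairs per year times the 20 years 1980-1999
display 20*comb(200, 2)
```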
Perhaps seeing them printed is good enough, but I'm guessing John is next
going to ask, "How do I get them in a dataset?" So let's set about creating a
dataset that looks like
    year    dec_i    dec_j       rho
    ---------------------------------
    1980        1        2     .4264
    1980        1        3       ...
    1980        1        .         .
    1980        1        .         .
    1980        1      200     .3015
    1980        2        3       ...
    1980        2        .         .
    1980        2        .         .
    1980        2      200    -.7071
       .        .        .         .
       .        .        .         .
    ---------------------------------
Here's how:
program rhos
        version 10
        postfile results year dec_i dec_j rho using rhos.dta, replace
        forvalues year = 1980(1)1999 {
                forvalues i=1(1)200 {
                        local j0 = `i' + 1
                        forvalues j=`j0'(1)200 {
                                quietly corr dec`i' dec`j' if year==`year'
                                post results (`year') (`i') (`j') (r(rho))
                        }
                }
        }
        postclose results
        display as txt "done -- data in rhos.dta"
end
I haven't tested this program, but it seems to me it ought to work.
I do not expect the program to be fast -- we are going to run 398,000
separate -correlate- commands -- but it shouldn't take too long.
Run this program on the wide data. Results will be put in rhos.dta.
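The program as written assumes variables dec1 through dec200 all exist, so it
will not run on the three-country listing shown earlier. To try the same
postfile/post/postclose logic on that toy data, here is a minimal sketch. The
macro names ids, n, a, b, and h and the output file rhos_toy.dta are made up
for illustration, and only 1980 is used because the toy data contain a single
1999 observation.

```stata
clear
* toy wide data copied from the listing above (countries 1, 2, and 200 only)
input deid year dec1 dec2 dec200
   1 1980 -1  0 -1
   2 1980  0 -1  0
   3 1980  1  0  0
   4 1980  1  1 -1
4000 1999 -1 -1  1
end

* country ids present in the toy data (hypothetical macro names)
local ids 1 2 200
local n : word count `ids'

* temporary handle name for the postfile
tempname h
postfile `h' year dec_i dec_j rho using rhos_toy, replace
forvalues a = 1/`n' {
        local i : word `a' of `ids'
        local b0 = `a' + 1
        * when `a' is the last index, this inner loop runs zero times
        forvalues b = `b0'/`n' {
                local j : word `b' of `ids'
                quietly correlate dec`i' dec`j' if year==1980
                post `h' (1980) (`i') (`j') (r(rho))
        }
}
postclose `h'

use rhos_toy, clear
list
```

The resulting dataset should hold one observation per country pair (1-2,
1-200, 2-200), with rho matching the lower triangle of the correlation matrix
printed earlier.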
I suggest John package the whole thing as a do-file. I know there will
be mistakes -- mine or John's -- and it will be a lot easier to fix the
do-file than to keep starting over again interactively:
----------------------------------------- doit.do ---
version 10
clear all
use johnsdata, clear
reshape wide dec, i(deid year) j(cid)

program rhos
        version 10
        postfile results year dec_i dec_j rho using rhos.dta, replace
        forvalues year = 1980(1)1999 {
                forvalues i=1(1)200 {
                        local j0 = `i' + 1
                        forvalues j=`j0'(1)200 {
                                quietly corr dec`i' dec`j' if year==`year'
                                post results (`year') (`i') (`j') (r(rho))
                        }
                }
        }
        postclose results
        display as txt "done -- data in rhos.dta"
end

rhos
use rhos, clear
list in 1/5
----------------------------------------- doit.do ---
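If 398,000 separate -correlate- calls turn out to be too slow, one possible
alternative, which is not part of the do-file above, is to compute a whole
year's correlation matrix in one pass with -matrix accum- and the corr()
matrix function. This is only a sketch on the toy data. It assumes no dec
variable is missing within the year: -matrix accum- drops an observation if
any listed variable is missing, whereas the two-variable -correlate- calls use
every observation available for that pair. With 200 variables, matsize must
also be large enough.

```stata
clear
* toy wide data copied from the listing above
input deid year dec1 dec2 dec200
   1 1980 -1  0 -1
   2 1980  0 -1  0
   3 1980  1  0  0
   4 1980  1  1 -1
4000 1999 -1 -1  1
end

* sums of squares and cross-products, in deviation-from-mean form
matrix accum R = dec1 dec2 dec200 if year==1980, noconstant deviations

* rescale the cross-product matrix to a correlation matrix
matrix R = corr(R)
matrix list R
```

The elements below the diagonal of R are the same pairwise coefficients, so
they could be posted to a results file by looping over rows and columns of the
matrix instead of rerunning -correlate- for each pair.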
-- Bill
[email protected]