Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: Creating square matrix with obs counts from 2 non unique id variables
From
Nick Cox <[email protected]>
To
[email protected]
Subject
Re: st: Creating square matrix with obs counts from 2 non unique id variables
Date
Tue, 19 Apr 2011 17:44:42 +0100
This is ugly but it works.
The idea is that you clone the data and then flip the variables round
in the clone. That ensures a square matrix. You can take out the fake
entries by rounding down as each was given a very small weight.
Someone should be able to improve on this.
clear
input obs ID1 ID2
1 10 12
2 10 11
3 11 12
4 11 10
end
local N = _N
local Np1 = _N + 1
expand 2
replace ID1 = ID2[_n - `N'] in `Np1'/L
replace ID2 = ID1[_n - `N'] in `Np1'/L
gen wt = cond(_n <= `N', 1, 1e-9)
(1)
collapse (sum) wt, by(ID1 ID2)
replace wt = round(wt)
OR
(2)
tab ID1 ID2 [iw=wt], matcell(foo)
levelsof ID1, local(names)
mata : st_matrix("foo", round(st_matrix("foo")))
matrix rownames foo = `names'
matrix colnames foo = `names'
On Tue, Apr 19, 2011 at 4:47 PM, Diogo L Pinheiro
<[email protected]> wrote:
> I've been trying to create a square matrix with counts of observations from
> two non unique ID variables. In this case, one id variable has the code for
> an organization an individual was and one for where the individual is. For
> example:
>
>
> Observation ID1 ID2
> 1. 10 12
> 2. 10 11
> 3. 11 12
> 4. 11 10
>
> In the example above, observation 1 was at organization #10 at one time and
> organization #12 at another. Observation 2 was at 10 and then 11. And so on.
>
>
> From that, I want to create a square matrix that would look something like:
>
>
> 10 11 12
>
> 10 0 1 1
> 11 1 0 1
> 12 0 0 0
>
> So that I could have all the counts of each combination of organizations.
>
> The main problem is getting a square matrix in the end. I can create a non
> square matrix in this case by using
> tab ID2, gen(newID)
> collapse (sum) newID, by(ID1)
>
> but the problem is that this won't be a square matrix, as not every ID
> number shows up in both variables.. In the example above it would be missing
> a line (since not every ID number present in ID1 is present in ID2, and vice
> versa).
>
> So how could I get a square matrix from the situation above? I've found
> commands to create similar matrices, but they require unique IDs and only
> provide binary results for the existing combinations.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/