Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: converting table into matrix
From: R Zhang <[email protected]>
To: [email protected]
Subject: Re: st: converting table into matrix
Date: Sat, 29 Mar 2014 16:23:58 -0400
Hi Nick and other Statalisters,
After creating the matrix, I will compute its eigenvectors.
symeigen computes eigenvectors for a symmetric matrix, which means I
need to fill in some values of my matrix to make it symmetric.
My original matrix (a 3*3 sample; the real data is 70,000*70,000)
** non-symmetric **
A[3,3]
              Forestrysu~t  Forestrynu~y       logging
Forestrysu~t             0             0             0
Forestrynu~y            64             1             1
     logging             7            29            41
If I make it symmetric, it should look like
              Forestrysu~t  Forestrynu~y       logging
Forestrysu~t             0            64             7
Forestrynu~y            64             1             1
     logging             7            29            41
My question is: how should I edit my original Stata dataset in order
to create a symmetric matrix?
*** data ***
clear all
input str20 C_industry str20 S_industry int x
Forestrysupport Forestrysupport 0
Forestrysupport Forestrynursery 0
Forestrysupport logging 0
Forestrynursery Forestrysupport 64
Forestrynursery Forestrynursery 1
Forestrynursery logging 1
logging Forestrysupport 7
logging Forestrynursery 29
logging logging 41
end
*** Nick's code: it works, but I need help with high-dimensional data
*** (70,000*70,000)
qui tab C_industry
local nvals = r(r)
egen i = seq(), block(`nvals')
egen j = seq(), to(`nvals')
matrix A=J(`nvals',`nvals',.)
forval n = 1/`=_N' {
    matrix A[`=i[`n']', `=j[`n']'] = x[`n']
    if C_industry[`n'] != C_industry[`=`n'-1'] {
        local rownames `rownames' `=C_industry[`n']'
    }
}
matrix rownames A = `rownames'
matrix colnames A = `rownames'
matrix list A
*** A is nonsymmetric ***
Thanks!
Rochelle
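
One possible way to get a symmetric matrix, offered only as a minimal sketch and not as anything confirmed in this thread: build A as in the code above, then copy each value below the diagonal into its mirror cell above the diagonal. Keeping the lower triangle (rather than the upper triangle, or an average of the two) is an assumption, and the names S, X and v are illustrative. Note that a symmetric result must have the same value in row 2, column 3 and in row 3, column 2 (29 here), whereas the hand-written example above keeps 1 in row 2, column 3. The sketch does not address the size of the real data.

```stata
* Sketch only: rebuild A from the example data, then mirror the lower
* triangle into the upper triangle. Which triangle to keep is an assumption.
clear all
input str20 C_industry str20 S_industry int x
Forestrysupport Forestrysupport 0
Forestrysupport Forestrynursery 0
Forestrysupport logging 0
Forestrynursery Forestrysupport 64
Forestrynursery Forestrynursery 1
Forestrynursery logging 1
logging Forestrysupport 7
logging Forestrynursery 29
logging logging 41
end

* Row and column indexes, as in the code earlier in this message
quietly tabulate C_industry
local nvals = r(r)
egen i = seq(), block(`nvals')
egen j = seq(), to(`nvals')

* Fill the (non-symmetric) matrix A cell by cell
matrix A = J(`nvals', `nvals', .)
forvalues n = 1/`=_N' {
    matrix A[`=i[`n']', `=j[`n']'] = x[`n']
}

* S starts as a copy of A; each cell above the diagonal then takes the
* value of its mirror image below the diagonal
matrix S = A
forvalues r = 1/`nvals' {
    local start = `r' + 1
    forvalues c = `start'/`nvals' {
        matrix S[`r', `c'] = A[`c', `r']
    }
}

matrix list S
display issymmetric(S)

* Eigenvectors go into X, eigenvalues into v
matrix symeigen X v = S
matrix list v
```

To keep the upper triangle instead, the assignment inside the same loops could be written as matrix S[`c', `r'] = A[`r', `c'].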
On Sat, Mar 29, 2014 at 3:48 PM, R Zhang <[email protected]> wrote:
> Nick,
> You are correct about the Stata help concerning seq(). Thank you!
>
> My data has about 70,000 observations, i.e., 70,000 pairs of
> C_industry and S_industry. For my square matrix, 70,000*70,000 would
> exceed the maximum allowable dimensions in Stata, is that correct?
>
> I ran your program and got "option block() incorrectly specified"; my
> guess is that this is the maximum dimension problem.
>
> In this case, can I increase the dimension in Stata?
>
> Best,
>
> Rochelle
>
> On Sat, Mar 29, 2014 at 12:13 PM, Nick Cox <[email protected]> wrote:
>> I don't know why you are Googling this. That is like going to the
>> library to look for a book you already have. Stata itself gives you
>> ways of finding out what you need to know.
>>
>> -help egen- and looking at the results shows that the function -seq()-
>> creates indexes 1, 2, 3, ... for the rows and columns of the matrix.
>> It does not calculate the dimensions of the matrix, which are
>> calculated from the number of distinct values of your input string
>> variables.
>>
>> My code assumes a square matrix with the same number of rows and columns.
>> I understood from this thread and another (including a mention of
>> eigenvalue calculation) that you are dealing with square matrices.
>> Indeed, if you look at the code again, you should see that the number
>> of rows and columns is identical and the row and column names are
>> identical. So, that code cannot be used for oblong matrices (often
>> loosely called rectangular).
>>
>> For arbitrary matrices, you would need something more like this:
>>
>> * !!! code not tested
>>
>> qui tab C_industry
>> local nrows = r(r)
>> qui tab S_industry
>> local ncols = r(r)
>>
>> egen i = seq(), block(`ncols')
>> egen j = seq(), to(`ncols')
>>
>> matrix A=J(`nrows',`ncols',.)
>>
>> forval n = 1/`=_N' {
>>     matrix A[`=i[`n']', `=j[`n']'] = x[`n']
>>     if C_industry[`n'] != C_industry[`=`n'-1'] {
>>         local rownames `rownames' `=C_industry[`n']'
>>     }
>>     if `n' <= `ncols' {
>>         local colnames `colnames' `=S_industry[`n']'
>>     }
>> }
>>
>> matrix rownames A = `rownames'
>> matrix colnames A = `colnames'
>> matrix list A
>>
>> Nick
>> [email protected]
>>
>>
>> On 29 March 2014 15:46, R Zhang <[email protected]> wrote:
>>> Thanks, Nick ! You are always so generous in helping others.
>>>
>>> concerning:
>>>
>>> egen i = seq(), block(`nvals')
>>> egen j = seq(), to(`nvals')
>>>
>>> I did a Google search and read one of your earlier postings
>>> (Generating block randomation schedule using Stata).
>>>
>>> Would it be correct to say that you use egen to generate the dimensions
>>> for the rows and columns of the matrix? If my matrix is 400*450, would I
>>> need to change your program?
>>>
>>> Best,
>>> Rochelle
>>>
>>>
>>> On Sat, Mar 29, 2014 at 5:39 AM, Nick Cox <[email protected]> wrote:
>>>> This can be corrected and simplified as follows, illustrating the 7th
>>>> Law of Stata programming, that a shorter program needs more time. I
>>>> don't repeat Rochelle's code setting up a data example.
>>>>
>>>> qui tab C_industry
>>>> local nvals = r(r)
>>>>
>>>> egen i = seq(), block(`nvals')
>>>> egen j = seq(), to(`nvals')
>>>>
>>>> matrix A=J(`nvals',`nvals',.)
>>>>
>>>> forval n = 1/`=_N' {
>>>>     matrix A[`=i[`n']', `=j[`n']'] = x[`n']
>>>>     if C_industry[`n'] != C_industry[`=`n'-1'] {
>>>>         local rownames `rownames' `=C_industry[`n']'
>>>>     }
>>>> }
>>>> matrix rownames A = `rownames'
>>>> matrix colnames A = `rownames'
>>>> matrix list A
>>>>
>>>> Nick
>>>> [email protected]
>>>>
>>>>
>>>> On 29 March 2014 01:36, Nick Cox <[email protected]> wrote:
>>>>> Your main error is to overlook the fact that -encode- by default
>>>>> encodes in alphanumeric order. See for example the thread started by
>>>>> Michael McCulloch recently at
>>>>> http://www.stata.com/statalist/archive/2014-03/msg00346.html which
>>>>> underlined this point.
>>>>>
>>>>> There are various ways round this. One is just not to -encode-. If you
>>>>> map your string values to value labels, you then have to read them
>>>>> back.
>>>>>
>>>>> This code goes further than yours in supplying row and column names
>>>>> for the matrix. The assumption is that the string variables contain
>>>>> values all suitable as matrix row and column labels.
>>>>>
>>>>> clear all
>>>>> input str20 C_industry str20 S_industry int x
>>>>> Forestrysupport Forestrysupport 0
>>>>> Forestrysupport Forestrynursery 0
>>>>> Forestrysupport logging 0
>>>>> Forestrynursery Forestrysupport 64
>>>>> Forestrynursery Forestrynursery 1
>>>>> Forestrynursery logging 1
>>>>> logging Forestrysupport 7
>>>>> logging Forestrynursery 29
>>>>> logging logging 41
>>>>> end
>>>>>
>>>>> qui tab C_industry
>>>>> local nvals = r(r)
>>>>>
>>>>> egen i = seq(), block(`nvals')
>>>>> egen j = seq(), to(`nvals')
>>>>>
>>>>> matrix A=J(`nvals',`nvals',.)
>>>>> matrix list A
>>>>>
>>>>> forval n = 1/`=_N' {
>>>>>     matrix A[`=i[`n']', `=j[`n']'] = x[`n']
>>>>>     if C_industry[`n'] != C_industry[`=`n'-1'] {
>>>>>         local rownames `rownames' `=C_industry[`n']'
>>>>>     }
>>>>>     if `n' < `nvals' {
>>>>>         local colnames `colnames' `=S_industry[`n']'
>>>>>     }
>>>>> }
>>>>> matrix rownames A = `rownames'
>>>>> matrix colnames A = `colnames'
>>>>> matrix list A
>>>>>
>>>>> Nick
>>>>> [email protected]
>>>>>
>>>>>
>>>>> On 28 March 2014 23:05, R Zhang <[email protected]> wrote:
>>>>>> Nick,
>>>>>> I forgot to post the code. Sorry! My real data has over 400*400
>>>>>> dimensions in Stata data format. That is why I can't use a simple
>>>>>> matrix command to input the data as a matrix.
>>>>>>
>>>>>>
>>>>>> ***** my hypothetical data
>>>>>> clear all
>>>>>> input str20 C_industry str20 S_industry int x
>>>>>> Forestrysupport Forestrysupport 0
>>>>>> Forestrysupport Forestrynursery 0
>>>>>> Forestrysupport logging 0
>>>>>> Forestrynursery Forestrysupport 64
>>>>>> Forestrynursery Forestrynursery 1
>>>>>> Forestrynursery logging 1
>>>>>> logging Forestrysupport 7
>>>>>> logging Forestrynursery 29
>>>>>> logging logging 41
>>>>>> end
>>>>>>
>>>>>> list
>>>>>>
>>>>>> encode C_industry, gen(c)
>>>>>> encode S_industry, gen(s)
>>>>>>
>>>>>>
>>>>>>
>>>>>> drop C_ S_
>>>>>> list
>>>>>>
>>>>>> levelsof c, local(levs)
>>>>>> local rows : word count `levs'
>>>>>> matrix A=J(`rows',`rows',.)
>>>>>> matrix list A
>>>>>>
>>>>>> forval i=1/`=_N' {
>>>>>>     local r=c[`i']
>>>>>>     local c=s[`i']
>>>>>>     matrix A[`r',`c']=x[`i']
>>>>>> }
>>>>>>
>>>>>> matrix list A
>>>>>>
>>>>>> *******************************************
>>>>>>
>>>>>> My guess is that the best approach is to use a loop to input data into a matrix.
>>>>>>
>>>>>> As my original post indicates, the code did not produce the matrix I
>>>>>> wanted. Could you please critique it?
>>>>>>
>>>>>> thanks a lot,
>>>>>>
>>>>>> Rochelle
>>>>>>
>>>>>>
>>>>>> On Fri, Mar 28, 2014 at 3:49 PM, Nick Cox <[email protected]> wrote:
>>>>>>> I don't see that your code produces a matrix at all.
>>>>>>>
>>>>>>> Seems that you would be better off just typing it in directly.
>>>>>>>
>>>>>>> matrix want = (0,0,0\64,1,1\7,29,41)
>>>>>>> matrix rownames want = Forestrysupport Forestrynursery logging
>>>>>>> matrix colnames want = Forestrysupport Forestrynursery logging
>>>>>>>
>>>>>>> Nick
>>>>>>> [email protected]
>>>>>>>
>>>>>>>
>>>>>>> On 28 March 2014 19:37, R Zhang <[email protected]> wrote:
>>>>>>>> Dear all,
>>>>>>>>
>>>>>>>> I have the following sample code to input data from Stata (see datahave
>>>>>>>> below) and get output in matrix form. After that I will compute
>>>>>>>> eigenvalues for this matrix.
>>>>>>>>
>>>>>>>> The code runs, but the output matrix has some elements misplaced. I
>>>>>>>> wonder if someone could help correct it.
>>>>>>>>
>>>>>>>> thanks!
>>>>>>>> ++++++++++++
>>>>>>>> datahave
>>>>>>>> clear all
>>>>>>>> input str20 C_industry str20 S_industry int x
>>>>>>>> Forestrysupport Forestrysupport 0
>>>>>>>> Forestrysupport Forestrynursery 0
>>>>>>>> Forestrysupport logging 0
>>>>>>>> Forestrynursery Forestrysupport 64
>>>>>>>> Forestrynursery Forestrynursery 1
>>>>>>>> Forestrynursery logging 1
>>>>>>>> logging Forestrysupport 7
>>>>>>>> logging Forestrynursery 29
>>>>>>>> logging logging 41
>>>>>>>> end
>>>>>>>> ++++++++++++
>>>>>>>>
>>>>>>>> ++++++++++++
>>>>>>>> matrix want
>>>>>>>>      c1  c2  c3
>>>>>>>> r1    0   0   0
>>>>>>>> r2   64   1   1
>>>>>>>> r3    7  29  41
>>>>>>>>
>>>>>>>>
>>>>>>>> I would like to replace c1,c2,c3 with variable names Forestrysupport
>>>>>>>> Forestrynursery logging
>>>>>>>>
>>>>>>>> -Rochelle
>>>>>>>> *
>>>>>>>> * For searches and help try:
>>>>>>>> * http://www.stata.com/help.cgi?search
>>>>>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>>>>> * http://www.ats.ucla.edu/stat/stata/