Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: converting table into matrix
From: R Zhang <[email protected]>
To: [email protected]
Subject: Re: st: converting table into matrix
Date: Sat, 29 Mar 2014 15:48:52 -0400

Nick,
You are correct about the Stata help for seq(). Thank you!
My data has about 70,000 observations, i.e., 70,000 pairs of
C_industry and S_industry. For my square matrix, 70,000*70,000 would
exceed the maximum allowable dimensions in Stata, is that correct?
I ran your program and got "option block() incorrectly specified"; my
guess is that this is the maximum dimension problem.
In this case, can I increase the dimension in Stata?
Best,
Rochelle
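
A minimal sketch, not part of the original exchange, of how the implied dimensions could be checked before any matrix is built. It reuses the thread's variable names and 3 x 3 example data; the local names nrows and ncols follow Nick's untested code quoted further down. With one observation per (row, column) pair, the matrix has as many rows and columns as there are distinct values, so 70,000 observations would imply a matrix far smaller than 70,000 x 70,000. The matrix limits for a given Stata flavor are listed under -help limits-.

```stata
* Sketch only: example data copied from the thread
clear all
input str20 C_industry str20 S_industry int x
Forestrysupport Forestrysupport 0
Forestrysupport Forestrynursery 0
Forestrysupport logging 0
Forestrynursery Forestrysupport 64
Forestrynursery Forestrynursery 1
Forestrynursery logging 1
logging Forestrysupport 7
logging Forestrynursery 29
logging logging 41
end

* number of distinct values = number of rows and columns
qui tab C_industry
local nrows = r(r)
qui tab S_industry
local ncols = r(r)

display "observations: " _N
display "implied matrix: `nrows' x `ncols'"

* a complete table has exactly one observation per (row, column) pair
assert `nrows' * `ncols' == _N
```

If the final -assert- fails on real data, some pairs are missing or duplicated, and filling the matrix by position with -seq()- would misplace elements.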
On Sat, Mar 29, 2014 at 12:13 PM, Nick Cox <[email protected]> wrote:
> I don't know why you are Googling this. That is like going to the
> library to look for a book you already have. Stata itself gives you
> ways of finding out what you need to know.
>
> -help egen- and looking at the results shows that the function -seq()-
> creates indexes 1, 2, 3, ... for the rows and columns of the matrix.
> It does not calculate the dimensions of the matrix, which are
> calculated from the number of distinct values of your input string
> variables.
>
> My code assumes a square matrix with the same number of rows and columns.
> I understood from this thread and another (including a mention of
> eigenvalue calculation) that you are dealing with square matrices.
> Indeed, if you look at the code again, you should see that the number
> of rows and columns is identical and the row and column names are
> identical. So, that code cannot be used for oblong matrices (often
> loosely called rectangular).
>
> For arbitrary matrices, you would need something more like this:
>
> * !!! code not tested
>
> qui tab C_industry
> local nrows = r(r)
> qui tab S_industry
> local ncols = r(r)
>
> egen i = seq(), block(`ncols')
> egen j = seq(), to(`ncols')
>
> matrix A=J(`nrows',`ncols',.)
>
> forval n = 1/`=_N' {
>     matrix A[`=i[`n']', `=j[`n']'] = x[`n']
>     if C_industry[`n'] != C_industry[`=`n'-1'] {
>         local rownames `rownames' `=C_industry[`n']'
>     }
>     if `n' <= `ncols' {
>         local colnames `colnames' `=S_industry[`n']'
>     }
> }
>
> matrix rownames A = `rownames'
> matrix colnames A = `colnames'
> matrix list A
>
> Nick
> [email protected]
>
>
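
To make concrete what the two -egen- calls in the message above generate, here is a small sketch on a hypothetical 2 x 3 layout. The 6 observations and 3 columns are illustrative assumptions, not values from the thread.

```stata
* Sketch only: 6 observations standing for a 2 x 3 table stored row by row
clear
set obs 6
local ncols = 3

* row index: each value repeated in blocks of `ncols' -> 1 1 1 2 2 2
egen i = seq(), block(`ncols')

* column index: cycles from 1 to `ncols' -> 1 2 3 1 2 3
egen j = seq(), to(`ncols')

list i j, noobs
```

The indexes depend only on the sort order of the data, so the observations need to be stored row by row with every column present for each row.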
> On 29 March 2014 15:46, R Zhang <[email protected]> wrote:
>> Thanks, Nick ! You are always so generous in helping others.
>>
>> concerning:
>>
>> egen i = seq(), block(`nvals')
>> egen j = seq(), to(`nvals')
>>
>> I did a Google search and read one of your earlier postings
>> (Generating block randomization schedule using Stata).
>>
>> Would it be correct to say that you use egen to generate the dimensions
>> for the rows and columns of the matrix? If my matrix is 400*450, would I
>> need to change your program?
>>
>> Best,
>> Rochelle
>>
>>
>> On Sat, Mar 29, 2014 at 5:39 AM, Nick Cox <[email protected]> wrote:
>>> This can be corrected and simplified as follows, illustrating the 7th
>>> Law of Stata programming, that a shorter program needs more time. I
>>> don't repeat Rochelle's code setting up a data example.
>>>
>>> qui tab C_industry
>>> local nvals = r(r)
>>>
>>> egen i = seq(), block(`nvals')
>>> egen j = seq(), to(`nvals')
>>>
>>> matrix A=J(`nvals',`nvals',.)
>>>
>>> forval n = 1/`=_N' {
>>>     matrix A[`=i[`n']', `=j[`n']'] = x[`n']
>>>     if C_industry[`n'] != C_industry[`=`n'-1'] {
>>>         local rownames `rownames' `=C_industry[`n']'
>>>     }
>>> }
>>> matrix rownames A = `rownames'
>>> matrix colnames A = `rownames'
>>> matrix list A
>>>
>>> Nick
>>> [email protected]
>>>
>>>
>>> On 29 March 2014 01:36, Nick Cox <[email protected]> wrote:
>>>> Your main error is to overlook the fact that -encode- by default
>>>> encodes in alphanumeric order. See for example the thread started by
>>>> Michael McCulloch recently at
>>>> http://www.stata.com/statalist/archive/2014-03/msg00346.html which
>>>> underlined this point.
>>>>
>>>> There are various ways round this. One is just not to -encode-. If you
>>>> map your string values to value labels, you then have to read them
>>>> back.
>>>>
>>>> This code goes further than yours in supplying row and column names
>>>> for the matrix. The assumption is that the string variables contain
>>>> values all suitable as matrix row and column labels.
>>>>
>>>> clear all
>>>> input str20 C_industry str20 S_industry int x
>>>> Forestrysupport Forestrysupport 0
>>>> Forestrysupport Forestrynursery 0
>>>> Forestrysupport logging 0
>>>> Forestrynursery Forestrysupport 64
>>>> Forestrynursery Forestrynursery 1
>>>> Forestrynursery logging 1
>>>> logging Forestrysupport 7
>>>> logging Forestrynursery 29
>>>> logging logging 41
>>>> end
>>>>
>>>> qui tab C_industry
>>>> local nvals = r(r)
>>>>
>>>> egen i = seq(), block(`nvals')
>>>> egen j = seq(), to(`nvals')
>>>>
>>>> matrix A=J(`nvals',`nvals',.)
>>>> matrix list A
>>>>
>>>> forval n = 1/`=_N' {
>>>>     matrix A[`=i[`n']', `=j[`n']'] = x[`n']
>>>>     if C_industry[`n'] != C_industry[`=`n'-1'] {
>>>>         local rownames `rownames' `=C_industry[`n']'
>>>>     }
>>>>     if `n' < `nvals' {
>>>>         local colnames `colnames' `=S_industry[`n']'
>>>>     }
>>>> }
>>>> matrix rownames A = `rownames'
>>>> matrix colnames A = `colnames'
>>>> matrix list A
>>>>
>>>> Nick
>>>> [email protected]
>>>>
>>>>
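
The point about -encode- in the message above can be seen on the thread's three industry names. This is a sketch only: the variable names s and s2 and the label name ind are hypothetical, chosen for illustration.

```stata
* Sketch only: industry names in the order they appear in the data
clear
input str20 S_industry
Forestrysupport
Forestrynursery
logging
end

* default -encode- numbers values in alphanumeric order, so
* Forestrynursery becomes 1 and Forestrysupport becomes 2
encode S_industry, gen(s)
list S_industry s, nolabel

* one way round this: define the value label in the wanted order first
label define ind 1 "Forestrysupport" 2 "Forestrynursery" 3 "logging"
encode S_industry, gen(s2) label(ind)
list S_industry s s2, nolabel
```

With the default coding, the first observation is assigned to position 2, which is how elements end up misplaced when the codes are used as matrix subscripts.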
>>>> On 28 March 2014 23:05, R Zhang <[email protected]> wrote:
>>>>> Nick,
>>>>> I forgot to post the code. Sorry! My real data has over 400*400
>>>>> dimensions and is in Stata data format. That is why I can't use a simple
>>>>> matrix command to input the data as a matrix.
>>>>>
>>>>>
>>>>> ***** my hypothetical data
>>>>> clear all
>>>>> input str20 C_industry str20 S_industry int x
>>>>> Forestrysupport Forestrysupport 0
>>>>> Forestrysupport Forestrynursery 0
>>>>> Forestrysupport logging 0
>>>>> Forestrynursery Forestrysupport 64
>>>>> Forestrynursery Forestrynursery 1
>>>>> Forestrynursery logging 1
>>>>> logging Forestrysupport 7
>>>>> logging Forestrynursery 29
>>>>> logging logging 41
>>>>> end
>>>>>
>>>>> list
>>>>>
>>>>> encode C_industry, gen(c)
>>>>> encode S_industry, gen(s)
>>>>>
>>>>>
>>>>>
>>>>> drop C_ S_
>>>>> list
>>>>>
>>>>> levelsof c, local(levs)
>>>>> local rows : word count `levs'
>>>>> matrix A=J(`rows',`rows',.)
>>>>> matrix list A
>>>>>
>>>>> forval i=1/`=_N' {
>>>>>     local r=c[`i']
>>>>>     local c=s[`i']
>>>>>     matrix A[`r',`c']=x[`i']
>>>>> }
>>>>>
>>>>> matrix list A
>>>>>
>>>>> *******************************************
>>>>>
>>>>> My guess is that the best approach is to use a loop to input data into a matrix.
>>>>>
>>>>> My original post indicated that the code did not produce the matrix I
>>>>> wanted. Could you please critique?
>>>>>
>>>>> thanks a lot,
>>>>>
>>>>> Rochelle
>>>>>
>>>>>
>>>>> On Fri, Mar 28, 2014 at 3:49 PM, Nick Cox <[email protected]> wrote:
>>>>>> I don't see that your code produces a matrix at all.
>>>>>>
>>>>>> Seems that you would be better off just typing it in directly.
>>>>>>
>>>>>> matrix want = (0,0,0\64,1,1\7,29,41)
>>>>>> matrix rownames want = Forestrysupport Forestrynursery logging
>>>>>> matrix colnames want = Forestrysupport Forestrynursery logging
>>>>>>
>>>>>> Nick
>>>>>> [email protected]
>>>>>>
>>>>>>
>>>>>> On 28 March 2014 19:37, R Zhang <[email protected]> wrote:
>>>>>>> Dear all,
>>>>>>>
>>>>>>> I have the following sample code to input data from Stata (see
>>>>>>> datahave below) and get output in matrix form. After that I will
>>>>>>> compute the eigenvalues of this matrix.
>>>>>>>
>>>>>>> The code runs, but the output matrix has some elements misplaced. I
>>>>>>> wonder if someone could help correct it.
>>>>>>>
>>>>>>> thanks!
>>>>>>> ++++++++++++
>>>>>>> datahave
>>>>>>> clear all
>>>>>>> input str20 C_industry str20 S_industry int x
>>>>>>> Forestrysupport Forestrysupport 0
>>>>>>> Forestrysupport Forestrynursery 0
>>>>>>> Forestrysupport logging 0
>>>>>>> Forestrynursery Forestrysupport 64
>>>>>>> Forestrynursery Forestrynursery 1
>>>>>>> Forestrynursery logging 1
>>>>>>> logging Forestrysupport 7
>>>>>>> logging Forestrynursery 29
>>>>>>> logging logging 41
>>>>>>> end
>>>>>>> ++++++++++++
>>>>>>>
>>>>>>> ++++++++++++
>>>>>>> matrix want
>>>>>>> c1 c2 c3
>>>>>>> r1 0 0 0
>>>>>>> r2 64 1 1
>>>>>>> r3 7 29 41
>>>>>>>
>>>>>>>
>>>>>>> I would like to replace c1,c2,c3 with variable names Forestrysupport
>>>>>>> Forestrynursery logging
>>>>>>>
>>>>>>> -Rochelle
>>>>>>> *
>>>>>>> * For searches and help try:
>>>>>>> * http://www.stata.com/help.cgi?search
>>>>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>>>> * http://www.ats.ucla.edu/stat/stata/