Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Breaking huge lines and creating variables
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Breaking huge lines and creating variables
Date: Fri, 30 Sep 2011 23:53:58 +0100
Does look like a Mata job.
But it seems unlikely that you are the first person to work with such
data, so asking what formats people use sounds best.
Nick
On Fri, Sep 30, 2011 at 9:08 PM, Pedro Nakashima
<[email protected]> wrote:
> OK, here's a sample line (note: some files have more than 10 million
> lines):
>
> 8=FIX.4.4 9=10157 35=X 34=43344 49=FIXGatewayDerivatives 52=20110126-11:00:00.206 56=MD01A 10016=31418_DOLG11_181_70 75=20110126 268=83 279=0 269=2 278=213 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 270=1670 271=300 272=20110126 273=11:00:00 274=2 288=BM000735 289=BM000150 451=-4.162 6032=1 279=2 269=0 278=214 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 272=20110126 273=11:00:00 37=000010 288=BM000735 290=14 279=0 269=2 278=215 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 270=1670 271=200 272=20110126 273=11:00:00 274=3 288=BM000127 289=BM000150 451=-4.162 6032=2 279=2 269=0 278=216 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 272=20110126 273=11:00:00 37=000002 288=BM000127 290=9 279=2 269=1 278=217 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 272=20110126 273=11:00:00 37=000013 289=BM000150 290=12 279=0 269=2 278=218 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 270=1670 271=75 272=20110126 273=11:00:00 274=3 288=BM000227 289=BM000119 451=-4.162 6032=3 279=2 269=0 278=219 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 272=20110126 273=11:00:
>
> My database covers the FX market and was provided by the Brazilian
> futures and commodities exchange (it's an intraday database).
>
> 268=83 indicates that 83 entries follow, and the beginning of each
> entry is marked by the code 279.
>
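The layout described here (268 gives the number of entries, 279 opens each entry) can be split mechanically. What follows is only a rough sketch in Python, not Mata, to show the logic. It assumes the fields are separated by whitespace as in the pasted sample (raw FIX normally uses the SOH character as delimiter), the function name split_entries is made up for illustration, and the sample string is a shortened excerpt of the line above with 268 changed to 2 so that the count matches.

```python
def split_entries(line, start_tag="279"):
    """Split one 'tag=value' line into header fields and repeating entries.

    Everything before the first start_tag goes into the header dict;
    each occurrence of start_tag opens a new entry dict.
    Caveat: fields that follow the last entry (such as a message
    trailer) would end up attached to that last entry.
    """
    header = {}
    entries = []
    for token in line.split():
        tag, sep, value = token.partition("=")
        if not sep:
            continue  # skip fragments without '=' (e.g. a cut-off field)
        if tag == start_tag:
            entries.append({})  # a new entry begins here
        if entries:
            entries[-1][tag] = value
        else:
            header[tag] = value
    return header, entries


# Shortened, illustrative excerpt of the sample line (268 set to 2).
sample = ("8=FIX.4.4 35=X 52=20110126-11:00:00.206 268=2 "
          "279=0 269=2 278=213 55=DOLG11 270=1670 271=300 "
          "279=2 269=0 278=214 55=DOLG11")

header, entries = split_entries(sample)

# The declared count in tag 268 can be checked against what was found.
if int(header["268"]) != len(entries):
    print("entry count mismatch")

for entry in entries:
    # Entries may lack some tags (here the second has no 270/271).
    print(header["52"], entry.get("55", ""), entry.get("270", ""),
          entry.get("271", ""))
```

Keeping the values as strings avoids guessing at types; conversion to numbers or dates could happen after import.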
>
> 2011/9/30 Nick Cox <[email protected]>:
>> Please don't show an abstraction. Show us a concrete example.
>>
>> It all depends on the length of the lines, which you don't give. If
>> this can be read into one or more string variables, Mata may not be
>> necessary.
>>
>> Nick
>>
>> On Fri, Sep 30, 2011 at 3:59 PM, Pedro Nakashima
>> <[email protected]> wrote:
>>> Dear statalisters,
>>>
>>> My current database (it's in .txt) has a typical line of the form:
>>>
>>> cod1=time1 cod2=... cod3=n cod4=type1 cod5=quantity1
>>> cod6=price1 cod4=type2 cod5=quantity2 cod6=price2
>>>
>>> cod3=n (n=2 in this case, and n varies across lines) says that,
>>> following this code, there are 2 records, each of which starts with
>>> cod4=type. For example:
>>> cod4=type1 cod5=quantity1 cod6=price1 and
>>> cod4=type2 cod5=quantity2 cod6=price2 are records.
>>>
>>> I want to generate a new database in the form:
>>>
>>> cod1 cod4 cod5 cod6
>>> time1 type1 quantity1 price1
>>> time1 type2 quantity2 price2
>>>
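The reshaping asked for here (one output row per record, with cod1 repeated on each row) can be done one line at a time, so that a file with millions of lines never has to fit in memory. The following is a minimal sketch in Python, not Mata, under the assumption that fields are separated by whitespace. The names rows_from_line and convert are hypothetical, and in-memory text buffers stand in for the real input and output files.

```python
import csv
import io


def rows_from_line(line, id_key="cod1", start_key="cod4",
                   record_keys=("cod4", "cod5", "cod6")):
    """Yield one output row per record found in a 'key=value' line."""
    id_value = ""
    record = None  # no record has started yet
    for token in line.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # ignore fragments without '='
        if key == id_key and record is None:
            id_value = value  # header field carried onto every row
        if key == start_key:
            if record is not None:
                # The previous record is complete: emit it.
                yield [id_value] + [record.get(k, "") for k in record_keys]
            record = {}
        if record is not None:
            record[key] = value
    if record is not None:
        # Emit the last record of the line.
        yield [id_value] + [record.get(k, "") for k in record_keys]


def convert(src, dst):
    """Read src line by line and write space-delimited rows to dst."""
    writer = csv.writer(dst, delimiter=" ", lineterminator="\n")
    writer.writerow(["cod1", "cod4", "cod5", "cod6"])
    for line in src:  # only one input line is held in memory at a time
        writer.writerows(rows_from_line(line))


# The abstract example line from the post, in an in-memory buffer.
src = io.StringIO(
    "cod1=time1 cod2=... cod3=2 cod4=type1 cod5=quantity1 "
    "cod6=price1 cod4=type2 cod5=quantity2 cod6=price2\n"
)
dst = io.StringIO()
convert(src, dst)
print(dst.getvalue())
```

With real data, src and dst would be file objects opened on the .txt input and the output file, and the result could then be read into Stata as ordinary delimited text.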
>>> I think it's only possible to do that with Mata given the lengths of
>>> the lines. Is that right?
>>>
>>> Can anyone give me a direction?
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
>>