Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Pedro Nakashima <nakashimapedro@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Breaking huge lines and creating variables
Date: Fri, 30 Sep 2011 17:08:35 -0300
Ok, here's a sample line (remark: some files have more than 10 million lines):

8=FIX.4.49=1015735=X34=4334449=FIXGatewayDerivatives52=20110126-11:00:00.20656=MD01A10016=31418_DOLG11_181_7075=20110126268=83279=0269=2278=21355=DOLG1148=BMFBR861885122=8207=XBMF270=1670271=300272=20110126273=11:00:00274=2288=BM000735289=BM000150451=-4.1626032=1279=2269=0278=21455=DOLG1148=BMFBR861885122=8207=XBMF272=20110126273=11:00:0037=000010288=BM000735290=14279=0269=2278=21555=DOLG1148=BMFBR861885122=8207=XBMF270=1670271=200272=20110126273=11:00:00274=3288=BM000127289=BM000150451=-4.1626032=2279=2269=0278=21655=DOLG1148=BMFBR861885122=8207=XBMF272=20110126273=11:00:0037=000002288=BM000127290=9279=2269=1278=21755=DOLG1148=BMFBR861885122=8207=XBMF272=20110126273=11:00:0037=000013289=BM000150290=12279=0269=2278=21855=DOLG1148=BMFBR861885122=8207=XBMF270=1670271=75272=20110126273=11:00:00274=3288=BM000227289=BM000119451=-4.1626032=3279=2269=0278=21955=DOLG1148=BMFBR861885122=8207=XBMF272=20110126273=11:00:

My database concerns the FX market and was provided by the Brazilian futures and commodities exchange (it's an intraday database). 268=83 indicates that 83 entries follow, and the beginning of each entry is indicated by the code 279.

2011/9/30 Nick Cox <njcoxstata@gmail.com>:
> Please don't show an abstraction. Show us a concrete example.
>
> It all depends on the length of the lines, which you don't give. If
> this can be read into one or more string variables, Mata may not be
> necessary.
>
> Nick
>
> On Fri, Sep 30, 2011 at 3:59 PM, Pedro Nakashima
> <nakashimapedro@gmail.com> wrote:
>> Dear statalisters,
>>
>> My current database (it's in .txt) has a typical line in the form:
>>
>> cod1=time1 cod2=... cod3=n cod4=type1 cod5=quantity1
>> cod6=price1 cod4=type2 cod5=quantity2 cod6=price2
>>
>> cod3=n (n=2 in this case, and n varies across lines) says that
>> following this code there are 2 records, each of which starts with
>> cod4=type.
>> For example:
>> cod4=type1 cod5=quantity1 cod6=price1 and
>> cod4=type2 cod5=quantity2 cod6=price2 are records.
>>
>> I want to generate a new database in the form:
>>
>> cod1 cod4 cod5 cod6
>> time1 type1 quantity1 price1
>> time1 type2 quantity2 price2
>>
>> I think it's only possible to do that with Mata given the lengths of
>> the lines, is that right?
>>
>> Can anyone give me a direction?
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
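
For what it's worth, the reshaping described above (one output row per entry, with the message-level time repeated on each row) might be sketched as below. This is Python, not Stata or Mata, and it rests on assumptions that are not confirmed anywhere in the thread: that the raw .txt file still contains the standard FIX field delimiter (SOH, \x01), which appears to have been lost in the e-mail; that tag 52 is the time to repeat; and that tags 269, 270 and 271 are the per-entry fields to keep. The function names are hypothetical.

```python
SOH = "\x01"  # standard FIX field delimiter; ASSUMED to be present in the raw file


def parse_fields(line, sep=SOH):
    """Split one message into an ordered list of (tag, value) pairs."""
    pairs = []
    for field in line.rstrip("\r\n").split(sep):
        if "=" in field:
            tag, value = field.split("=", 1)
            pairs.append((tag, value))
    return pairs


def split_entries(pairs, header_tags=("52",), start_tag="279",
                  entry_tags=("269", "270", "271")):
    """Return one dict per entry; every entry starts at start_tag.

    header_tags and entry_tags are illustrative choices, not taken
    from the original post.
    """
    # Message-level fields are those seen before the first entry begins.
    header = {}
    for tag, value in pairs:
        if tag == start_tag:
            break
        if tag in header_tags:
            header[tag] = value

    rows, current = [], None
    for tag, value in pairs:
        if tag == start_tag:
            current = dict(header)   # repeat the header values on each row
            current[tag] = value
            rows.append(current)
        elif current is not None and tag in entry_tags:
            current[tag] = value
    return rows


def rows_from_file(path):
    """Stream a large file one line at a time instead of loading it whole."""
    with open(path, "r", newline="") as handle:
        for line in handle:
            yield from split_entries(parse_fields(line))


if __name__ == "__main__":
    # Shortened illustration modelled on the sample line; the field
    # boundaries are an interpretation, since the e-mail shows no delimiters.
    demo = SOH.join([
        "8=FIX.4.4", "52=20110126-11:00:00.206", "268=2",
        "279=0", "269=2", "270=1670", "271=300",
        "279=2", "269=0",
    ])
    for row in split_entries(parse_fields(demo)):
        print(row)
```

Each resulting dict corresponds to one row of the desired table, so the rows could be written out as a delimited text file and then imported into Stata. Entries that lack a tag simply omit that key, which would become a missing value on import.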