Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: looping over files -- speed and Stata/MP
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: looping over files -- speed and Stata/MP
Date: Wed, 16 Mar 2011 15:13:41 +0000

-fs- from SSC automates the production of a list of files. It is just
a wrapper for a standard Stata extended macro function, but it would
obviate the need for a structure based on holding a file open while
doing lots of other things. That said, it is difficult to know how
much difference that would make to timings, beyond reducing your use
of the OS.
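
A minimal sketch of that idea, calling the extended macro function
-dir- directly rather than through -fs-. The "*.csv" pattern, the
local names -files-, -f- and -stub-, and the use of -insheet- are
assumptions for illustration; the folder names are taken from the
question below. The list is held in a local macro, so with very many
files it may run into Stata's limit on macro length, in which case
listing in batches with narrower patterns is one workaround.

```stata
* Assumption: the files to import are CSV files sitting in mydir.
* The extended macro function -dir- puts the file names in a local.
local files : dir "mydir" files "*.csv"

foreach f of local files {
    * Read one file (assumes comma-separated text)
    insheet using "mydir/`f'", comma clear

    * (a bunch of Stata commands)

    * Drop the .csv extension so that -save- writes a .dta file
    local stub = subinstr("`f'", ".csv", "", .)
    save "mydir2/`stub'", replace
}
```

Forward slashes are used in the paths because a backslash placed
directly before a local macro reference can stop the macro being
expanded; Stata accepts forward slashes on Windows too.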
Nick
On Wed, Mar 16, 2011 at 2:48 PM, Dimitri Szerman <[email protected]> wrote:
> In constructing a data set, I have to loop over hundreds of thousands
> of files. Simply put, this is what I do:
>
> ! dir "mydir" /a-d /b > filelist.txt // list of files to be imported
> file open LIST using "filelist.txt", read
> file read LIST line
> while r(eof)==0 {
>
> (a bunch of Stata commands)
>
> save mydir2\\`line', replace
> file read LIST line
> }
> file close LIST
>
>
> In fact, I run a loop like this twice: first to import csv into dta,
> then to work on (clean) the dta files. As it stands now, my code
> takes around 12 hours to run. My question is: will Stata/MP make it
> run faster? (For those familiar with Matlab, I guess this boils down
> to: does Stata/MP have something along the lines of "parfor", i.e., a
> "parallel-for" command?) More broadly, can anyone think of a way of
> speeding this up?
>