Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: looping over files -- speed and Stata/MP


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: looping over files -- speed and Stata/MP
Date   Wed, 16 Mar 2011 15:13:41 +0000

-fs- from SSC automates the production of a list of files. It is just
a wrapper for a standard Stata extended macro function, but it would
remove the need for a structure that holds a file open while
doing lots of other things. That said, it is difficult to know
how much difference this would make to timings, beyond reducing your
use of the OS.
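
As a rough sketch of that idea, the file list can be built inside Stata
and looped over directly, with no shell call and no open file handle.
This assumes the extended macro function in question is -dir-; the
"*.csv" pattern, the use of -insheet- for the import step, and the
forward-slash paths are illustrative assumptions, while "mydir" and
"mydir2" are taken from the quoted code.

```stata
* Sketch only: list the files with the -dir- extended macro function.
* The "*.csv" pattern is an assumption about the file names.
local files : dir "mydir" files "*.csv"

foreach f of local files {
    * Drop the extension so the saved file gets a .dta name
    * (assumes ".csv" appears only as the extension).
    local stub = subinstr("`f'", ".csv", "", .)

    * Assumed import step; -import delimited- replaces -insheet-
    * in later Stata versions.
    insheet using "mydir/`f'", clear

    * (a bunch of Stata commands)

    save "mydir2/`stub'", replace
}
```

With hundreds of thousands of files the list may run into Stata's limit
on macro length, depending on the flavor of Stata, so it may be
necessary to split the work by narrower patterns (say, one pattern per
leading character of the file name).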

Nick

On Wed, Mar 16, 2011 at 2:48 PM, Dimitri Szerman <[email protected]> wrote:

> In constructing a data set, I have to loop over hundreds of thousands
> of files. Simply put, this is what I do:
>
> ! dir "mydir" /a-d /b > filelist.txt         // list of files to be imported
> file open LIST using "filelist.txt", read
> file read LIST line
> while r(eof)==0 {
>
>     (a bunch of Stata commands)
>
>     save mydir2\\`line', replace
>     file read LIST line
> }
> file close LIST
>
>
> In fact, I run a loop like this twice: first to import csv into dta,
> then to clean the dta files. As it stands now, my code
> takes around 12 hours to run. My question is: will Stata/MP make it
> run faster? (For those familiar with Matlab, I guess this boils down
> to: does Stata/MP have something along the lines of "parfor", i.e., a
> "parallel-for" command?) More broadly, can anyone think of a way of
> speeding this up?
>
