Re: st: looping over files -- speed and Stata/MP

From	Nick Cox <[email protected]>
To	[email protected]
Subject	Re: st: looping over files -- speed and Stata/MP
Date	Wed, 16 Mar 2011 15:13:41 +0000

-fs- from SSC automates the production of a list of files. It is just
a wrapper for a standard Stata extended macro function, but it would
remove the need to hold a file open while doing lots of other things.
That said, it is difficult to know how much difference that would
make to timings, beyond reducing your use of the OS.
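
A minimal sketch of that idea, using the extended macro function
directly (-dir-, which is what -fs- wraps) and -timer- to measure how
long the loop takes. The folder names mydir and mydir2 come from the
original post; the demo files a.csv and b.csv, the variable x and the
use of -insheet- are assumptions for illustration only. With hundreds
of thousands of files the list may run into the limit on macro length,
which depends on the flavor of Stata, so the list might have to be
built in batches (say, by pattern).

```stata
* Demo setup so that the sketch runs as is (hypothetical files)
capture mkdir mydir
capture mkdir mydir2
clear
set obs 3
generate x = _n
outsheet using "mydir/a.csv", comma replace
outsheet using "mydir/b.csv", comma replace

* Build the list of files inside Stata: no shell call, no open file handle
local files : dir "mydir" files "*.csv"

timer clear 1
timer on 1
foreach f of local files {
    * -insheet- is assumed here; newer Statas also offer -import delimited-
    insheet using "mydir/`f'", comma clear

    * (a bunch of Stata commands)

    * strip the extension so that -save- writes name.dta
    local stub : subinstr local f ".csv" ""
    save "mydir2/`stub'", replace
}
timer off 1
timer list 1
```

Timing the old and the new version of the loop on a few hundred files
would show whether the file handling matters at all, or whether the
time goes into the commands inside the loop.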

Nick

On Wed, Mar 16, 2011 at 2:48 PM, Dimitri Szerman <[email protected]> wrote:

> In constructing a data set, I have to loop over hundreds of thousands
> of files. Simply put, this is what I do:
>
> ! dir "mydir" /a-d /b > filelist.txt         // list of files to be imported
> file open LIST using "filelist.txt", read
> file read LIST line
> while r(eof)==0 {
>
>     (a bunch of Stata commands)
>
>     save mydir2\\`line', replace
>     file read LIST line
> }
> file close LIST
>
>
> In fact, I run a loop like this twice: first to import csv into dta,
> then to clean the dta files. As it stands now, my code
> takes around 12 hours to run. My question is: will Stata/MP make it
> run faster? (For those familiar with Matlab, I guess this boils down
> to: does Stata/MP have something along the lines of "parfor", i.e., a
> "parallel-for" command?) More broadly, can anyone think of a way of
> speeding this up?
>

Follow-Ups:
- Re: st: looping over files -- speed and Stata/MP
  - From: Austin Nichols <[email protected]>

References:
- st: looping over files -- speed and Stata/MP
  - From: Dimitri Szerman <[email protected]>
