st: Problem with Disk Wait While Loading Subset of Observations

From	David Phillips <[email protected]>
To	"'[email protected]'" <[email protected]>
Subject	st: Problem with Disk Wait While Loading Subset of Observations
Date	Tue, 2 Jul 2013 05:50:27 +0000

Dear Statalist,

I'm using Stata-MP 12 on a Unix-based cluster managed by Oracle Grid Engine to load separate subsets of a single 8 GB file repeatedly (in parallel), using Stata syntax such as the following:

use in 1/10000 using `file'
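
To make the setup concrete, here is a minimal sketch of what each parallel job might run. It assumes the jobs are submitted as a Grid Engine array job, so that the scheduler sets the SGE_TASK_ID environment variable for each task, and that each task reads its own block of 10,000 observations. The file path and the block size are hypothetical placeholders, not the actual values used.

```stata
* Sketch only: the path and block size below are hypothetical placeholders
local file "/path/to/bigfile.dta"

* Assumes a Grid Engine array job, which sets SGE_TASK_ID to 1, 2, 3, ...
local task : environment SGE_TASK_ID

* Each task reads its own contiguous block of observations
local blocksize = 10000
local first = (`task' - 1) * `blocksize' + 1
local last  = `task' * `blocksize'

* Load only observations `first' through `last' from the file
use in `first'/`last' using "`file'", clear
```

With this layout, task 1 reads observations 1 through 10000, task 2 reads 10001 through 20000, and so on, all from the same file at the same time.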

I have found an unexpected phenomenon, however. The jobs stall in a 'disk wait' status and take hours to load the data. Interestingly, if I remove the "in ... using" portion of the command (so that it's simply "use `file'"), the jobs take a perfectly reasonable 20 minutes or so to load the file. How could loading the full file be less taxing on these machines than loading a subset?

Thanks,
David
 
