Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Problem with Disk Wait While Loading Subset of Observations
From: David Phillips <[email protected]>
To: "'[email protected]'" <[email protected]>
Date: Tue, 2 Jul 2013 05:50:27 +0000
Dear Statalist,
I'm using Stata-MP 12 on a Unix-based cluster managed by Oracle Grid Engine to load separate subsets of a single 8 GB file repeatedly and in parallel, using Stata syntax such as the following:
use in 1/10000 using `file'
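For illustration, here is a minimal sketch of how each parallel job might pick its own slice of the file. It assumes the jobs are submitted as a Grid Engine array job, so that the SGE_TASK_ID environment variable is set. The file path and the chunk size are hypothetical placeholders, not values from my actual setup.

```stata
* Sketch only: per-job do-file for a Grid Engine array job (an assumption).
* The path and chunk size are hypothetical placeholders.
local file "/path/to/bigfile.dta"
local chunk = 10000

* Read the array task number from the environment (set by Grid Engine).
local task : environment SGE_TASK_ID

* Ask the file header for the total number of observations without loading data.
describe using "`file'", short
local total = r(N)

* Compute this job's observation range, capped at the end of the file.
local first = (`task' - 1) * `chunk' + 1
local last  = min(`task' * `chunk', `total')

* Load only this job's observations.
use in `first'/`last' using "`file'", clear
```

With this layout, task 1 would load observations 1 to 10000, task 2 would load 10001 to 20000, and so on.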
What I see is unexpected, though: the jobs stall in a 'disk wait' state and take hours to load the data. Interestingly, if I remove the "in ... using" portion of the command (so that it is simply "use `file'"), the jobs take a perfectly reasonable 20 minutes or so to load the file. How could loading the full file be less taxing on these machines than loading a subset?
Thanks,
David