Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | wgould@stata.com (William Gould, StataCorp LP) |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: RE: Will Stata/MP speed up running multiple dofiles in batch mode? |
Date | Thu, 20 Jan 2011 12:03:30 -0600 |
Alex Eapen <alex.eapen@sydney.edu.au> wonders about the efficiency of
running multiple copies of Stata simultaneously. He writes,

> I currently have Stata/SE 10.1 for Mac. I would like to run several
> do-files at once (rather than sequentially). I see that this can be
> done by running Stata in batch mode from, say, the Terminal
> application in Mac OS (see
> http://www.stata.com/support/faqs/unix/batch.html). That is, at the
> prompt in the Terminal application in a Mac OS, I type:
>
>     $ statase -b do bigjob1.do &
>
> followed by....
>
>     $ statase -b do bigjob2.do &
>
> and then...
>
>     $ statase -b do bigjob3.do &
>
> And all three dofiles run simultaneously.

Alex is issuing commands to the underlying Unix operating system of his
Mac. The dollar signs I've inserted in front of what Alex types are the
Unix prompts. The ampersand at the end of each command tells Unix to run
the command in the background rather than waiting until the command
completes before issuing another prompt.

There are other ways Alex could have run three Stata jobs simultaneously
on his Mac, and other operating systems have their own ways of running
simultaneous Statas. The how doesn't matter; Alex's questions and my
answers generalize across methods and operating systems.

Alex asks,

> 1. My first question is whether there is a limit on how many do files
>    can be run simultaneously in this manner?

Only that imposed by your operating system. For all practical purposes,
the answer is no, there is no limit.

Next, Alex notices that running jobs simultaneously results in each job
taking longer to complete,

> 2. I compared the time it takes to execute a single dofile alone to
>    when it is run simultaneously with the other two. (I ensured all
>    three dofiles contain exactly the same code, so any difference in
>    time-to-run is not because of different commands in them). When
>    the three dofiles are running simultaneously, possibly because
>    stata resources on my computer are being stretched across three
>    dofiles, each one takes longer to complete.

First, let's think about this without getting into details of number of
cores, etc.

Let's say Alex runs the three jobs sequentially. Let s1 be the time for
running the first job, s2 the time for the second, and s3 the time for
the third. The total time Alex must wait for all three jobs to complete
is then

    S = s1 + s2 + s3

Now say Alex runs the three jobs simultaneously. Let p1 be the time for
running the first job, p2 the time for the second, and p3 the time for
the third. The total time Alex must wait for all three jobs to complete
is then

    P = max(p1, p2, p3)

Note that P can be < S even if p1>s1, p2>s2, and p3>s3. That is the
basis for the often-stated claim that running simultaneous processes can
result in more efficient use of the CPU.

In fact, in the old time-sharing literature, the goal was to run enough
simultaneous processes so that P==S. The computer resources were being
used even more "efficiently" if P>S, but if P>>S, that was considered
inefficient. When P>>S, the computer was said to be "thrashing", which
is merely saying that it was spending too much time switching between
jobs rather than running the jobs themselves.

Thus, the rule on running simultaneous jobs is: Do not run too many of
them simultaneously. If you do, overall performance will suffer and P
will be greater than S. Do things right, and you can obtain P<S.

Next, Alex asks

> Will Stata/MP help in this situation? Will Stata/MP distribute the
> execution of dofiles (or parts of them) across multiple cores and thus
> reduce the time it takes for each to complete?

First, Stata/MP has nothing explicitly to do with running multiple jobs;
Stata/MP is about running single jobs more quickly on multiple cores.
Even so, Alex asks a good question. The answers are

    1. Yes, Stata/MP can reduce execution time if there are more cores
       than there are jobs.

    2. No if there are fewer cores than there are jobs.

    3. No if the number of jobs equals the number of cores.

I will explain.

Operating systems are smart when running multiple jobs in a multiple-core
environment: They assign one job to each core, and only after all the
cores are assigned do jobs compete for resources.

Thus, answer (1) is yes because there were unused cores lying around.

As an aside, to obtain maximum performance, we would like the number of
cores to be a multiple of the number of jobs, and to limit each Stata/MP
to using just that multiple. With 3 jobs and 6 cores, we could arrange
things so that job 1 ran on two cores, job 2 ran on another two, and job
3 ran on yet another two. If we just let each Stata/MP spread itself out
over all six, there will be competition for resources and the operating
system will have to manage that.

Answer (2) is obvious, or at least will be after I explain answer (3).

Answer (3) requires some explanation. Stata/MP may be the most efficient
parallelized statistical package available, but even it is not 100%
efficient: Running on two cores will not quite halve run times. That's
because any statistical or data-management problem has some parts that
must be performed sequentially and other parts that can be performed in
parallel. During the parts that have to be performed sequentially, the
extra cores sit idle.

Operating systems doling out multiple cores tend to be nearly 100%
efficient when running single-core processes. One process does not
depend on the other, so if each is given its own core, it can just blast
away. The cores are idle only when they are waiting for a shared
resource such as an I/O line.

Thus, Stata/MP cannot be 100% efficient, operating systems nearly are
100% efficient, and so, when the number of cores equals the number of
jobs, letting the operating system handle the simultaneity is better
than asking Stata/MP to do it.
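To put rough numbers on answer (3), here is a minimal Python sketch. It is an illustration under stated assumptions, not a measurement of Stata: the job times and the 90% parallel fraction are hypothetical, the sequential-plus-parallel split is modeled with Amdahl's law (a standard formula that the discussion above does not name), and contention for shared resources such as I/O is ignored.

```python
def amdahl_time(single_core_time, cores, parallel_fraction):
    """Run time on `cores` cores when only `parallel_fraction` of the
    work can be spread across cores (Amdahl's law, an assumed model)."""
    serial_part = single_core_time * (1.0 - parallel_fraction)
    parallel_part = single_core_time * parallel_fraction / cores
    return serial_part + parallel_part


def one_core_per_job(job_times):
    """Independent single-core jobs, each on its own core: P = max(p_i).
    Assumes at least as many cores as jobs and no shared-resource waits."""
    return max(job_times)


def jobs_in_sequence_on_all_cores(job_times, cores, parallel_fraction):
    """Run the jobs one after another, each spread over all cores."""
    return sum(amdahl_time(t, cores, parallel_fraction) for t in job_times)


# Hypothetical inputs: three identical jobs, three cores.
job_times = [60.0, 60.0, 60.0]   # single-core minutes per job (made up)
cores = 3
parallel_fraction = 0.9          # share of each job that parallelizes (made up)

sequential_single_core = sum(job_times)            # S = s1 + s2 + s3
os_parallel = one_core_per_job(job_times)          # P = max(p1, p2, p3)
mp_sequence = jobs_in_sequence_on_all_cores(job_times, cores, parallel_fraction)

print(f"one core, jobs in sequence:       {sequential_single_core:.1f}")
print(f"one core per job, simultaneously: {os_parallel:.1f}")
print(f"all cores per job, in sequence:   {mp_sequence:.1f}")
```

Under these assumptions the one-core-per-job arrangement finishes first. The all-cores-per-job arrangement only ties it when parallel_fraction is exactly 1.0, which is the 100% efficiency that no parallelized package can reach.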
This result should hardly surprise you. Alex started by asking us about
three jobs. What's a job? I can think of the three jobs as one job,

    job_c = job1.do + job2.do + job3.do

that is, edit job_c.do, copy in job1.do, then add to the file job2.do,
and finally add job3.do to make one combined do-file.

Can Stata/MP run job_c.do faster than Stata/SE? Certainly.

Now consider partitioning job_c.do into {job1.do, job2.do, job3.do},
each independent of the others. Being independent, they can be run in
parallel with theoretical efficiency of 100%. Stata/MP cannot achieve
100% efficiency. Thus, using three separate Stata/MP sessions must
result in the run time being reduced relative to running job_c.do in a
single session.

Could you use Stata/SE instead? Yes, if you had three or fewer cores,
because in that case you can only use one core per job. If you had more
cores, using Stata/MP to run the separate jobs would be quicker.

By the way, see http://www.stata.com/statamp/ for more information about
Stata/MP and efficiency.

-- Bill
wgould@stata.com
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/