Title | Kind of machine to run Stata |
Author | Mark Esman |
Users often ask, “What kind of machine should I purchase to make Stata run most effectively?” That is a very general question, and the answer depends on several factors: the processor, the amount of memory, the operating system, and the speed of the storage devices.
If it is within your budget to purchase a dual-core, multicore, or multiprocessor machine, Stata/MP can take advantage of these systems by splitting “threads” of computation across multiple processors or cores. This can dramatically increase the speed of many Stata commands. All modern multicore processors on the market today are 64-bit, which lets Stata use more than 2 gigabytes of physical memory, so very large datasets can be loaded into memory. Click here for more information on compatible hardware architectures.
Stata prefers to load the entire dataset that it is using into physical RAM. Memory allocation is handled by the operating system, and some operating systems are better at memory management than others. If the operating system cannot allocate enough physical RAM to hold the dataset, it may swap some of this memory space to the hard disk. This will slow Stata's operations down tremendously, so it is important to have enough memory installed on the machine to hold the entire dataset plus any operating system overhead in physical RAM. The type of processor and OS can also affect memory allocation, depending on whether the operating system is 32-bit or 64-bit, but that is a subject unto itself. We recommend that, if you are currently using or plan on using datasets in the neighborhood of 1 gigabyte or larger, you consider using a 64-bit processor and operating system. There are Windows versions, Mac versions, and several Unix distributions that support 64-bit processors, and there are 64-bit versions of Stata for Windows, Mac, and Unix that can overcome the theoretical 2-GB memory limitation of 32-bit computers.
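As a rough way to judge how much RAM a dataset needs, you can multiply the number of observations by the width of one observation. The following Python sketch does that arithmetic using the byte widths of Stata's storage types (byte 1, int 2, long 4, float 4, double 8, and strN N bytes). The function names and the `headroom` parameter are hypothetical, introduced only for this illustration; the estimate ignores strL variables and any overhead Stata or the operating system adds, so treat the result as a lower bound rather than an exact figure.

```python
# Bytes per value for Stata's numeric storage types.
NUMERIC_WIDTHS = {"byte": 1, "int": 2, "long": 4, "float": 4, "double": 8}


def type_width(storage_type):
    """Return the bytes per value for a storage type such as 'float' or 'str20'."""
    if storage_type in NUMERIC_WIDTHS:
        return NUMERIC_WIDTHS[storage_type]
    # Fixed-length strings (str1, str2, ...) take one byte per character.
    if storage_type.startswith("str") and storage_type[3:].isdigit():
        return int(storage_type[3:])
    raise ValueError(f"unsupported storage type: {storage_type!r}")


def estimate_dataset_bytes(n_obs, storage_types):
    """Estimate the raw data size: observations times bytes per observation."""
    return n_obs * sum(type_width(t) for t in storage_types)


def fits_in_ram(dataset_bytes, ram_bytes, headroom=0.5):
    """Check whether the data fits after reserving a fraction of RAM.

    headroom is a hypothetical safety margin (an assumption, not a Stata
    setting) covering the operating system, Stata itself, and temporary
    copies of the data.
    """
    return dataset_bytes <= ram_bytes * (1 - headroom)


if __name__ == "__main__":
    # Example: 50 million observations of 10 doubles, 5 ints, and one str20.
    variables = ["double"] * 10 + ["int"] * 5 + ["str20"]
    size = estimate_dataset_bytes(50_000_000, variables)
    print(f"Estimated size: {size / 2**30:.2f} GiB")
    for ram_gib in (8, 16):
        verdict = "fits" if fits_in_ram(size, ram_gib * 2**30) else "does not fit"
        print(f"{ram_gib} GiB of RAM: {verdict}")
```

The 50% headroom is deliberately conservative, because some commands hold a temporary copy of the data while they run. Adjust it to match how you actually use the machine.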
Now on to CPU clock speeds: Stata's performance scales nearly linearly with relative CPU clock speed. There will be some differences depending on the type of analysis being performed, the size of the processor’s on-chip cache, etc., but these are relatively minor overall. Many of these differences also depend on how well the operating system handles system calls and polling for multitasking events.
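To make the near-linear relationship concrete, here is a minimal Python sketch that projects a run time from one clock speed to another. The function name `scaled_runtime` is hypothetical, and the calculation assumes otherwise comparable processors; it deliberately ignores the cache-size and operating-system effects mentioned above, so it is a first approximation only.

```python
def scaled_runtime(baseline_seconds, baseline_ghz, target_ghz):
    """Project a run time to a different clock speed, assuming linear scaling.

    This is a back-of-the-envelope model: run time is taken to be inversely
    proportional to clock speed, with everything else held equal.
    """
    if baseline_seconds < 0 or baseline_ghz <= 0 or target_ghz <= 0:
        raise ValueError("run time must be non-negative and clock speeds positive")
    return baseline_seconds * baseline_ghz / target_ghz


if __name__ == "__main__":
    # A job that takes 120 seconds at 2.0 GHz, projected to 3.0 GHz.
    print(f"{scaled_runtime(120, 2.0, 3.0):.0f} seconds")
```

Under this assumption, a 50% faster clock cuts the run time by about a third. Real results will vary with the type of analysis.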
Stata implements many of its commands in small ancillary program files, known as ado-files, which are read from the hard disk and require file I/O resources. Many commands also write temporary copies of the dataset to the hard disk, so slow disks or low-performance file I/O can hurt Stata's performance. The speed of reading and writing files on the storage devices can make a noticeable difference, especially if these files are large or have to be accessed often.
In conclusion, there is no "ideal" system on which to run Stata. The best choice depends on your budget, the type of analysis being performed, the OS being used, and so on. General guidelines include the following:
To summarize, use the operating system that you feel most comfortable with. If you are not getting the most out of the OS, you probably will not get the most out of Stata, either.