Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Christopher Baum <kit.baum@bc.edu> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | re:st: Speed with large panel datasets |
Date | Mon, 21 Mar 2011 16:48:43 -0400 |
Gordon said

    forval i=1/`npanel' {
        arima depvar indvar1 indvar2 ... if np==`i' & <some other condition>
    }

This will run very slowly on a large panel dataset, regardless of the command executed, because it uses an if condition to pick out the observations belonging to each panel. It would run faster if you used an in condition that referenced the observations in each panel. With the data sorted by panel, a balanced panel makes those ranges easy to compute mechanically; if the panel is unbalanced, it takes only a couple of commands to identify the start and end of each panel.

This may still not beat the reshape approach you're using, but it has been documented previously that if conditions on a large panel run much more slowly: Stata has to check whether each of, e.g., a million observations belongs to the current panel, whereas you know exactly which do and which don't, and can specify that as an in condition.

Kit

Kit Baum | Boston College Economics and DIW Berlin | http://ideas.repec.org/e/pba1.html
An Introduction to Stata Programming | http://www.stata-press.com/books/isp.html
An Introduction to Modern Econometrics Using Stata | http://www.stata-press.com/books/imeus.html
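
For illustration, here is a minimal sketch of the in-range approach for an unbalanced panel. It is untested against the original data and rests on assumptions: np, depvar, indvar1 and indvar2 are the placeholder names from the quoted loop, while timevar, panel and n_i are hypothetical names introduced here. Any additional condition from the original loop would still need its own if qualifier alongside the in range.

```stata
* Sketch only: timevar, panel and n_i are hypothetical names, not from the original post
egen long panel = group(np)          // consecutive panel ids 1, 2, ..., npanel
tsset panel timevar                  // declares the panel and sorts by panel, then time
by panel: generate long n_i = _N     // number of observations in each panel

local npanel = panel[_N]             // data are sorted, so the last obs has the largest id
local start = 1
forvalues i = 1/`npanel' {
    * n_i[`start'] is the length of the panel beginning at observation `start'
    local end = `start' + n_i[`start'] - 1
    arima depvar indvar1 indvar2 in `start'/`end'
    local start = `end' + 1
}
```

For a balanced panel with T observations per unit, the bookkeeping variable is unnecessary: panel i occupies observations (i-1)*T + 1 through i*T once the data are sorted by panel and time.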