Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Slowing process when running a program with multiple nested loops
From: David Kantor <[email protected]>
To: [email protected]
Subject: Re: st: Slowing process when running a program with multiple nested loops
Date: Mon, 14 Jan 2013 15:52:11 -0500
Hi,
Your code looks fairly straightforward after looking at it for a minute.
At first it looked cryptic, but once I understood it, I realized I
would code it very similarly.
There are a few unused macros, but that's irrelevant.
We don't see the code for sma, dma, trb, or cbo. Do these get
progressively complicated?
Is it possible that there is a sudden jump in slowness when you switch
from sma to dma, or to trb or cbo?
Or is the slowdown gradual through all the iterations?
(But you did say that they do about the same amount of calculation.)
More importantly, do they alter the data? Do they alter (-save-) the data file?
These latter points may be most relevant.
The important question is: after one iteration, can the next one run
without reloading (-use-ing) the data?
If not, can you rework your code (in sma, dma, trb, and cbo) to make
it so? That is, have them not drop or add records. If they generate
or replace values of variables, have those be in designated variables
that can be reset easily. The idea is that if the dataset changes
in a significant way, you want to be able to bring it back to
its pre-iteration state easily -- using -drop- or -replace ... = .-.
The last thing you should have to do is to reload the data for each
iteration. Reloading the data may be 1000 times slower than
continuing with the same data. (I don't have any real statistics on
that factor, but 1000 is not unreasonable.)
If you can arrange it so that you don't need to reload on each
iteration (or if it is already coded that way), then can you
move the -use- command to the top -- before the first -foreach-?
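A minimal sketch of that restructuring, assuming the four programs only fill in result variables (the names `profit` and `signal` below are hypothetical, not taken from the posted code) and do not add or drop observations:

```stata
** Load the data once, before any looping
use data, clear

foreach med in sma dma trb cbo {
    foreach m of local `med'm {
        * ... inner loops over n, k, b, d, and c as before ...

        * Reset only the variables the previous iteration created,
        * rather than reloading the whole file with -use-
        capture drop profit signal

        `med', datevar(date) m(`m') n(`n') k(`k') b(`b') d(`d') c(`c')
    }
}
```

If the programs do restructure the data, Stata's -preserve- and -restore- at the top and bottom of each iteration are another way to get back to the pre-iteration state, though that copies the data and is slower than dropping a few variables.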
Note that the repeated reloading will cause slowness, but may not
exactly explain why it gets progressively slower. But that may be an
operating-system issue. (It may be that after the first -use-, the
file is in cache, enabling some fast loads; later it is knocked out of cache.)
One other point: it is not always good to -set mem- to a high
value. It should be high enough to get the job done, plus maybe a
little margin of safety. Otherwise, you are grabbing space that might
be better left for the operating system to put to good use (such as for
caching files), letting everything (including your task) run more
smoothly and quickly.
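Concretely, if the dataset needs, say, about 50 MB (a figure assumed here for illustration, not taken from the thread), something like this leaves the rest for the OS:

```stata
* Enough for the data plus a small safety margin -- not the
* largest value the machine could possibly accommodate
set mem 60m
```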
HTH
--David
At 02:52 PM 1/14/2013, you wrote:
Thank you for your response.
I did mean to say that there are 7 nested loops, because there are 7
parameters that can change values, and I do not know of another way to
have this done.
So the code is as follows:
** Initialization
clear all
set mem 100m
set more off, perm
set autotabgraphs on, perm
graph drop _all
cd "C:\Users\Trades\Stata"
sysdir set PERSONAL "C:\Users\Trades\Stata\Ado"
** Setting parameters
global freq "1m"
global fcrc "EUR USD GBP AUD NZD NZD EUR EUR AUD" // foreign currencies
global bcrc "USD JPY USD USD USD JPY JPY GBP JPY" // base currencies
global startdate = mdy(1,1,1994)
global enddate = mdy(12,31,2010)
global subperiod "2002jan01 2008sep01" // specify subperiods
local smam "2 5 10 15 20 25 50 100 150 200 250" // parameter m for sma method
local smab "0 0.0005 0.001 0.005 0.01 0.05" // parameter b for sma method
local smad "2 3 4 5" // parameter d for sma method
local smac "5 10 25" // parameter c for sma method
local sman "0"
local smak "0"
local dmam "2 5 10 15 20 25 50 100 150 200 250" // parameter m for dma method
local dman "2 5 10 15 20 25 50 100 150 200" // parameter n for dma method
local dmab "0 0.0005 0.001 0.005 0.01 0.05" // parameter b for dma method
local dmad "2 3 4 5" // parameter d for dma method
local dmac "5 10 25" // parameter c for dma method
local dmak "1000"
local trbn "5 10 15 20 25 50 100" // parameter n for trb method
local trbb "0.0005 0.001 0.005 0.01 0.025 0.05" // parameter b for trb method
local trbd "2 3 4 5" // parameter d for trb method
local trbc "1 5 10 25" // parameter c for trb method
local trbm "1000"
local trbk "0"
local cbon "5 10 15 20 25 50 100 200" // parameter n for cbo method
local cbok "0.001 0.005 0.01 0.05 0.1" // parameter k for cbo method
local cbob "0.0005 0.001 0.005 0.01 0.05" // parameter b for cbo method
local cbod "0 1 2" // parameter d for cbo method
local cboc "1 5 10 25" // parameter c for cbo method
local cbom "1000"
** Loops to go through all methods and all parameters
foreach med in sma dma trb cbo {                    // loop through all the rules
    foreach m of local `med'm {                     // all the m values
        foreach n of local `med'n {                 // all the n values
            foreach k of local `med'k {             // all the k values
                foreach b of local `med'b {         // all the b values
                    foreach d of local `med'd {     // all the d values
                        foreach c of local `med'c { // all the c values
                            clear
                            use data
                            `med', datevar(date) m(`m') n(`n') k(`k') b(`b') d(`d') c(`c')
                        }
                    }
                }
            }
        }
    }
}
`med' is calling one of the four ado files that I wrote: sma, dma,
trb, and cbo. It basically calculates profits based on the rule and
the parameters fed to the program, so I think each iteration does just
about the same amount of calculation.
The next part (which I haven't written) keeps track of results from
some methods and parameters that satisfy certain conditions. In my
opinion, this would be a minor thing if I can get the current code to
run in a reasonable amount of time.
Any help explaining why the program slows down so significantly after
a couple of hundred iterations will be much appreciated.
Thank you.
I think that most of us would agree that we would need to see your code
to be able to say what the problem is. Meanwhile, did you mean that
the loops are nested to a depth of 7? That's unusually deep.
Just generally speaking, with loops, there are often actions that are
placed inside that don't need to be there; they can be moved "up" or
"out" (sometimes requiring a bit of modification) so as to not be done
multiple times unnecessarily. From what you describe, it seems that
the work done in each iteration is accumulating; each iteration does a
bit more work than the previous. There may be some unnecessary
repetition as described above. But it also seems that there is
something that grows and gets dragged along with each iteration --
again possibly unnecessarily. This is analogous to a cumulative song,
such as "The Twelve Days of Christmas"; the 12th verse is much longer
than the first.
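As a hedged illustration of that kind of accumulation (a generic anti-pattern sketch, not taken from the posted code), consider a loop that drags a growing macro or an ever-larger appended dataset through each pass:

```stata
* Anti-pattern: state that accumulates across iterations
local results ""
forvalues i = 1/300 {
    * Each pass carries along everything accumulated so far, so
    * iteration 300 handles a far longer list than iteration 1 did
    local results "`results' result_`i'"
    display wordcount("`results'")
}
```

The cure is usually to write each pass's output somewhere fixed (a variable, a postfile, a saved dataset) instead of rebuilding a growing object inside the loop.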
On the other hand, does the true complexity of the task grow with each
iteration? Do you expect the 300th iteration to naturally be more
complex to perform than the first?
Show us your code if you want more help.
HTH
--David
--
Ly Tran
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
*