st: Re: better match-management algorithm
As much as I enjoy Mata programming, I don't think this case
obviously calls for Mata. I think it should be feasible to do this
without explicit loops at all, using something like the third stanza
of the enclosed (the first two just make up some transaction-time
data for stock and options quotes):
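
* Stanza 1: made-up stock quotes
clear
set obs 100
egen transtime = fill(1 2 4)
g stockprice = 10*uniform()+50
sort transtime
save stox, replace

* Stanza 2: made-up option quotes on a different time grid
clear
set obs 75
egen transtime = fill(1.25 3.75 5.2)
g optprice = 10*uniform()+45
sort transtime
save opts, replace

* Stanza 3: interleave the two series by time; for each option quote,
* take the nearest nonmissing stock price in the previous three rows
use stox
merge transtime using opts
sort transtime
g prevstock=cond(stockprice[_n-1]<.,stockprice[_n-1], ///
    cond(stockprice[_n-2]<.,stockprice[_n-2],stockprice[_n-3])) ///
    if optprice<.
l transtime optprice prevstock if optprice<.
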
I am not usually a fan of nested cond() calls, but in this case it
seems to work well. If you scale it up to 500,000 stock quotes and
375,000 options quotes, it does the job (without the list) in 4.88
seconds (Stata 10/MP2; it is using parallel computation heavily).
Whenever an explicit loop appears in the logic, one should think
twice or thrice about whether there is a better way. In Stata there
often is.
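
The nested cond() call above looks back at most three observations, so
an option quote preceded by more than three consecutive option quotes
would get a missing match. As a minimal sketch of one way around that,
still without an explicit loop, the last stock price can be carried
forward first. The toy values and the variable name laststock are
made up for illustration; prevstock here is a match to the most recent
stock quote strictly before the option quote, which is an assumption
about the intended rule.

```stata
* Toy data: one row per quote, already interleaved by time
clear
input transtime stockprice optprice
1    50  .
1.5  .   45.5
2    .   45.25
2.5  .   45.75
3    .   46
3.5  .   45
4    51  .
4.5  .   46.5
end
sort transtime

* replace works through the observations in order, so each filled-in
* value is available to the next row and the fill cascades
generate laststock = stockprice
replace laststock = laststock[_n-1] if missing(laststock)

* most recent stock quote strictly before each option quote
generate prevstock = laststock[_n-1] if !missing(optprice)
list transtime optprice prevstock if !missing(optprice)
```

Here five option quotes in a row follow the first stock quote, and all
five pick up its price, which a three-lag lookup would not manage for
the last two.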
Kit Baum, Boston College Economics and DIW Berlin
http://ideas.repec.org/e/pba1.html
An Introduction to Modern Econometrics Using Stata:
http://www.stata-press.com/books/imeus.html
On Aug 4, 2007, at 2:33 AM, Tobias wrote:
I encountered the following problem in a finance research project: two
tables, one with option prices, the other with (underlying) stock
prices. The task is to match the appropriate stock price to each
option price observation. My current solution works, but it is
inefficient: processing takes more than 4 hours.

My current solution is as follows (the numbers refer to the code
below):
1) Fetch the number of observations from the underlying table.
2) Fetch the number of observations from the option table.
3) Merge the underlying prices to the option prices (one-to-one merge).
4) Using two nested forvalues loops, I iterate over the option
   observations to find an appropriate underlying price, iterating
   over the underlying prices in the second forvalues loop. [The
   matching criteria are an identical ISIN, an identical
   trading_date, and that the trading time of the subsequent
   underlying is later than the option trading time, i.e. I am
   looking for the most recent underlying price.]
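
The matching criteria in step 4 can be sketched without loops by
stacking both tables and working within groups. This is only an
illustration under stated assumptions: the two tables have already been
combined into one dataset (for instance with append), the variable
names trading_time, stockprice, optprice, laststock and prevstock are
hypothetical, the ISIN values and prices are invented, and no stock
and option quote share the same time stamp.

```stata
* Toy stacked data: stock rows have optprice missing and vice versa
clear
input str3 isin trading_date trading_time stockprice optprice
"AAA" 1  900 20   .
"AAA" 1  905 .    2.5
"AAA" 1  910 .    2.75
"AAA" 2  900 .    3
"AAA" 2  915 21   .
"AAA" 2  920 .    3.25
"BBB" 1  901 70   .
"BBB" 1  902 .    5.5
end

* Within each ISIN and trading date, ordered by time, carry the last
* stock price forward; [_n-1] never reaches into another group
bysort isin trading_date (trading_time): generate laststock = stockprice
by isin trading_date (trading_time): ///
    replace laststock = laststock[_n-1] if missing(laststock)

* most recent underlying price before each option quote
by isin trading_date (trading_time): ///
    generate prevstock = laststock[_n-1] if !missing(optprice)
list isin trading_date trading_time optprice prevstock if !missing(optprice)
```

The option quote at 900 on date 2 stays unmatched because no stock
quote precedes it on that date; whether such cases should borrow the
previous day's last price is a separate decision.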

Before writing down my code, I have the following questions:
A) IS THERE A MORE EFFICIENT WAY TO CARRY OUT THE CONDITIONAL MATCHING
   WITHOUT HAVING TO ITERATE OVER EACH AND EVERY OBSERVATION?
B) IF NOT, IS IT POSSIBLE TO 'OUTSOURCE' THE TASK TO A MATA PROGRAM,
   SUCH THAT THE COMPILATION OF THE LOOP-CODE IS DONE ONCE INSTEAD OF
   A MILLION TIMES?

I thought about the Mata possibility when I read in a presentation by
Kit Baum:
"Your ado-files may perform some complicated tasks which involve many
invocations of the same commands. Stata's ado-file language is easy to
read and write, but it is interpreted: Stata must evaluate each
statement and translate it into machine code. The new Mata programming
language (help mata) creates compiled code which can run much faster
than ado-file code."
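
For question B, a loop written once as a Mata function might look like
the sketch below. It is a do-file with an embedded Mata section; the
function name carryforward() and the variable laststock are
hypothetical, the data are invented, and no timing claim is implied.

```stata
* Toy data with gaps to fill
clear
input stockprice
50
.
.
51
.
end

mata:
// Return a copy of x in which each missing element is replaced by
// the element before it (after that one has itself been filled)
real colvector carryforward(real colvector x)
{
    real scalar    i
    real colvector y

    y = x
    for (i = 2; i <= rows(y); i++) {
        if (y[i] >= .) y[i] = y[i-1]
    }
    return(y)
}
end

* Read the Stata variable into Mata, fill it, and store the result
generate double laststock = .
mata: st_store(., "laststock", carryforward(st_data(., "stockprice")))
list
```

The function is compiled when it is defined, so the loop body is not
re-interpreted on each pass. Whether that beats the loop-free Stata
approach for this matching problem would need to be timed.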