I haven't tried to understand this, but small improvements
should be possible by using
-count- whenever you just want a count
_consistently_ using -summarize, meanonly- when it suffices
byte variables whenever they suffice
Nick
[email protected]
Tobias Pfaff
we are having a problem concerning an algorithm for data management (Stata
9.2). Our current solution has a forvalues loop iterating over every single
observation. So far we don't see how we could do it without a loop. The
current processing time is tremendous and we would be grateful about any
hints. It might be a bit complicated, and you'd have to step a bit into the
logic of our program.
We have got one dta-file with tick trading data of investment products. To
each product we would now like to assign an option price whose tick trading
data is in another dta-file. The algorithm should do the following:
1. STRIKE CONDITION: Calculate the strike difference between the product and
the option. Look for a minimum of the strike difference. If there is only
one minimum, assign this option price to the product and stop. Else,
continue.
2. MATURITY CONDITION: Among the options with minima in strike difference,
calculate maturity difference between product and option. Look for a minimum
of the maturity difference. If there is only one minimum, assign this option
price to the product and stop. Else, continue.
3. TRADING TIME CONDITION: Among the options with minima in maturity
difference (and consequently in strike difference), calculate trading time
difference between product and option. Look for the minimum of the trading
time difference. Assign this option price to the product and stop.
=> For all three conditions, consider only options that have been traded
less than one hour ago.
Our solution does the following:
A) Merge option prices to products over ISIN and trading time.
B) Sort chronologically.
C) Do a forvalues loop over all observations. For each product, save ISIN,
trading time, maturity, and strike in locals and continue with the
algorithm.
HOW COULD WE AVOID THE LOOP OVER ALL OBSERVATIONS??
// A)
sort product_underlying_isin product_trading_date product_trading_time
merge product_underlying_isin product_trading_date product_trading_time ///
using temp_options.dta
// B) Sorting is important!
sort product_underlying_isin product_trading_date product_trading_time
// C) Choose matching options with criterion cascade
gen double product_matched_implied_vola = .
local obs = _N
forvalues num = 1/`obs' {
if (_merge[`num'] == 1) {
// Save values for current product
local isin = product_underlying_isin[`num']
local product_time = temp_trading_time[`num']
local product_cap = product_cap[`num']
local product_maturity = product_maturity[`num']
// Calculate absolute strike difference
quietly replace temp_strikediff = abs(option_exercise_price
///
- `product_cap') if (_merge == 2 & product_underlying_isin
///
== "`isin'" & _n < `num' & abs(temp_trading_time - ///
`product_time') < 3600000)
// Break execution of loop if there is no previous option
// for a product
quietly sum temp_strikediff
if (r(N) == 0) continue
// CRITERION 1: Look for minimum of strike difference
// and flag resp. observations
quietly sum temp_strikediff
local min = r(min)
quietly gen temp_flag_min_strikediff = 1 if (temp_strikediff
== `min')
// If only one option observation is left break execution of
// current loop iteration and assign option vola
quietly sum temp_flag_min_strikediff
if (r(N) == 1) {
quietly sum option_implied_volatility if
temp_flag_min_strikediff ///
== 1, meanonly
quietly replace product_matched_implied_vola =
r(mean) in `num'
drop temp_flag*
continue
}
// Calculate absolute maturity difference where criterion 1
is met
quietly replace temp_maturitydiff = abs(option_maturity -
///
`product_maturity') if (_merge == 2 &
product_underlying_isin ///
== "`isin'" & _n < `num' & abs(temp_trading_time - ///
`product_time') < 3600000 & temp_flag_min_strikediff == 1)
// CRITERION 2: Look for minimum of maturity difference
// and flag resp. observations (fulfilling also crit. 1)
quietly sum temp_maturitydiff
local min = r(min)
quietly gen temp_flag_min_maturitydiff = 1 if
(temp_flag_min_strikediff ///
== 1 & temp_maturitydiff == `min')
// If only one option observation is left break execution of
current
// loop iteration and assign option vola
quietly sum temp_flag_min_maturitydiff#
if (r(N) == 1) {
quietly sum option_implied_volatility if
temp_flag_min_maturitydiff ///
== 1, meanonly
quietly replace product_matched_implied_vola =
r(mean) in `num'
drop temp_flag*
continue
}
// Calculate absolute time difference where criterion 2 is
met
quietly replace temp_timediff = abs(temp_trading_time - ///
`product_time') if (_merge == 2 & product_underlying_isin ==
///
"`isin'" & _n < `num' & abs(temp_trading_time -
`product_time') < 3600000 & temp_flag_min_maturitydiff == 1)
// CRITERION 3: Look for minimum of time difference and
// flag resp. observations (fulfilling also crit. 1 and 2)
quietly sum temp_timediff
local min = r(min)
quietly sum option_implied_volatility if temp_timediff ==
`min', meanonly
quietly replace product_matched_implied_vola = r(mean) in
`num'
drop temp_flag*
}
}
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/