Hi Kit
Thanks for having a look at my problem. I hope you don't mind my taking
sections from your -rollreg- routine for use in -rollreg2-. I'm still in the
learning by imitation stage!
> You're creating a very large number of new variables stub_v, stub_se,
> stub_cons, ... and eventually you fill up memory doing so.
I'm not sure I follow this. I believe I am making a total of (k+1)*2 + 4 new
variables where k is the number of independent variables. Michael Blasnik
pointed out the problem caused by -tempvar- where I was creating a new
variable on each call to -predict-. But I don't see how I am making multiple
variables from stub_`v', etc.
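To make the count concrete, here is a minimal sketch of the arithmetic. The varlist is purely illustrative (not my actual data), and the local names n, nx and expected are made up for this example:

```stata
* Illustrative varlist: one dependent variable followed by two regressors
local varlist "price mpg weight"
local n: word count `varlist'
local nx = `n' - 1                  // k in the text: number of independent variables
* one coefficient and one standard error per regressor and for the constant,
* plus r2, RMSE, sd_residual and N
local expected = (`nx' + 1)*2 + 4
display "expected number of new variables: `expected'"
```

With two regressors this gives 10. As far as I can see, the count depends only on k, not on the number of groups or windows, because the variables are generated once before the loop starts. The debugging variables rc and numvars in the routine below are extra.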
> I have not seriously considered whether my -rollreg- needs the 'no
> gaps' limitation that is imposed by many Stata time series commands. I
> will try to look at that.
I can't speak for everyone, but for users of financial data such as prices,
volumes, and financial statement data, the lifting of such a restriction
would be very helpful. There are numerous cases where there exist 15+ years
of daily price data with only a handful of missing daily prices.
> But even if it was relaxed, my routine would
> run into the same problem as yours. Say that you have 1000 firms and 20
> years per firm; then it will try to create several thousand new
> variables, each with 20 obs.
I am clearly missing something here. Dropping -tempvar- seemed to fix at
least part of my problem. Again, I don't see where multiple copies of the
new variables are being created. To try to understand this problem, I
modified -rollreg2- to keep track of the number of variables in the dataset.
The number appears stable. Here is a copy of the routine (Sorry it is ugly
as I have spent all weekend debugging it). The key addition follows the call
to -predict-.
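In isolation, the check looks something like the sketch below. It assumes the auto dataset shipped with Stata rather than my data, and the local k0 is a name invented for the example. The point is only that a -tempvar- filled by -predict- and then dropped should leave r(k) from -describe- unchanged:

```stata
* Minimal sketch: create and drop a residual tempvar inside a loop,
* then compare the variable count before and after.
sysuse auto, clear
quietly describe
local k0 = r(k)                     // number of variables before the loop
forvalues i = 1/5 {
    tempvar res
    quietly regress price mpg weight
    quietly predict `res' if e(sample), residuals
    drop `res'                      // without this, one variable is left behind per pass
}
quietly describe
display "variables before: `k0'  after: " r(k)
```

If the -drop- line is removed, the count should grow by one on each pass, which is how I understand Michael's point about -tempvar-.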
Thanks again for your help.
capture program drop rollreg2
program rollreg2 , byable(onecall)
version 8.2
syntax varlist(min=2), MOVE(integer) STUB(string)
qui generate rc = .
qui generate numvars = .
local by "`_byvars'"
tempvar gr
if _by() == 1 {
    qui egen `gr' = group(`by')
}
else {
    qui gen `gr' = 1
}
qui levels `gr', local(groups)
local k: word count `varlist'
local depvar: word 1 of `varlist'
qui forvalues i = 2/`k' {
    local v: word `i' of `varlist'
    confirm new variable `stub'_`v'
    confirm new variable `stub'_se_`v'
    generate `stub'_`v' = .
    generate `stub'_se_`v' = .
    local reglist "`reglist' `v'"
}
qui {
    confirm new variable `stub'_cons
    confirm new variable `stub'_se_cons
    confirm new variable `stub'_r2
    confirm new variable `stub'_RMSE
    confirm new variable `stub'_sd_residual
    confirm new variable `stub'_N
    generate `stub'_cons = .
    generate `stub'_se_cons = .
    generate `stub'_r2 = .
    generate `stub'_RMSE = .
    generate `stub'_sd_residual = .
    generate `stub'_N = .
}
local max = 0
local min = 1
qui count
local total = r(N)
//quietly {
foreach g of local groups {
    qui count if `gr' == `g'
    local max = r(N) + `max'
    // fewer than move observations in this group: advance min so the window loop is empty
    if `=`max' - `move' + 1' < 0 | `=`max' - `min'' < `move' - 1 {
        local min = `max' + 1
    }
    forvalues i = `min'/`=`max' - `move' + 1' {
        local j = `i' + `move' - 1
        if `j' <= `total' {
            local gvk = gvkey[`i']
            display "gvkey = `gvk' i = `i' j = `j'"
            capture regress `depvar' `reglist' in `i'/`j'
            //regress `depvar' `reglist' in `i'/`j'
            qui replace rc = _rc in `j'
            // rerun uncaptured so any error other than r(2000), no observations, is shown
            if _rc != 2000 & _rc != 0 {
                regress `depvar' `reglist' in `i'/`j'
            }
            if _rc == 0 & e(N) == `move' {
                tempvar res
                qui predict `res' if e(sample), res
                qui summarize `res'
                // save r(sd) now: -describe- below overwrites r()
                local sd_res = r(sd)
                qui describe
                qui replace numvars = r(k) in `j'
                // assumed intent: store the SD of the residuals, not the residual in observation j
                qui replace `stub'_sd_residual = `sd_res' in `j'
                qui replace `stub'_r2 = e(r2_a) in `j'
                qui replace `stub'_RMSE = e(rmse) in `j'
                qui replace `stub'_N = e(N) in `j'
                qui replace `stub'_cons = _b[_cons] in `j'
                qui replace `stub'_se_cons = _se[_cons] in `j'
                forvalues l = 2/`k' {
                    local v: word `l' of `varlist'
                    qui replace `stub'_`v' = _b[`v'] in `j'
                    qui replace `stub'_se_`v' = _se[`v'] in `j'
                }
                drop `res'
            }
        }
    }
    local min = `max' + 1
}
end
