From: Wesley Burnett <burnettwesley@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: Panel data, statsby, dynamic forecasting
Date: Wed, 11 Sep 2013 13:48:13 -0400

I am trying to run a series of arima regressions, one per region, on a panel data set in long form. After each regression I want dynamic predictions so that I can iteratively forecast out of sample. In effect, the algorithm runs a set of individual time-series regressions and predictions.

I have written a "foreach" loop with the "statsby" command to run the arima regressions across regions. The loop works in the sense that it runs the separate regressions and then performs the predictions, but the predictions are not calculated out of sample; only the within-sample predictions are produced. If I estimate the arima regression and prediction for a single region (i.e., omitting the rest of the panel), the code works fine and the full set of out-of-sample predictions is calculated.

I think the problem may be related to how the time-series setting is declared in Stata. With a single region, the "tsset year" command is invoked, where year denotes my annual time observations. With the "foreach" and "statsby" algorithm on the panel data set, the "tsset regions year" command must be invoked; otherwise, "tsset year" on the panel produces the error "repeated time values in sample."

I would appreciate any feedback on how to fix this problem. The code is below. The dynamic prediction is given a value of 2012 to tell Stata when to start the out-of-sample predictions. The term "lprod" designates a variable in my data set: the log of production, observed over time.

*Invoke the time series command for region (fips) and time observation (year)
tsset fips year

*Add 24 years of annual time observations to the end of each region for the out-of-sample predictions
tsappend, add(24)

*foreach algorithm
levelsof fips, local(fipsid)
foreach c of local fipsid {
    local rr: label region `c'
    quietly arima lprod if fips==`c', arima(1,1,0)
    quietly predict p_prod_`c' if fips==`c', dyn(2012) y
    quietly predict fev_`c' if fips==`c', mse
    g upper_`c' = p_prod_`c' + 1.96*sqrt(fev_`c')
    g lower_`c' = p_prod_`c' - 1.96*sqrt(fev_`c')
}
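
Since the post reports that the single-region version (with "tsset year") does produce the out-of-sample predictions, one possible workaround is to automate that single-region case inside the loop. The following is an untested sketch under that assumption, not a diagnosis of why the panel version stops in sample. It assumes the original data are in memory before any tsappend has been run, and that fips is numeric. The names results, first, p_prod, fev, upper and lower are illustrative choices, not taken from the post.

```stata
* Sketch only: treat each region as its own single time series.
* Assumes variables fips, year and lprod as described in the post.
levelsof fips, local(fipsid)
tempfile results
local first = 1

foreach c of local fipsid {
    preserve

    * Keep one region so that -tsset year- is valid (no repeated years)
    keep if fips == `c'
    tsset year

    * Add 24 future years for this region; new rows have missing fips
    tsappend, add(24)
    replace fips = `c' if missing(fips)

    * Same model and predictions as in the post, without the if-qualifier
    quietly arima lprod, arima(1,1,0)
    quietly predict p_prod, dyn(2012) y
    quietly predict fev, mse
    generate upper = p_prod + 1.96*sqrt(fev)
    generate lower = p_prod - 1.96*sqrt(fev)

    * Stack this region's results onto those already saved
    if `first' == 0 {
        append using `results'
    }
    save `results', replace
    local first = 0

    restore
}

* Load the stacked results and declare the panel structure again
use `results', clear
sort fips year
tsset fips year
```

Because the results are stacked in long form, this sketch stores one set of prediction variables for all regions, rather than a separate p_prod_`c' variable per region as in the original loop.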