This page announced the new features in Stata 17. Please see our Stata 18 page for the new features in Stata 18.

Bayesian dynamic forecasting

Highlights

  • Graph dynamic forecasts
  • Save dynamic forecasts in current dataset
  • Specify prediction horizon
  • Compute posterior means or medians of forecasts
  • Compute posterior standard deviations
  • Compute posterior credible intervals

Dynamic forecasting is a common prediction tool after fitting multivariate time-series models, such as vector autoregressive (VAR) models. It predicts the outcome values at each future time point by using the predicted, rather than observed, outcome values from earlier time points. The new bayesfcast command computes Bayesian dynamic forecasts after you fit a Bayesian VAR model with the bayes: var command.
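The recursion behind dynamic forecasting can be written out for a VAR with p lags. This is the standard textbook form, offered as an illustration rather than as a statement of how bayesfcast is implemented; the symbols c, A_i, Sigma, T, and h are notation assumed here and do not appear elsewhere on this page.

```latex
% VAR(p) model for the K x 1 outcome vector y_t
y_t = c + \sum_{i=1}^{p} A_i \, y_{t-i} + u_t, \qquad u_t \sim N(0, \Sigma)

% h-step dynamic forecast made from the last observed time T
\hat{y}_{T+h} = c + \sum_{i=1}^{p} A_i \, \tilde{y}_{T+h-i},
\qquad
\tilde{y}_{s} =
\begin{cases}
  y_s        & \text{if } s \le T \text{ (observed)} \\
  \hat{y}_s  & \text{if } s > T \text{ (previously forecast)}
\end{cases}
```

In a Bayesian analysis, a recursion of this kind would be evaluated once per MCMC draw of the parameters, which gives a sample of forecasts at each horizon instead of a single value.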

Bayesian dynamic forecasts produce an entire sample of predicted outcome values at each time point instead of a single prediction as in classical analysis. This sample can be used to answer various modeling questions, such as how well the model predicts future observations without making the asymptotic normality assumption when estimating forecast uncertainty. This is particularly appealing for small datasets for which the asymptotic normality assumption may be suspect.

You can use bayesfcast compute to compute dynamic forecasts and save them in the current dataset, and you can graph them by using bayesfcast graph.


Let's see it work

We continue with the U.S. macroeconomic data used in Bayesian VAR models. These are quarterly data from the third quarter of 1954 to the fourth quarter of 2010. We would like to obtain Bayesian dynamic forecasts of inflation, the output gap, and the federal funds rate from a Bayesian VAR model for these variables.

Here is what the data look like.

. webuse usmacro
(Federal Reserve Economic Data - St. Louis Fed)

. tsset

Time variable: date, 1954q3 to 2010q4
        Delta: 1 quarter

. tsline inflation ogap fedfunds

We first fit a Bayesian VAR model with four lags. To check our forecast performance, we hold out the observations from the first quarter of 2004 onward. We also save the MCMC results in bvarsim.dta.

The output from the bayes: var command is long, so we omit some of it below.

. bayes, rseed(17) saving(bvarsim): var inflation ogap fedfunds 
> if date < tq(2004q1), lags(1/4)

Burn-in ...
Simulation ...
(output omitted)

Bayesian vector autoregression                       MCMC iterations  =     12,500
Gibbs sampling                                       Burn-in          =      2,500
                                                     MCMC sample size =     10,000
Sample: 1956q3 thru 2003q4                           Number of obs    =        190
                                                     Acceptance rate  =          1
                                                     Efficiency:  min =      .9322
                                                                  avg =       .993
Log marginal-likelihood = -670.32584                              max =          1

--------------------------------------------------------------------------------
             |                                                  Equal-tailed
             |       Mean  Std. dev.      MCSE     Median   [95% cred. interval]
-------------+------------------------------------------------------------------
inflation    |
   inflation |
         L1. |   1.107465   .0422849   .000423   1.106848     1.02544   1.192476
         L2. |   -.064825   .0417594   .000418   -.064536   -.1470882   .0176208
         L3. |  -.0358872   .0290815   .000291  -.0359867    -.092745   .0210088
         L4. |  -.0397985   .0215853   .000216  -.0397207   -.0821996    .002274
             |
        ogap |
         L1. |   .0646785   .0294384   .000305   .0644936    .0070243   .1229662
         L2. |   .0071294   .0267595   .000268   .0072498   -.0444994    .058461
         L3. |   -.002015   .0187035   .000192  -.0021291    -.038934   .0346847
         L4. |  -.0088532   .0142951   .000141  -.0089083   -.0366927   .0193774
             |
    fedfunds |
         L1. |   .0770026    .027543   .000275    .076643    .0237776   .1315991
         L2. |  -.0351476   .0243814   .000244  -.0351349   -.0831241   .0124089
         L3. |  -.0151671   .0173423   .000173  -.0154901   -.0487873   .0193082
         L4. |  -.0190271   .0134133   .000134  -.0191324   -.0456025   .0072003
             |
       _cons |   .1225433   .0832813   .000833   .1225758   -.0433392   .2853939
-------------+------------------------------------------------------------------
ogap         |
   inflation |
         L1. |   -.068909   .0627925   .000628  -.0683572   -.1934463   .0524915
  (output omitted)
       _cons |   .3851112   .1261445   .001261   .3836084    .1334414   .6333448
-------------+------------------------------------------------------------------
fedfunds     |
   inflation |
         L1. |   .0568126   .0719825    .00072   .0563617   -.0829406   .2008528
  (output omitted)
       _cons |   .1931161   .1433129   .001433   .1950842   -.0912408   .4717853
-------------+------------------------------------------------------------------
   Sigma_1_1 |   .2873009   .0293728   .000297   .2853721    .2349716   .3493519
   Sigma_2_1 |   .0281781   .0315254   .000315   .0276486   -.0345647   .0912571
   Sigma_3_1 |   .1480748   .0372496   .000372   .1468518    .0777631   .2251876
   Sigma_2_2 |   .6575456   .0671182   .000684   .6530136    .5395734   .8029292
   Sigma_3_2 |   .2398338   .0559347   .000559    .238127    .1357633   .3561841
   Sigma_3_3 |   .8371554   .0857785   .000858   .8298623    .6868505   1.024522
--------------------------------------------------------------------------------
file bvarsim.dta saved.

See Bayesian VAR model for details about bayes: var and its output.

We use the new bayesfcast command to compute Bayesian dynamic forecasts. This command has two subcommands: bayesfcast compute computes the forecasts and saves them in the current dataset as new variables, and bayesfcast graph plots them.

Let's start with the simplest specification. We specify f_ as the prefix for the new variables and a random-number seed for reproducibility.

. bayesfcast compute f_, rseed(17)

. describe f_*

Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
f_inflation     double  %10.0g                Posterior mean forecast for
                                                inflation
f_inflation_sd  double  %10.0g                Posterior standard deviation of
                                                forecast for inflation
f_inflation_lb  double  %10.0g                95% lower credible bound for
                                                forecast for inflation
f_inflation_ub  double  %10.0g                95% upper credible bound for
                                                forecast for inflation
f_ogap          double  %10.0g                Posterior mean forecast for ogap
f_ogap_sd       double  %10.0g                Posterior standard deviation of
                                                forecast for ogap
f_ogap_lb       double  %10.0g                95% lower credible bound for
                                                forecast for ogap
f_ogap_ub       double  %10.0g                95% upper credible bound for
                                                forecast for ogap
f_fedfunds      double  %10.0g                Posterior mean forecast for
                                                fedfunds
f_fedfunds_sd   double  %10.0g                Posterior standard deviation of
                                                forecast for fedfunds
f_fedfunds_lb   double  %10.0g                95% lower credible bound for
                                                forecast for fedfunds
f_fedfunds_ub   double  %10.0g                95% upper credible bound for
                                                forecast for fedfunds

bayesfcast compute creates new variables containing various summary statistics for the forecasts. Unlike in classical analysis, a Bayesian forecast at a specific time corresponds not to one value but to a sample of MCMC values. These values are then summarized (as means or medians) to provide a single statistic. By default, the command computes posterior means, posterior standard deviations, and 95% equal-tailed credible intervals of the forecast for each outcome variable.
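To look at these summaries directly, one option is simply to list the new variables for the forecast period. The sketch below is written in do-file style (no dot prompts) and assumes the f_ variables created above are still in memory; it uses only the variable names shown in the describe output.

```stata
* Show the posterior mean, standard deviation, and 95% credible bounds
* of the inflation forecast for every period that has a forecast
list date f_inflation f_inflation_sd f_inflation_lb f_inflation_ub ///
    if !missing(f_inflation)
```

The same pattern should work for f_ogap and f_fedfunds by swapping the variable stub.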

Let's plot the default forecasts and their 95% equal-tailed credible bands.

. bayesfcast graph f_inflation f_ogap f_fedfunds

By default, we obtain one-period forecasts. We later show how to create forecasts with more periods.

Instead of posterior mean forecasts and equal-tailed credible intervals, we can compute posterior median forecasts and highest posterior density (HPD) credible intervals. We compute and plot them below.

. bayesfcast compute f_, rseed(17) median hpd replace
. bayesfcast graph f_inflation f_ogap f_fedfunds
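One way to see the difference between the two interval types is to save both under separate prefixes and compare their widths. This is only a sketch: fe_ and fh_ are hypothetical prefixes, w_et and w_hpd are hypothetical variable names, and the code assumes that the bound variables keep the _lb and _ub suffixes when hpd is specified. It uses only the rseed() and hpd options shown above and must be run before bvarsim.dta is erased.

```stata
* Equal-tailed intervals (the default), saved under the hypothetical prefix fe_
bayesfcast compute fe_, rseed(17)

* HPD intervals, saved under the hypothetical prefix fh_
bayesfcast compute fh_, rseed(17) hpd

* Interval widths for the inflation forecast under each definition
generate double w_et  = fe_inflation_ub - fe_inflation_lb
generate double w_hpd = fh_inflation_ub - fh_inflation_lb

list date w_et w_hpd if !missing(w_et, w_hpd)
```

Because an HPD interval is by definition the narrowest interval with the stated posterior probability, w_hpd would be expected to be no larger than w_et; the two should be close when the forecast distribution is roughly symmetric.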

If we want to explore trends in forecasts, we need more time periods. For instance, below we specify 28 periods in the step() option of bayesfcast compute.

. bayesfcast compute f_, step(28) rseed(17) replace
. bayesfcast graph f_inflation f_ogap f_fedfunds, observed

With the observed option, the graph also shows the observed values, so we can compare them with our forecasts. The forecasts appear to predict the outcomes well up until 2007. After that, they perform poorly. For instance, the 95% credible intervals for the output gap do not contain the observed values for part of the time horizon.
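The visual comparison can be backed up with a quick numerical check of how often the held-out output gap falls inside its credible interval. This is a minimal sketch in do-file style, assuming the 28-step f_ variables from above are in memory; ogap_covered and ogap_err are hypothetical variable names introduced here.

```stata
* 1 if the observed output gap lies inside the 95% credible interval, 0 otherwise
generate byte ogap_covered = inrange(ogap, f_ogap_lb, f_ogap_ub) ///
    if !missing(ogap, f_ogap_lb, f_ogap_ub)

* Share of forecast periods covered by the interval
tabulate ogap_covered

* Quarters in which the observed value falls outside the interval
list date ogap f_ogap_lb f_ogap_ub if ogap_covered == 0

* Forecast error relative to the posterior mean forecast
generate double ogap_err = ogap - f_ogap if !missing(ogap, f_ogap)
summarize ogap_err
```

The quarters listed by the third command would be the ones where the bands in the graph miss the observed series.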

After the analysis, we remove the MCMC simulation file saved by bayes: var.

. erase bvarsim.dta


Additional resources

Learn more in the Stata Bayesian Analysis Reference Manual.

See Bayesian VAR models.

Learn more about new Bayesian econometrics features.