Markov chain Monte Carlo (MCMC) is used for Bayesian inference. Has the MCMC converged? Has it fully explored the target posterior distribution? Or do you need more simulations? You need to answer these questions for proper inference. New option nchains() can help you with that.
Type
. bayes, nchains(4): regress y x1 x2
and four chains will be produced, in this case for Bayesian regression of y on x1 and x2. The chains will be compared and the maximum Gelman–Rubin convergence diagnostic reported. And the chains will be combined to produce a more accurate final result. The intent is that you check the reported diagnostic before interpreting the results.
nchains() may be used with the bayes: or bayesmh command.
For convergence diagnostic purposes, it is important that different chains have dispersed initial values for the model parameters. Default initial values are provided, but they may not be sufficient for all cases. You can use new options initall() and init#() to specify your own initial values for all chains or for specific chains. See Convergence diagnostics using multiple chains for details.
All existing Bayesian postestimation commands such as bayesstats, bayestest, and bayesgraph provide support for multiple chains via the new chains() and sepchains options.
Also see the new bayesstats grubin command, which computes Gelman–Rubin convergence diagnostics for all model parameters, and How to run multiple chains in parallel?
Consider the diabetes data (Efron et al. 2004) on 442 diabetes patients that record a measure of disease progression as an outcome y. To demonstrate multiple chains, we consider a small subset of variables of interest: age, sex, body mass index, and mean arterial pressure.
Multiple chains serve multiple purposes. More commonly, they are used to explore MCMC convergence. But they are also useful for obtaining more precise estimates of parameters. And you can run them in parallel to save time.
Let's fit a Bayesian linear regression and produce three multiple chains.
. bayes, nchains(3) rseed(16): regress y age sex bmi map Chain 1 Burn-in ... Simulation ... Chain 2 Burn-in ... Simulation ... Chain 3 Burn-in ... Simulation ... Model summary
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
y | ||
age | 64.63216 53.6199 1.48006 64.57528 -40.40117 170.1121 | |
sex | -56.67161 52.83794 1.25481 -55.63422 -158.4018 45.66125 | |
bmi | 591.5728 55.3663 1.49183 592.1313 483.4726 699.7989 | |
map | 345.2107 56.15044 1.45502 344.3014 233.006 455.7745 | |
_cons | 152.063 2.876962 .074766 152.1028 146.3381 157.6296 | |
sigma2 | 3722.503 259.0622 3.46912 3709.148 3245.06 4254.987 | |
We demonstrate the use of multiple chains to explore MCMC convergence in detail in Gelman–Rubin convergence diagnostic. Here we will just point out that the maximum Gelman–Rubin diagnostic, reported as Max Gelman-Rubin Rc in the header, is 1.002, which is less than 1.1 and thus does not indicate convergence problems for our model.
We can also explore convergence visually by looking at a trace plot, for instance, for the bmi coefficient.
. bayesgraph trace {y:bmi}
With multiple chains, bayesgraph automatically plots multiple chains on one graph. The trace plot does not indicate any convergence problems. All chains look similar and oscillate around similar point estimates.
We can explore more diagnostics.
. bayesgraph diagnostics {y:bmi}
The autocorrelation and distribution plots look similar for all chains. The autocorrelation dies off fairly quickly in all three chains.
In this demonstration, we are satisfied with our visual exploration of MCMC convergence. In actual analysis, we would want to perform the above steps for all model parameters and explore the convergence more formally.
After verifying MCMC convergence, as in Using multiple chains to explore convergence visually and Using multiple chains to explore convergence formally, we can perform inference.
In the presence of multiple chains, posterior summaries are computed by using all chains, three in our example:
. bayesstats summary Posterior summary statistics Number of chains = 3 MCMC sample size = 30,000
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
y | ||
age | 64.63216 53.6199 1.48006 64.57528 -40.40117 170.1121 | |
sex | -56.67161 52.83794 1.25481 -55.63422 -158.4018 45.66125 | |
bmi | 591.5728 55.3663 1.49183 592.1313 483.4726 699.7989 | |
map | 345.2107 56.15044 1.45502 344.3014 233.006 455.7745 | |
_cons | 152.063 2.876962 .074766 152.1028 146.3381 157.6296 | |
sigma2 | 3722.503 259.0622 3.46912 3709.148 3245.06 4254.987 | |
Using all chains provides more precise estimates of model parameters. For instance, we can compare the results above with the results from only the first chain:
. bayesstats summary, chains(1) Posterior summary statistics Number of chains = 1 Chain 1 MCMC sample size = 10,000
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
y | ||
age | 65.38379 54.08876 2.18726 65.78475 -42.80366 169.0617 | |
sex | -57.99493 54.04443 2.46374 -57.67259 -158.3685 47.19167 | |
bmi | 590.8162 55.57828 2.90659 590.5812 483.4726 700.9136 | |
map | 346.9589 56.02036 2.80642 344.5968 238.6808 460.3592 | |
_cons | 152.1605 2.880091 .136456 152.2867 146.5691 157.6759 | |
sigma2 | 3726.441 262.7137 6.34486 3720.178 3249.068 4262.679 | |
The posterior mean and standard deviations are similar, but the MCSEs are lower for the results using all three chains because there are more MCMC estimates and they are less correlated.
We can even compare the results from all chains.
. bayesstats summary, sepchains Posterior summary statistics Chain 1 MCMC sample size = 10,000
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
y | ||
age | 65.38379 54.08876 2.18726 65.78475 -42.80366 169.0617 | |
sex | -57.99493 54.04443 2.46374 -57.67259 -158.3685 47.19167 | |
bmi | 590.8162 55.57828 2.90659 590.5812 483.4726 700.9136 | |
map | 346.9589 56.02036 2.80642 344.5968 238.6808 460.3592 | |
_cons | 152.1605 2.880091 .136456 152.2867 146.5691 157.6759 | |
sigma2 | 3726.441 262.7137 6.34486 3720.178 3249.068 4262.679 | |
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
y | ||
age | 61.57452 52.64732 4.16571 60.93262 -41.98606 165.0094 | |
sex | -56.0638 53.11474 1.99601 -53.1213 -155.3794 45.02055 | |
bmi | 592.0209 55.81173 2.41218 591.3277 482.1642 699.0798 | |
map | 345.6178 57.0509 2.537 344.935 235.805 460.1346 | |
_cons | 152.1002 2.82684 .13415 152.0675 146.5058 157.6649 | |
sigma2 | 3720.897 257.873 5.96029 3706.499 3240.245 4253.801 | |
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
y | ||
age | 66.93817 53.90694 2.31714 67.29965 -35.94106 174.1906 | |
sex | -55.95611 51.28798 2.12267 -54.78232 -159.7941 45.54302 | |
bmi | 591.8815 54.69921 2.50606 593.381 484.8148 698.1792 | |
map | 343.0555 55.26911 2.28489 343.8618 224.2158 449.4169 | |
_cons | 151.9282 2.916109 .119936 151.953 146.1003 157.5059 | |
sigma2 | 3720.172 256.5298 5.75014 3703.189 3244.123 4247.216 | |
Multiple chains may be particularly useful for parameters with low sampling efficiencies.
Just like bayesstats summary, all other postestimation commands recognize the presence of multiple chains. For instance, suppose we would like to test whether the coefficient for bmi is larger than 600. We can use bayestest interval to compute the posterior probability that {y:bmi}>600.
. bayestest interval {y:bmi}, lower(600) Interval tests Number of chains = 3 MCMC sample size = 30,000 prob1 : {y:bmi} > 600
Mean Std. Dev. MCSE | ||
prob1 | .4436 0.49684 .0124317 | |
The estimated posterior probability is about 44%.
Efron, B., T. Hastie, I. Johnstone, and R. Tibshirani. 2004. Least angle regression. The Annals of Statistics 32: 407–499.
See New in Bayesian analysis for other new features in Bayesian analysis and, particularly, Gelman–Rubin convergence diagnostic. Also see all Bayesian features.
Learn about Bayesian lasso modeling.
Read more about Multiple chains and Bayesian analysis in the Stata Bayesian Analysis Reference Manual.