Establishing convergence of Markov chain Monte Carlo (MCMC) is one of the most important steps of Bayesian analysis. Multiple chains are often used to check MCMC convergence. The Gelman–Rubin convergence diagnostic provides a numerical convergence summary based on multiple chains. Stata's new command bayesstats grubin computes this diagnostic for all model parameters. Also, when you simulate multiple chains using bayes: or bayesmh, these commands automatically report the maximum of the convergence diagnostics across all model parameters.
After using bayes, nchains(): or bayesmh, nchains() to simulate multiple chains, look at the maximum Gelman–Rubin diagnostic reported by the commands in the header. When it suggests nonconvergence, you can look at the individual diagnostics to discover which parameters have convergence problems. Type
. bayesstats grubin
Sometimes nonconvergence arises because you did not run enough simulations. Other times, it arises because of inherent problems with the model's specification. If you run more simulations without convergence and the same parameters keep being identified as problematic, you need to think about respecifying your model.
Consider the diabetes data Efron et al. (2004) on 442 diabetes patients that record a measure of disease progression as an outcome y and 10 baseline covariates: age, sex, body mass index, mean arterial pressure, and six blood serum measurements. We consider the version of these data in which the covariates are standardized. This dataset is commonly used to demonstrate various model-building techniques, including Bayesian lasso. Here we use it to demonstrate the bayesstats grubin command.
We use bayes: regress to fit the Bayesian linear regression of y on all variables in the dataset and use option nchains(3) to generate three chains.
. bayes, nchains(3) rseed(16): regress y age sex bmi map tc ldl hdl tch ltg glu Chain 1 Burn-in ... Simulation ... Chain 2 Burn-in ... Simulation ... Chain 3 Burn-in ... Simulation ... Model summary
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
y | ||
age | 81.50951 87.57016 4.06945 71.7336 -50.84804 234.4335 | |
sex | -167.0296 71.9136 6.76093 -167.887 -289.098 -42.73281 | |
bmi | 352.3836 99.94654 6.13795 360.123 193.378 505.5432 | |
map | 286.0075 78.16619 5.23075 275.4511 174.2242 426.8137 | |
tc | -273.1433 164.7453 14.618 -295.8386 -501.8144 -16.60783 | |
ld1 | 165.232 178.513 8.84052 231.598 -125.8265 352.7025 | |
hd1 | -94.36373 114.5865 7.16308 -106.0647 -263.5826 99.65676 | |
tch | 109.925 162.0595 13.0143 134.8212 -155.8764 332.5971 | |
ltg | 483.243 102.1253 6.00414 495.4546 307.5461 627.0424 | |
glu | 42.76467 117.208 6.29098 65.24344 -146.8923 198.615 | |
_cons | 152.4957 2.780968 .103842 152.4606 147.3315 157.9954 | |
sigma2 | 3110.56 221.9696 3.79219 3100.067 2707.435 3566.951 | |
The maximum Gelman–Rubin diagnostic across all model parameters is labeled as Max Gelman–Rubin Rc in the header and is 4.543. Gelman and Rubin (1992) and Brooks and Gelman (1998) suggest that diagnostic Rc values greater than 1.2 for any of the model parameters should indicate nonconvergence. In practice, a more stringent rule of Rc < 1.1 is often used to declare convergence.
In our example, the maximum Rc of 4.543 is larger than 1.1, so our MCMC did not converge. The presence of the note about high autocorrelation is also concerning, although it may not necessarily indicate nonconvergence.
We can explore Gelman–Rubin diagnostics for all parameters to identify the problematic ones. We use bayesstats grubin for that. We additionally sort the results from largest to smallest to find the most problematic parameters.
. bayesstats grubin, sort Gelman-Rubin convergence diagnostic Number of chains = 3 MCMC size, per chain = 10,000 Max Gelman-Rubin Rc = 4.542823
Rc | ||||
y | ||||
ldl | 4.542823 | |||
tch | 3.376646 | |||
tc | 3.184339 | |||
glu | 3.089546 | |||
bmi | 2.543104 | |||
hdl | 2.447089 | |||
map | 2.421061 | |||
age | 2.40928 | |||
ltg | 2.389624 | |||
sex | 1.680013 | |||
_cons | 1.082658 | |||
sigma2 | 1.023543 | |||
The coefficient for ldl has the maximum Rc value. All parameters except the last two exceed 1.1. Our model certainly has convergence issues.
Let's look at graphical diagnostics for the ldl coefficient.
. bayesgraph diagnostics {y:ldl}
There is a clear separation in the trace plots of the chains, and the distributions of all three chains are different.
So what do we do now? The answer is model- and data-specific. Sometimes we may simply need to increase our burn-in period, option burnin(). Other times, especially in the presence of highly correlated model parameters, we may need to improve efficiency of our sampling procedure by blocking parameters or using more efficient samplers; see Specifying MCMC sampling procedure. When the data provide little information about some of the model parameters, we may need to use more informative priors or entirely modify our model. See Convergence of MCMC for other suggestions.
We provide the solution for our model in the next section.
In the above Example of nonconvergence, we fit a Bayesian linear regression to the diabetes data and encountered problems with MCMC convergence using our default model specification.
From the bayes: regress output above, there appears to be a large variation in the magnitudes of the estimated coefficients. We may try placing parameters with similar magnitudes in separate blocks to improve simulation efficiency.
Also, many variables in the diabetes dataset are correlated, and some are highly correlated. This induces high correlations between some of the model parameters. (For instance, you can use bayesgraph matrix to explore pairwise correlations between model parameters.) The default adaptive Metropolis–Hastings used by bayes: and bayesmh may suffer from low efficiency in the presence of many highly correlated model parameters. In our example, the default priors used by bayes: regress—normal for coefficients and inverse gamma for the variance—allow us to use a more efficient Gibbs sampling.
. bayes, nchains(3) rseed(16) gibbs: regress y age sex bmi map tc ldl hdl tch ltg glu Chain 1 Burn-in ... Simulation ... Chain 2 Burn-in ... Simulation ... Chain 3 Burn-in ... Simulation ... Model summary
Equal-tailed | ||
Mean Std. Dev. MCSE Median [95% Cred. Interval] | ||
y | ||
age | 13.27542 51.20421 .295628 13.62079 -86.6546 112.654 | |
sex | -163.025 51.94371 .299897 -163.0511 -265.2759 -62.49451 | |
bmi | 428.655 54.8511 .325351 428.6502 321.9517 536.9177 | |
map | 269.312 53.95068 .311484 269.9104 162.9158 375.273 | |
tc | -33.02954 75.55542 .43803 -32.85623 -182.0179 113.7366 | |
ld1 | -72.58636 71.89936 .415111 -72.42117 -213.1838 68.32667 | |
hd1 | -185.885 65.39935 .377583 -185.9721 -313.8126 -57.03217 | |
tch | 121.3566 74.05157 .427537 121.5397 -24.13796 267.8116 | |
ltg | 370.8014 61.66557 .357777 371.1397 248.9354 492.0922 | |
glu | 103.5797 54.67812 .317191 103.3001 -3.523386 211.1194 | |
_cons | 152.0611 2.620844 .015131 152.0733 146.9208 157.1492 | |
sigma2 | 3019.896 210.3033 1.27697 3010.666 2634.993 3462.634 | |
The maximum Gelman–Rubin diagnostic is now 1 (or very close to one) and is less than 1.1, which indicates that no convergence issues are detected.
We can look at the graphical diagnostics for the ldl coefficient again.
. bayesgraph diagnostics {y:ldl}
There is a perfect overlap between the trace plots and the distributions of the three chains. The autocorrelation is essentially nonexistent, as is expected with Gibbs sampling, in all three chains.
Brooks, S. P., and A. Gelman. 1998. General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics 7: 434–455.
Efron, B., T. Hastie, I. Johnstone, and R. Tibshirani. 2004. Least angle regression. The Annals of Statistics 32: 407–499.
Gelman, A., and D. B. Rubin. 1992. Inference from iterative simulation using multiple sequences. Statistical Science 7: 457–472.
Learn more about multiple chains, Bayesian predictions, and Bayesian lasso.
Learn more about Stata's Bayesian analysis features.
Read more about Gelman–Rubin convergence diagnostic and Bayesian analysis in the Stata Bayesian Analysis Reference Manual.