Highlights
Asymmetric Laplace distribution (ALD) to model skewed outcomes
Flexible prior specifications
Bayesian quantile estimation via ALD
Simultaneous quantile estimation by using multiple equations
Include random effects at multiple levels of hierarchy
Model quantiles using nonlinear functions of predictors
Full support of Bayesian postestimation features
See more Bayesian analysis features
The bayesmh command now includes an asymmetric Laplace distribution (ALD) as a new likelihood function. You can use ALD to model nonnormal outcomes with pronounced skewness and kurtosis. You can also use it to fit Bayesian quantile regression models (Yu and Moyeed 2001). For a Bayesian univariate quantile regression, see the new bayes: qreg command. With bayesmh, you can fit Bayesian simultaneous, multilevel, and nonlinear quantile regression models. These features are part of StataNow™.
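To see why ALD is suited to quantile estimation, recall its density (our summary of the Yu and Moyeed [2001] parameterization, not part of the original announcement):

    f(y \mid \mu, \sigma, \tau) = \frac{\tau(1-\tau)}{\sigma}\exp\left\{-\rho_\tau\!\left(\frac{y-\mu}{\sigma}\right)\right\}, \qquad \rho_\tau(u) = u\,\{\tau - I(u<0)\}

Maximizing this likelihood in the location \mu is equivalent to minimizing the check loss \sum_i \rho_\tau(y_i - \mu_i), the objective function of classical quantile regression, so \mu plays the role of the \tau-th conditional quantile. If all you need is a univariate quantile regression, the bayes prefix is the quickest route; here is a minimal sketch using the auto dataset (our illustration, not from this article):

. sysuse auto
. bayes, rseed(19): qreg price weight, quantile(.25)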
Below, we walk through three examples:
Random-effects quantile regression
Random-effects simultaneous quantile regression
Modeling outcomes using asymmetric Laplace distribution
Consider the mathscores dataset, which contains math scores of pupils in the third year (math3) and fifth year (math5) from different schools (school) in Inner London (Mortimore et al. 1988).
. webuse mathscores

. describe

Contains data from https://www.stata-press.com/data/r18/mathscores.dta
 Observations:           887
    Variables:             3                  9 May 2022 23:31
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
school          float   %9.0g                 School ID
math3           float   %9.0g                 Year 3 math score
math5           float   %9.0g                 Year 5 math score
-------------------------------------------------------------------------------
We are interested in fitting several Bayesian quantile regressions of math5 on math3 for different quantiles of math5. We also want to include a random intercept at the school level in the model to account for the between-school variability of the math scores.
Let's first fit a Bayesian random-intercept model for the median. We specify the asymlaplaceq({scale},0.5) likelihood option with a random scale parameter {scale} and a quantile value of 0.5. We use fairly uninformative priors for the coefficients (normal(0,10000)), scale (igamma(0.01,0.01)), and random-intercept variance (igamma(0.01,0.01)). We use blocking to improve the Markov chain Monte Carlo (MCMC) sampling and specify a random seed for reproducibility.
. bayesmh math5 math3 U[school], likelihood(asymlaplaceq({scale},0.5))
>         prior({math5:}, normal(0,10000)) block({math5:})
>         prior({scale}, igamma(0.01,0.01)) block({scale})
>         prior({var_U}, igamma(0.01,0.01)) rseed(19)

Burn-in 2500 aaaaaaaaa1000aaaaaaaaa2000aaaaa done
Simulation 10000 .........1000.........2000.........3000.........4000.........5000
> .........6000.........7000.........8000.........9000.........10000 done

Model summary
------------------------------------------------------------------------------
Likelihood:
  math5 ~ asymlaplaceq(xb_math5,{scale},0.5)

Priors:
  {math5:math3 _cons} ~ normal(0,10000)                                    (1)
          {U[school]} ~ normal(0,{var_U})                                  (1)
              {scale} ~ igamma(0.01,0.01)

Hyperprior:
  {var_U} ~ igamma(0.01,0.01)
------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_math5.

Bayesian asymm. Laplace (quantile) regression    MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        887
                                                 Acceptance rate  =      .3555
                                                 Efficiency:  min =     .01064
                                                              avg =     .08481
Log marginal-likelihood                                       max =      .2189

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
math5        |
       math3 |  .5771888   .0287285   .001293   .5769971   .5198219   .6337824
       _cons |  31.14144   .3642311   .035305    31.1465    30.4084   31.86673
-------------+----------------------------------------------------------------
       scale |  1.983294   .0683036    .00146   1.981383   1.853755   2.120894
       var_U |  3.679771   1.057536   .043052   3.561201   2.072422   6.134616
------------------------------------------------------------------------------
The posterior mean estimate for the math3 coefficient is 0.58, and the variance of the random intercept is 3.68. If we were to fit a Bayesian two-level linear regression that models the conditional mean using the same prior distributions, we would find that its estimates are similar to the estimates obtained from the median regression. A question of interest might be whether the regression coefficients change across quantiles. We can use the estimates from the median regression as a baseline for comparing models for other quantiles.
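If you want to run that comparison yourself, one possible specification is to refit the same two-level model with a normal likelihood; this is a sketch under our naming, with {var} denoting the error variance:

. bayesmh math5 math3 U[school], likelihood(normal({var}))
>         prior({math5:}, normal(0,10000)) block({math5:})
>         prior({var}, igamma(0.01,0.01)) block({var})
>         prior({var_U}, igamma(0.01,0.01)) rseed(19)

The posterior means for {math5:math3} and {var_U} from this mean regression can then be compared directly with the median-regression estimates above.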
Let's use bayesmh to fit two Bayesian quantile regressions simultaneously by using a two-equation specification, here for the 0.25 and 0.75 quantiles. Because both equations model the same outcome, we label them math5_q25 and math5_q75 to keep the equation names distinct. As in the previous model, we use uninformative priors for the coefficients and variance parameters. We also use a common scale parameter {scale} because the outcome variable is the same in the two equations. The burn-in length is increased to 5,000 to improve sampling adaptation.
. bayesmh (math5_q25:math5 math3 U1[school], likelihood(asymlaplaceq({scale},0.25)))
>         (math5_q75:math5 math3 U2[school], likelihood(asymlaplaceq({scale},0.75))),
>         prior({math5_q25:}, normal(0,10000)) block({math5_q25:})
>         prior({math5_q75:}, normal(0,10000)) block({math5_q75:})
>         prior({scale}, igamma(0.01,0.01))
>         prior({var_U1}, igamma(0.01,0.01)) block({var_U1})
>         prior({var_U2}, igamma(0.01,0.01)) block({var_U2})
>         burnin(5000) rseed(19)

Burn-in 5000 aaaaaaaaa1000aaaaaaaaa2000aaaaaaaaa3000aaaaaaaaa4000aaaaaaaaa5000 done
Simulation 10000 .........1000.........2000.........3000.........4000.........5000
> .........6000.........7000.........8000.........9000.........10000 done

Model summary
------------------------------------------------------------------------------
Likelihood:
  math5 ~ asymlaplaceq(xb_math5_q25,{scale},0.25)
  math5 ~ asymlaplaceq(xb_math5_q75,{scale},0.75)

Priors:
  {math5_q25:math3 U1 _cons} ~ normal(0,10000)
  {math5_q75:math3 U2 _cons} ~ normal(0,10000)
                {U1[school]} ~ normal(0,{var_U1})
                {U2[school]} ~ normal(0,{var_U2})
                     {scale} ~ igamma(0.01,0.01)

Hyperpriors:
  {var_U1 var_U2} ~ igamma(0.01,0.01)
------------------------------------------------------------------------------

Bayesian asymm. Laplace (quantile) regression    MCMC iterations  =     15,000
Random-walk Metropolis–Hastings sampling         Burn-in          =      5,000
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        887
                                                 Acceptance rate  =       .339
                                                 Efficiency:  min =     .01429
                                                              avg =     .06357
Log marginal-likelihood                                       max =      .2034
------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
math5_q25    |
       math3 |   .781409   .0331454   .001575   .7818769   .7174074    .844201
          U1 |         1          0         0          1          1          1
       _cons |  27.11279   .4114259    .03442   27.09728   26.33384   27.93378
-------------+----------------------------------------------------------------
math5_q75    |
       math3 |  .4021812   .0285305     .0013   .4029019    .344931   .4566547
          U2 |         1          0         0          1          1          1
       _cons |  34.21954   .2754823   .018921   34.22476   33.66447   34.76017
-------------+----------------------------------------------------------------
       scale |  1.557948   .0373122   .000827   1.557052   1.486446   1.632176
      var_U1 |  7.793937   2.041174   .076138   7.560785   4.519766   12.38803
      var_U2 |  2.256973   .7545756   .036895   2.153597   1.128305   4.057043
------------------------------------------------------------------------------
The posterior mean estimates of the math3 coefficient for the 0.25 and 0.75 quantiles are 0.78 and 0.40, respectively, which are above and below the earlier median estimate of 0.58. It appears that the regression slope decreases as the quantile value increases. In particular, we can conclude that math3 is a stronger predictor of the 25th percentile of math5 scores than of the 75th percentile of math5 scores.
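To attach an uncertainty statement to this comparison (our addition, not part of the original example), we can summarize the posterior distribution of the difference between the two slopes, because bayesstats summary accepts expressions of model parameters:

. bayesstats summary (diff:{math5_q25:math3} - {math5_q75:math3})

A 95% credible interval for diff that lies entirely above zero would reinforce the conclusion that the slope decreases as the quantile increases.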
Finally, we use ALD to examine the skewness (asymmetry) of the math5 variable. For this purpose, we use asymlaplaceq() with a random quantile parameter {tau}. Negative skewness is indicated by {tau} greater than 0.5, positive skewness by {tau} less than 0.5, and {tau} equal to 0.5 indicates no skewness, that is, perfect symmetry. We apply a flat uniform(0,1) prior for {tau}. We specify a regression model for math5 with only a constant term, which corresponds to the location parameter of ALD, and a scale parameter {scale} as in the previous examples.
. bayesmh math5, likelihood(asymlaplaceq({scale},{tau}))
>         prior({math5:}, normal(0,10000))
>         prior({scale}, igamma(0.01,0.01))
>         prior({tau}, uniform(0,1))
>         block({math5:} {scale} {tau}, split)
>         initial({tau} 0.5) rseed(19)

Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood:
  math5 ~ asymlaplaceq({math5:_cons},{scale},{tau})

Priors:
  {math5:_cons} ~ normal(0,10000)
        {scale} ~ igamma(0.01,0.01)
          {tau} ~ uniform(0,1)
------------------------------------------------------------------------------

Bayesian asymm. Laplace (quantile) regression    MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        887
                                                 Acceptance rate  =       .424
                                                 Efficiency:  min =     .03815
                                                              avg =     .05242
Log marginal-likelihood = -2846.2182                          max =     .07833
------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
math5        |
       _cons |  37.00486    .045309   .001619   37.00222    36.9161   37.10699
-------------+----------------------------------------------------------------
       scale |  1.073886   .0655504   .003246   1.071063   .9511403   1.211991
         tau |  .8602184   .0085491   .000438   .8601294    .843211   .8767714
------------------------------------------------------------------------------
The estimated 95% credible interval of [0.84, 0.88] lies entirely above 0.5, indicating strong negative skewness. A more formal test for negative skewness can be conducted by using the bayestest interval command.
. bayestest interval {tau}, lower(0.5)

Interval tests     MCMC sample size =    10,000

       prob1 : {tau} > 0.5

-----------------------------------------------
             |      Mean    Std. dev.     MCSE
-------------+---------------------------------
       prob1 |         1     0.00000        0
-----------------------------------------------
Indeed, the posterior probability that {tau} is greater than 0.5 is estimated to be 1.
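As an optional follow-up (not shown in the original example), the standard Bayesian postestimation tools apply here as well; for instance, we can inspect the MCMC behavior and the posterior shape of {tau} graphically:

. bayesgraph diagnostics {tau}

This displays the trace, autocorrelation, histogram, and density plots for {tau} in a single panel.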
Mortimore, P., P. Sammons, L. Stoll, D. Lewis, and R. Ecob. 1988. School Matters: The Junior Years. Wells, Somerset, UK: Open Books.
Yu, K., and R. A. Moyeed. 2001. Bayesian quantile regression. Statistics & Probability Letters 54: 437–447.
Learn more about Stata's Bayesian analysis features.
Read more about Bayesian analysis in the Stata Bayesian Analysis Reference Manual; see [BAYES] bayesmh.
Also see Bayesian quantile regression.
View all the new features in Stata 18.