Bayesian quantile regression

Highlights

Bayesian estimates of quantile regression coefficients
Flexible prior specifications
Comprehensive posterior inference
Model-based “standard errors”
Full support of Bayesian postestimation features

The new bayes: qreg command fits Bayesian quantile regression. The Bayesian framework provides full posterior distributions for quantile regression coefficients that offer comprehensive inference, including model-based “standard errors”. All standard Bayesian features, such as hypothesis testing and prediction, are supported. This command is part of StataNow™.

Quantile regression models the conditional quantiles of an outcome as a linear combination of predictors. Traditional quantile regression estimates the coefficients by minimizing an asymmetrically weighted absolute-error loss (the check loss), typically by linear programming. To introduce Bayesian quantile regression, Yu and Moyeed (2001) use an equivalent formulation that assumes an asymmetric Laplace distribution for the likelihood function. Bayesian quantile regression combines this likelihood with priors for the model parameters to form a posterior model and uses Markov chain Monte Carlo (MCMC) for estimation. This provides full posterior distributions of the model parameters for comprehensive inference, including model-based “standard errors”.
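The equivalence between the loss-based and likelihood-based formulations can be written out explicitly. The symbols below (rho_tau for the check loss, tau for the quantile level, beta for the coefficients, sigma for the scale) are notation assumed here for illustration. The density is the form commonly attributed to Yu and Moyeed (2001); the exact parameterization used internally by bayes: qreg may differ.

```latex
\begin{align*}
% Check loss for quantile level \tau in (0,1)
\rho_\tau(u) &= u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr) \\[4pt]
% Classical quantile regression estimator
\hat\beta(\tau) &= \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\!\left(y_i - x_i'\beta\right) \\[4pt]
% Asymmetric Laplace density with location \mu and scale \sigma > 0
f(y \mid \mu, \sigma, \tau) &= \frac{\tau(1-\tau)}{\sigma}
    \exp\!\left\{-\rho_\tau\!\left(\frac{y-\mu}{\sigma}\right)\right\} \\[4pt]
% Log likelihood with \mu_i = x_i'\beta, using \rho_\tau(u/\sigma) = \rho_\tau(u)/\sigma
\log L(\beta, \sigma) &= n \log\frac{\tau(1-\tau)}{\sigma}
    - \frac{1}{\sigma} \sum_{i=1}^{n} \rho_\tau\!\left(y_i - x_i'\beta\right)
\end{align*}
```

For a fixed sigma, maximizing this log likelihood over beta is the same as minimizing the summed check loss, which is why both formulations target the same conditional quantile.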

In classical quantile regression, standard errors are computed by using bootstrap or kernel-based methods. In the Bayesian framework, posterior standard deviations play the role of standard errors. Because the Bayesian approach assumes a parametric likelihood model, the posterior standard deviations are estimated from that model and may be more efficient.
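To see how posterior standard deviations compare with classical standard errors, one could fit the same median regression three ways. This is a minimal sketch, assuming the engel1857 dataset used in the example that follows. qreg and bsqreg are Stata's classical quantile regression commands; the seed and the number of bootstrap replications (200) are arbitrary choices made for illustration.

```stata
* Load the example data used later on this page
webuse engel1857, clear

* Classical median regression with qreg's default standard errors
qreg foodexp income

* Classical median regression with bootstrap standard errors
* (seed and reps(200) are illustrative assumptions)
set seed 19
bsqreg foodexp income, reps(200)

* Bayesian median regression: compare the "Std. dev." column
* with the standard errors reported by the two commands above
bayes, rseed(19): qreg foodexp income
```

The three sets of point estimates should be broadly similar; the comparison of interest is between the reported standard errors and the posterior standard deviations.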

Here we demonstrate a univariate Bayesian quantile regression. For other Bayesian quantile models, including random effects and multiple quantiles, see Bayesian asymmetric Laplace model.

Let's see it work

Let's explore the relationship between household income and food expenditure using the data from Engel (1857), which are described in Koenker and Bassett (1982). Let's use a quantile regression to compare this relationship across different quantiles. We first fit a model to the 50th percentile of the outcome variable, a model known as median regression, using default settings. We specify the rseed(19) option for reproducibility.

. webuse engel1857
(European household budget survey)

. bayes, rseed(19): qreg foodexp income

Burn-in ...
Simulation ...

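Model summary
------------------------------------------------------------------------------
Likelihood:
  foodexp ~ asymlaplaceq(xb_foodexp_q50,{sigma},.5)

Priors:
  {foodexp_q50:income _cons} ~ normal(0,10000)                             (1)
                     {sigma} ~ igamma(0.01,0.01)
------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_foodexp_q50.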

Bayesian quantile regression                     MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
Quantile = .5                                    Number of obs    =        235
                                                 Acceptance rate  =      .3603
                                                 Efficiency:  min =     .09896
                                                              avg =       .151
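Log marginal-likelihood =  186.43947                          max =      .2268

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
foodexp_q50  |
      income |  .5567276   .0159401   .000507   .5562547   .5248025    .587735
       _cons |   .084986   .0143782   .000403   .0851108   .0575581   .1134264
-------------+----------------------------------------------------------------
       sigma |  .0377533   .0024907   .000052   .0376511   .0331066   .0430957
------------------------------------------------------------------------------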

The posterior mean estimate of the income coefficient is 0.56, with a 95% credible interval (CrI) of [0.52, 0.59]. We now turn to the 25th percentile (the 0.25 quantile) of the outcome variable by specifying the quantile() option with qreg.

. bayes, rseed(19): qreg foodexp income, quantile(0.25)

Burn-in ...
Simulation ...

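Model summary
------------------------------------------------------------------------------
Likelihood:
  foodexp ~ asymlaplaceq(xb_foodexp_q25,{sigma},.25)

Priors:
  {foodexp_q25:income _cons} ~ normal(0,10000)                             (1)
                     {sigma} ~ igamma(0.01,0.01)
------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_foodexp_q25.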

Bayesian quantile regression                     MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
Quantile = .25                                   Number of obs    =        235
                                                 Acceptance rate  =      .3423
                                                 Efficiency:  min =      .1436
                                                              avg =      .1765
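Log marginal-likelihood =  169.18624                          max =      .2421

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+----------------------------------------------------------------
foodexp_q25  |
      income |  .4718604   .0140225    .00037   .4735463   .4414884   .4948657
       _cons |  .0962851   .0116976   .000308   .0957929   .0742573   .1196877
-------------+----------------------------------------------------------------
       sigma |  .0304463   .0020364   .000041   .0303373   .0266857   .0347907
------------------------------------------------------------------------------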

The posterior mean estimate of the income coefficient is 0.47, with a 95% CrI of [0.44, 0.49]. The CrIs from the two quantile regressions do not overlap, which suggests that the relationship between income and food expenditure differs between the 0.25 and 0.50 quantiles. We can explore this relationship further by specifying other quantiles in the quantile() option with bayes: qreg and plotting the estimated income coefficient against the quantile.

The graph demonstrates heterogeneity of the income coefficients across the distribution (quantiles) of food expenditure. The coefficient increases as the quantile value increases. (This does not mean that the proportion of income spent on food increases with income. If we were to obtain predictions and produce the Engel curve, we would see that food expenditure share decreases with income, as expected.)
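The fits behind such a graph could be scripted with a loop over quantiles. This is only a sketch: the grid of quantiles below is an illustrative assumption, not necessarily the one used for the graph, and the plotting step is left out.

```stata
* Load the example data
webuse engel1857, clear

* Fit a Bayesian quantile regression at each quantile in an
* illustrative grid (the grid values are assumptions)
foreach q of numlist 0.1 0.25 0.5 0.75 0.9 {
    display as text _newline "Quantile = `q'"
    bayes, rseed(19): qreg foodexp income, quantile(`q')
}
```

Each pass prints a summary table like the ones shown earlier; the income row gives the posterior mean and 95% CrI to plot against the quantile.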

All existing Bayesian postestimation commands are available after bayes: qreg. For example, we can compute the posterior probability that the income coefficient in the model for the 25th percentile of foodexp lies within the interval [0.525, 0.588], which is the 95% CrI obtained from the median regression model. To do this, we use the bayestest interval command.

. bayestest interval {foodexp_q25:income}, lower(.525) upper(.588)

Interval tests     MCMC sample size =    10,000

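       prob1 : .525 < {foodexp_q25:income} < .588

-----------------------------------------------
             |      Mean    Std. dev.      MCSE
-------------+---------------------------------
       prob1 |         0     0.00000          0
-----------------------------------------------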

The estimated posterior probability is 0, which suggests that the effect of income on food expenditure differs between the 25th and 50th percentiles.
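Other postestimation commands follow the same pattern. As a hedged sketch, the lines below check sampling efficiency and convergence for the 25th-percentile model and then refit it under a tighter prior to gauge prior sensitivity. The normal(0,100) prior is a hypothetical choice for illustration, not a recommendation from this page. The first two commands assume that the bayes: qreg fit for the 25th percentile is still the active estimation result.

```stata
* Effective sample sizes and efficiencies for all model parameters
bayesstats ess

* Trace, autocorrelation, histogram, and density plots for the income coefficient
bayesgraph diagnostics {foodexp_q25:income}

* Prior sensitivity: refit with a tighter normal prior on the coefficients
* (normal(0,100) is an illustrative assumption)
bayes, prior({foodexp_q25:income _cons}, normal(0,100)) rseed(19): qreg foodexp income, quantile(0.25)
```

If the posterior summaries change little under the alternative prior, the conclusions above are not being driven by the default prior.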

References

Engel, E. 1857. Die Productions- und Consumtionsverhältnisse des Königreichs Sachsen. Zeitschrift des Statistischen Bureaus des Königlich Sächsischen Ministeriums des Innern 8: 1–54.

Koenker, R., and G. Bassett, Jr. 1982. Robust tests for heteroscedasticity based on regression quantiles. Econometrica 50: 43–61. https://doi.org/10.2307/1912528.

Yu, K., and R. A. Moyeed. 2001. Bayesian quantile regression. Statistics & Probability Letters 54: 437–447.

Tell me more

Learn more about Stata's Bayesian analysis features.

Read more about Bayesian analysis in the Stata Bayesian Analysis Reference Manual; see [BAYES] bayes: qreg.

Also see Bayesian asymmetric Laplace model.

View all the new features in Stata 18.
