Title:  Why is there no intercept in lasso inferential commands? Is it possible to get an intercept?
Author: Miguel Dorta, StataCorp
The lasso inferential commands implement three lasso-based methods for estimating the coefficients and standard errors of specified variables of interest and for selecting from potential control covariates to be included in the model. The methods are double selection, partialing-out, and cross-fit partialing-out. For each of them, there are commands for linear, logistic, and Poisson regression models. Also, for both partialing-out and cross-fit partialing-out, there is a command for instrumental-variable linear regression.
All the implemented methods perform a lasso stage, where multiple lasso models are fit to select controls, and a final estimation stage, where the coefficients and standard errors for the variables of interest are computed. The intercept is regarded as one of the controls (treated as always included); therefore, if an intercept is also added as a variable of interest, it will be perfectly collinear with the intercept in the controls. We do not report the selected controls because their standard errors would not be valid. Therefore, we do not report the intercept.
For the double-selection lasso regression commands (dsregress, dslogit, and dspoisson), a point estimate of the intercept can be computed in the final estimation stage.
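To see why the intercept is recoverable in the final estimation stage, note that this stage is just an ordinary regression of the outcome on the variables of interest plus the lasso-selected controls, so refitting that regression with a constant term yields an intercept estimate. The sketch below illustrates this idea outside Stata, using numpy and simulated data; the variable names and the true coefficient values are illustrative assumptions, not part of the FAQ's examples.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
d = rng.normal(size=(n, 2))        # variables of interest (illustrative)
x_sel = rng.normal(size=(n, 3))    # controls the lasso stage selected (illustrative)
# Simulated outcome with a true intercept of 2.0 (an assumption for this sketch).
y = (2.0 + d @ np.array([1.5, -0.7])
     + x_sel @ np.array([0.3, 0.0, -0.2])
     + rng.normal(size=n))

# Final estimation stage: OLS of y on [variables of interest, selected controls,
# constant]. The last coefficient is the point estimate of the intercept.
X = np.column_stack([d, x_sel, np.ones(n)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept = beta[-1]
```

With enough observations, `intercept` lands near the true value used in the simulation; this mirrors what refitting the final-stage regression with `regress`, `logit`, or `poisson` does in the examples below.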
In the examples below, we show you how to compute a point estimate of the intercept. We begin by loading one of the datasets used in the documentation.
. use https://www.stata-press.com/data/r17/breathe, clear
(Nitrogen dioxide and attention)
Next we create a global macro for a list of potential control covariates using factor-variable syntax, which implies 41 potential control covariates.
. global controlvars i.(sex grade overweight feducation msmoke)##c.(sev_home age)
In the examples below, we will be using the same dataset and the global macro controlvars.
Example 1: dsregress
We fit a double-selection linear model for the react variable, specifying no2_class and no2_home as variables of interest and the controls from the global macro controlvars.
. dsregress react no2_class no2_home, controls($controlvars)

Estimating lasso for react using plugin
Estimating lasso for no2_class using plugin
Estimating lasso for no2_home using plugin

Double-selection linear model         Number of obs               =  1,053
                                      Number of controls          =     41
                                      Number of selected controls =      7
                                      Wald chi2(2)                =  20.99
                                      Prob > chi2                 = 0.0000

------------------------------------------------------------------------------
             |               Robust
       react | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
   no2_class |    1.94622   .4248716     4.58   0.000     1.113487    2.778953
    no2_home |  -.3717156   .2445907    -1.52   0.129    -.8511047    .1076734
------------------------------------------------------------------------------
The three "Estimating lasso" messages indicate that dsregress performed the lasso stage. The corresponding list of selected controls is stored in the macro e(controls_sel). This allows us to use regress to reproduce the point estimates of the final estimation stage, where the intercept takes on the value 904.10475.
. quietly regress react no2_class no2_home `e(controls_sel)' if e(sample), vce(robust)

. estimates store rep_dsreg

. etable, estimates(dsregress rep_dsreg) column(estimates) keep(no2_class no2_home _cons) cstat(_r_b, nformat(%9.5f)) novarlab
-------------------------------------------------
                        dsregress       rep_dsreg
-------------------------------------------------
no2_class                 1.94622         1.94622
no2_home                 -0.37172        -0.37172
_cons                                   904.10475
Number of observations       1053            1053
-------------------------------------------------
Example 2: dslogit
Here we fit a double-selection logit model for the lbweight variable, specifying indicators for meducation as variables of interest, and we use the same potential controls from Example 1.
. dslogit lbweight i.meducation, controls($controlvars)

Estimating lasso for lbweight using plugin
Estimating lasso for 2bn.meducation using plugin
Estimating lasso for 3bn.meducation using plugin
Estimating lasso for 4bn.meducation using plugin

Double-selection logit model          Number of obs               =  1,058
                                      Number of controls          =     41
                                      Number of selected controls =      6
                                      Wald chi2(3)                =   1.70
                                      Prob > chi2                 = 0.6361

------------------------------------------------------------------------------
             |               Robust
    lbweight | Odds ratio   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
  meducation |
    Primary  |   .3385649   .4093585    -0.90   0.370     .0316559    3.621004
  Secondary  |   .2286818   .2718619    -1.24   0.215     .0222487     2.35049
 University  |   .2514901   .3000166    -1.16   0.247     .0242703    2.605953
------------------------------------------------------------------------------
Similarly, dslogit performed a corresponding lasso stage and stored the selected controls in the macro e(controls_sel). So we can use logit to reproduce the point estimates of the final estimation stage, where the value for the implicit intercept is 0.09494.
. quietly logit lbweight i.meducation `e(controls_sel)' if e(sample), or vce(robust)

. estimates store rep_dslog

. etable, estimates(dslogit rep_dslog) column(estimates) keep(meducation _cons) cstat(_r_b, nformat(%9.5f)) novarlab
-------------------------------------------------
                          dslogit       rep_dslog
-------------------------------------------------
meducation
  Primary                 0.33856         0.33856
  Secondary               0.22868         0.22868
  University              0.25149         0.25149
_cons                                     0.09494
Number of observations       1058            1058
-------------------------------------------------
Example 3: dspoisson
Now we fit a double-selection Poisson model for the correct variable, specifying no2_class and no2_home as variables of interest, and we specify the same potential controls as in the previous examples.
. dspoisson correct no2_class no2_home, controls($controlvars)

Estimating lasso for correct using plugin
Estimating lasso for no2_class using plugin
Estimating lasso for no2_home using plugin

Double-selection Poisson model        Number of obs               =  1,053
                                      Number of controls          =     41
                                      Number of selected controls =      3
                                      Wald chi2(2)                =   9.36
                                      Prob > chi2                 = 0.0093

------------------------------------------------------------------------------
             |               Robust
     correct |        IRR   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
   no2_class |   .9993293   .0002192    -3.06   0.002     .9988997    .9997591
    no2_home |   1.000062   .0000966     0.64   0.521     .9998728    1.000251
------------------------------------------------------------------------------
Analogously, dspoisson performed a corresponding lasso stage and stored the selected controls in the macro e(controls_sel). We can then use poisson to reproduce the point estimates of the final estimation stage, where the value of the implicit intercept is 111.52364.
. quietly poisson correct no2_class no2_home `e(controls_sel)' if e(sample), irr vce(robust)

. estimates store rep_dspoi

. etable, estimates(dspoisson rep_dspoi) column(estimates) keep(no2_class no2_home _cons) cstat(_r_b, nformat(%9.5f)) novarlabel
-------------------------------------------------
                        dspoisson       rep_dspoi
-------------------------------------------------
no2_class                 0.99933         0.99933
no2_home                  1.00006         1.00006
_cons                                   111.52364
Number of observations       1053            1053
-------------------------------------------------
The partialing-out commands are poregress, pologit, popoisson, and poivregress. The cross-fit partialing-out commands are xporegress, xpologit, xpopoisson, and xpoivregress. For all of these commands, the final estimation stage is performed on partial-covariate variables (zero-mean residuals). Therefore, if an intercept were included in the final estimation stage, its value would be zero, as the next example shows. Consequently, an intercept in terms of the original covariates cannot be computed from those commands.
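The zero-intercept claim follows from the Frisch-Waugh-Lovell logic: residuals from a regression that includes a constant have mean zero, and a regression of one mean-zero variable on another passes through the origin. The numpy sketch below demonstrates this with simulated data; the data-generating process and coefficient values are assumptions made for illustration, not the breathe dataset.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=(n, 3))                                # selected controls (illustrative)
d = x @ np.array([0.5, -0.2, 0.1]) + rng.normal(size=n)    # variable of interest
y = 3.0 + 1.5 * d + x @ np.array([0.4, 0.0, -0.3]) + rng.normal(size=n)

def residualize(v, X):
    """Residuals of v after OLS on X plus a constant; mean zero by construction."""
    Z = np.column_stack([X, np.ones(len(v))])
    coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
    return v - Z @ coef

# Partialing-out stage: residualize the outcome and the variable of interest.
res_y = residualize(y, x)
res_d = residualize(d, x)

# Final stage: regress residualized y on residualized d plus a constant.
# The slope recovers the coefficient on d; the constant is numerically zero.
F = np.column_stack([res_d, np.ones(n)])
coef, *_ = np.linalg.lstsq(F, res_y, rcond=None)
slope, intercept = coef
```

Because both residual series have mean zero, the fitted constant equals mean(res_y) - slope * mean(res_d) = 0 up to floating-point error, which is exactly the behavior the poregress example below exhibits.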
Example 4: poregress
The code below reproduces the point estimates of poregress, demonstrating that, if an intercept were included, its value would be virtually zero.
poregress react no2_class no2_home, controls($controlvars)
estimates store poregress
mark touse if e(sample)
local sel1 `e(lasso_selected_1)'
local sel2 `e(lasso_selected_2)'
local sel3 `e(lasso_selected_3)'
quietly {
    regress react `sel1' if touse
    predict double res_react, residual
    regress no2_class `sel2' if touse
    predict double res_no2_class, residual
    regress no2_home `sel3' if touse
    predict double res_no2_home, residual
    regress res_react res_no2_class res_no2_home if touse, vce(robust)
    estimates store rep_poreg
}
etable, estimates(poregress rep_poreg) column(estimates) cstat(_r_b, nformat(%9.5f)) novarlab
And this is the output of the last command:
. etable, estimates(poregress rep_poreg) column(estimates) cstat(_r_b, nformat(%9.5f)) novarlab
-------------------------------------------------
                        poregress       rep_poreg
-------------------------------------------------
no2_class                 1.91259
no2_home                 -0.35376
res_no2_class                             1.91259
res_no2_home                             -0.35376
_cons                                     0.00000
Number of observations       1053            1053
-------------------------------------------------
For the other po and xpo commands, manually reproducing the final estimation stage is not as straightforward as it is for poregress. That said, partial-covariate variables are used in a similar way, so there can be no intercept.