Title: Compute least-squares means using margins, asbalanced
Authors: Alex Asher, StataCorp; Miguel Dorta, StataCorp; Mia Lv, StataCorp
This FAQ demonstrates how to calculate least-squares means in Stata. A least-squares mean is a population mean that is calculated using parameter estimates from a statistical model. But where does the name “least-squares mean” come from? The first use of the expression “least-squares mean” involved linear models, where parameter estimates were obtained using the method of least squares. In that context, the term “least-squares” served to emphasize that the mean was calculated using parameter estimates. This is in contrast to the arithmetic mean, which is calculated using the observed data.
The concept of a least-squares mean makes no requirement that the least-squares method be used to calculate parameter estimates, so Searle, Speed, and Milliken (1980) introduced the more descriptive name "estimated marginal mean" (EMM). We treat the two terms as synonyms: both describe a marginal mean that is calculated under the assumption that the levels of each factor variable, including the levels of any interaction terms, are balanced (that is, every level has the same proportion).
You can obtain these estimates using the margins command with the asbalanced option after fitting a regression model. This FAQ demonstrates how to calculate and interpret EMMs in several scenarios.
This FAQ is organized as follows:
We use the classic auto dataset, and we regress the continuous outcome variable mpg on two categorical predictor variables: foreign and rep78. Factor-variable notation i.foreign and i.rep78 tells Stata that foreign and rep78 are categorical (factor) variables.
. sysuse auto, clear
(1978 automobile data)

. regress mpg i.foreign i.rep78

      Source |       SS           df       MS      Number of obs   =        69
-------------+----------------------------------   F(5, 63)        =      4.96
       Model |  661.189524         5  132.237905   Prob > F        =    0.0007
    Residual |  1679.01337        63  26.6510059   R-squared       =    0.2825
-------------+----------------------------------   Adj R-squared   =    0.2256
       Total |   2340.2029        68  34.4147485   Root MSE        =    5.1625

------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
    Foreign  |   3.556584   1.736681     2.05   0.045     .0861046    7.027064
             |
       rep78 |
          2  |     -1.875   4.081284    -0.46   0.648     -10.0308    6.280795
          3  |  -1.922325   3.774126    -0.51   0.612    -9.464315    5.619665
          4  |  -1.111626   3.944633    -0.28   0.779    -8.994346    6.771095
          5  |   3.453704   4.215132     0.82   0.416    -4.969565    11.87697
             |
       _cons |         21   3.650411     5.75   0.000     13.70524    28.29476
------------------------------------------------------------------------------
After fitting the linear regression model, we want to calculate the grand (overall) EMM for the model, so we execute the margins command without a marginslist. We specify option asbalanced to calculate the mean mpg that we would expect from a population with equal numbers of foreign and domestic automobiles and equal numbers of cars at each level of rep78.
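. margins, asbalanced

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: foreign (asbalanced)
    rep78   (asbalanced)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |   22.48724   .9995692    22.50   0.000     20.48976    24.48472
------------------------------------------------------------------------------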
The EMM of 22.487 indicates that if both foreign and rep78 were balanced, we would expect the average car to get nearly 22.5 miles per gallon.
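As a check on the arithmetic, the grand EMM can be rebuilt by hand from the regression coefficients reported earlier. The sketch below assumes that each base level (Domestic and rep78 = 1) contributes a coefficient of zero; the symbols \(b_{\cdot}\) are shorthand introduced here for the reported coefficients and do not appear in the Stata output.

```latex
\[
\begin{aligned}
\text{EMM}
  &= b_{\text{cons}}
     + \tfrac{1}{2}\,\bigl(0 + b_{\text{Foreign}}\bigr)
     + \tfrac{1}{5}\,\bigl(0 + b_{2} + b_{3} + b_{4} + b_{5}\bigr) \\
  &= 21 + \tfrac{1}{2}(3.556584)
        + \tfrac{1}{5}(-1.875 - 1.922325 - 1.111626 + 3.453704) \\
  &= 21 + 1.778292 - 0.291049 \\
  &= 22.487243
\end{aligned}
\]
```

Rounded to seven significant digits, this is the margin of 22.48724 reported by margins.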
But what if we wanted to calculate the EMMs for both foreign and domestic vehicles, while treating rep78 as balanced? We would put foreign in our marginslist like so:
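. margins foreign, asbalanced

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78 (asbalanced)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
   Domestic  |   20.70895   1.049089    19.74   0.000     18.61251    22.80539
    Foreign  |   24.26553   1.551038    15.64   0.000     21.16603    27.36504
------------------------------------------------------------------------------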
Here foreign is the marginslist, so margins reports one EMM for each level of foreign while treating all other categorical variables (in this case, only rep78) as balanced.
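The same pattern should apply with the roles of the two factors swapped. The following is a minimal sketch, not output from the original FAQ: it assumes the auto data and the two-factor model used so far, and it requests an EMM for each level of rep78 with foreign treated as balanced.

```stata
* Sketch only: refit the two-factor model used in this section
sysuse auto, clear
regress mpg i.foreign i.rep78

* EMM at each level of rep78, with foreign treated as balanced
margins rep78, asbalanced

* Expected to give the same margins: weight both levels of foreign by 1/2
margins rep78, at(0.foreign=0.5 1.foreign=0.5)
```

If the two commands agree, that confirms that asbalanced is simply weighting the two levels of foreign equally, regardless of how many foreign and domestic cars are in the sample.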
To better understand what margins is doing, we can reproduce the above margins results by manually specifying the values of foreign and rep78 using the at() option. Variable foreign takes two values and rep78 takes five, so to treat them as balanced, we calculate our margin with both levels of foreign at \(\frac{1}{2}\) and each level of rep78 at \(\frac{1}{5}\).
. * reproduce the first example: -margins, asbalanced-
. margins, at(0.foreign=0.5 1.foreign=0.5 1.rep78=0.2 2.rep78=0.2 3.rep78=0.2 4.rep78=0.2 5.rep78=0.2)

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: 0.foreign = .5
    1.foreign = .5
    1.rep78   = .2
    2.rep78   = .2
    3.rep78   = .2
    4.rep78   = .2
    5.rep78   = .2

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |   22.48724   .9995691    22.50   0.000     20.48976    24.48472
------------------------------------------------------------------------------
To reproduce the second example, we repeat the at() option, setting foreign to 0 in the first instance and setting it to 1 in the second instance. In both cases, we set each level of rep78 to \(\frac{1}{5}\).
. * reproduce the second example: -margins foreign, asbalanced-
. margins, at(0.foreign=1 1.rep78=0.2 2.rep78=0.2 3.rep78=0.2 4.rep78=0.2 5.rep78=0.2) at(1.foreign=1 1.rep78=0.2 2.rep78=0.2 3.rep78=0.2 4.rep78=0.2 5.rep78=0.2)

Predictive margins                                          Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
1._at: foreign =  0
       1.rep78 = .2
       2.rep78 = .2
       3.rep78 = .2
       4.rep78 = .2
       5.rep78 = .2
2._at: foreign =  1
       1.rep78 = .2
       2.rep78 = .2
       3.rep78 = .2
       4.rep78 = .2
       5.rep78 = .2

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   20.70895   1.049089    19.74   0.000     18.61251    22.80539
          2  |   24.26553   1.551038    15.64   0.000     21.16603    27.36504
------------------------------------------------------------------------------
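Another way to see what these margins are made of is to write each one as a linear combination of the regression coefficients. The sketch below is an illustration rather than part of the original FAQ; it assumes the same two-factor model and uses lincom, with the base levels (Domestic and rep78 = 1) contributing nothing because their coefficients are zero.

```stata
* Sketch only: refit the two-factor model from this section
sysuse auto, clear
regress mpg i.foreign i.rep78

* Domestic: foreign at its base level, each nonbase rep78 coefficient weighted 1/5
lincom _cons + 0.2*2.rep78 + 0.2*3.rep78 + 0.2*4.rep78 + 0.2*5.rep78

* Foreign: the same combination plus the full 1.foreign coefficient
lincom _cons + 1.foreign + 0.2*2.rep78 + 0.2*3.rep78 + 0.2*4.rep78 + 0.2*5.rep78
```

The two estimates should match the margins of 20.70895 and 24.26553 reported above, because they are the same weighted sums of coefficients.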
When calculating EMMs in the presence of continuous covariates, some authors recommend setting the continuous covariates to their mean values. This approach yields what Searle, Speed, and Milliken (1980) refer to as adjusted treatment means. Adjusted treatment means can be obtained using the asbalanced and atmeans options together or, equivalently, by specifying option at((asbalanced) _factor (mean) _continuous). Users coming from a SAS background will recognize the adjusted treatment mean as the default margin calculated by SAS’s LSMEANS statement.
However, if you prefer not to set continuous covariates to their means, you can omit the atmeans option. In this case, the margins command will use the observed values of the continuous covariates rather than assuming specific values. This is known as a predictive margin.
To demonstrate, we add continuous predictor variables price and weight to the model.
. regress mpg i.foreign i.rep78 price weight

      Source |       SS           df       MS      Number of obs   =        69
-------------+----------------------------------   F(7, 61)        =     19.90
       Model |  1627.48579         7   232.49797   Prob > F        =    0.0000
    Residual |  712.717109        61   11.683887   R-squared       =    0.6954
-------------+----------------------------------   Adj R-squared   =    0.6605
       Total |   2340.2029        68  34.4147485   Root MSE        =    3.4182

------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
    Foreign  |   -3.13669    1.52857    -2.05   0.044    -6.193255    -.080126
             |
       rep78 |
          2  |  -.3114325   2.710206    -0.11   0.909    -5.730825     5.10796
          3  |  -.0729958   2.513312    -0.03   0.977    -5.098675    4.952683
          4  |   .6498104    2.62137     0.25   0.805    -4.591943    5.891564
          5  |   3.799259   2.800019     1.36   0.180    -1.799725    9.398244
             |
       price |   .0000604   .0002015     0.30   0.765    -.0003424    .0004633
      weight |  -.0064961   .0009727    -6.68   0.000    -.0084411   -.0045511
       _cons |     40.862   3.447222    11.85   0.000     33.96885    47.75514
------------------------------------------------------------------------------
First, let's calculate the adjusted treatment means for foreign and domestic vehicles. In addition to option asbalanced, we add option atmeans, which instructs margins to set continuous variables price and weight to their means.
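. margins foreign, asbalanced atmeans

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78  (asbalanced)
    price  = 6146.043 (mean)
    weight = 3032.029 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
   Domestic  |   22.35008   .7562743    29.55   0.000     20.83781    23.86234
    Foreign  |   19.21339   1.250939    15.36   0.000     16.71198    21.71479
------------------------------------------------------------------------------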
In the output, price is set to its mean value of 6146.043, and weight is set to its mean value of 3032.029.
Adjusted treatment means can also be specified using the at() option. Here we use at() to treat factor variables as balanced and set continuous variables equal to their means.
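. margins foreign, at((asbalanced) _factor (mean) _continuous)

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78  (asbalanced)
    price  = 6146.043 (mean)
    weight = 3032.029 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
   Domestic  |   22.35008   .7562743    29.55   0.000     20.83781    23.86234
    Foreign  |   19.21339   1.250939    15.36   0.000     16.71198    21.71479
------------------------------------------------------------------------------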
We can even specify the means of continuous variables price and weight directly in the at() option:
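. margins foreign, asbalanced at(price=6146.043 weight=3032.029)

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78  (asbalanced)
    price  = 6146.043
    weight = 3032.029

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
   Domestic  |   22.35008   .7562742    29.55   0.000     20.83781    23.86234
    Foreign  |   19.21339   1.250939    15.36   0.000     16.71198    21.71479
------------------------------------------------------------------------------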
Each of these equivalent syntaxes for calculating the adjusted treatment mean yields the same inference.
Next, let's consider a scenario where we choose not to set continuous variables to their mean values. In this case, margins treats the continuous variables “as observed”, calculating a predicted outcome for each observation and averaging these predictions to yield an EMM known as the predictive marginal mean.
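. margins foreign, asbalanced

Predictive margins                                          Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78 (asbalanced)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
   Domestic  |   22.35008   .7562743    29.55   0.000     20.83781    23.86234
    Foreign  |   19.21339   1.250939    15.36   0.000     16.71198     21.7148
------------------------------------------------------------------------------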
In this case, the categorical variable rep78 is treated as balanced, but continuous variables price and weight are left as observed. Interestingly, the predictive marginal means for foreign and domestic cars are the same as the adjusted treatment means we calculated using the atmeans option. This is because we used a linear regression model that doesn’t contain any polynomial terms, so the average of the predictions is the same as the prediction at the average values of price and weight.
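The equality can be written out in one line. In the sketch below, \(c\) is shorthand introduced here for everything in the prediction that margins holds fixed (the constant, the chosen level of foreign, and the balanced rep78 contribution), and \(b_p\) and \(b_w\) stand for the coefficients on price and weight. Averaging a prediction that is linear in price and weight over the \(n\) observations gives the prediction at their means:

```latex
\[
\frac{1}{n}\sum_{i=1}^{n}
  \bigl(c + b_{p}\,\text{price}_i + b_{w}\,\text{weight}_i\bigr)
= c + b_{p}\,\overline{\text{price}} + b_{w}\,\overline{\text{weight}}
\]
```

The identity relies on the prediction being linear in each continuous covariate; it no longer holds once a covariate enters through a nonlinear term such as a square.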
For an example where atmeans does have an effect, please see section 3 below, where the model contains a quadratic effect. The atmeans option also affects margins after logistic regression and other generalized linear models.
The asbalanced option only affects factor variables, so the previous command
. margins foreign, asbalanced
did not specify how continuous variables price and weight were to be treated; by default, they are left as observed. We get the same predictive marginal means by explicitly setting continuous variables asobserved.
. margins foreign, asbalanced at((asobserved) _continuous)
(output omitted)
Now we consider a more complicated model with categorical predictors foreign and rep78, continuous predictor turn, and interactions between the predictors. We use the notation c.turn to tell Stata that turn is a continuous variable, and we use # to indicate interactions between variables. The interaction c.turn#c.turn is the quadratic effect of variable turn (that is, the effect of \(\text{turn}^2\)).
. regress mpg i.foreign i.rep78 c.turn i.foreign#c.turn i.rep78#c.turn c.turn#c.turn

      Source |       SS           df       MS      Number of obs   =        69
-------------+----------------------------------   F(12, 56)       =      9.18
       Model |  1551.32495        12  129.277079   Prob > F        =    0.0000
    Residual |  788.877953        56  14.0871063   R-squared       =    0.6629
-------------+----------------------------------   Adj R-squared   =    0.5907
       Total |   2340.2029        68  34.4147485   Root MSE        =    3.7533

-------------------------------------------------------------------------------
          mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
--------------+----------------------------------------------------------------
      foreign |
     Foreign  |   41.02793   30.56578     1.34   0.185    -20.20268    102.2585
              |
        rep78 |
           2  |  -76.43955   111.6836    -0.68   0.497    -300.1687    147.2896
           3  |  -90.47575   109.0871    -0.83   0.410    -309.0034    128.0519
           4  |  -79.38446   109.3388    -0.73   0.471    -298.4164    139.6475
           5  |  -34.72543   115.9266    -0.30   0.766    -266.9544    197.5035
              |
         turn |  -2.395572   3.315296    -0.72   0.473    -9.036909    4.245764
              |
     foreign# |
       c.turn |
     Foreign  |  -1.221608   .8543636    -1.43   0.158    -2.933104    .4898878
              |
 rep78#c.turn |
           2  |   1.885053   2.715978     0.69   0.491    -3.555705    7.325811
           3  |    2.17802   2.659639     0.82   0.416    -3.149878    7.505918
           4  |   1.911901   2.665971     0.72   0.476    -3.428681    7.252482
           5  |   .7628724   2.878065     0.27   0.792    -5.002584    6.528328
              |
c.turn#c.turn |  -.0073711   .0242301    -0.30   0.762    -.0559097    .0411676
              |
        _cons |   131.6166   116.2079     1.13   0.262    -101.1758     364.409
-------------------------------------------------------------------------------
To calculate predictive marginal means for foreign and domestic automobiles, we leave continuous variables as observed.
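. margins foreign, asbalanced

Predictive margins                                          Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78 (asbalanced)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
   Domestic  |   21.88644   1.432176    15.28   0.000     19.01745    24.75544
    Foreign  |   14.29791   3.494918     4.09   0.000     7.296743    21.29907
------------------------------------------------------------------------------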
For the adjusted treatment means, we add option atmeans.
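. margins foreign, asbalanced atmeans

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78 (asbalanced)
    turn  = 39.7971 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
   Domestic  |   22.02972   1.406687    15.66   0.000     19.21178    24.84765
    Foreign  |   14.44118   3.376853     4.28   0.000     7.676532    21.20583
------------------------------------------------------------------------------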
Now option atmeans has a visible impact: the adjusted treatment means are different from the predictive marginal means. But why?
As we saw in section 2, when calculating margins based on a linear regression model, leaving a predictor variable “as observed” yields the same marginal mean as setting that predictor variable equal to its mean. When we specify atmeans, we are setting turn equal to its mean of 39.7971. The difference comes from the quadratic effect of turn. Because we specified c.turn#c.turn, Stata knows to take the mean of turn before squaring, setting the interaction equal to \(39.7971^2 = 1583.8092\). When we omit atmeans and treat continuous predictors as observed, the quadratic term is squared before averaging predictions. The square of the average is not the same as the average of the squares, so the margins are different.
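The two quantities can be compared directly in Stata. The following is a minimal sketch under stated assumptions: the variable name turn_sq is hypothetical and introduced only for this check, and the condition !missing(rep78) is used to approximate the 69-observation estimation sample, because regress drops the cars with a missing repair record.

```stata
* Sketch only: compare the square of the mean with the mean of the squares
sysuse auto, clear

* Square of the mean of turn over the 69 cars with a repair record
quietly summarize turn if !missing(rep78)
display "square of the mean  = " r(mean)^2

* Mean of the squared values over the same cars (turn_sq is a hypothetical name)
generate double turn_sq = turn^2
quietly summarize turn_sq if !missing(rep78)
display "mean of the squares = " r(mean)
```

The first display should be close to the 1583.8092 quoted above, and the second should be noticeably larger, which is why the two kinds of margin disagree once a quadratic term is in the model.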
This is one of the advantages of using factor-variable notation. If you were to manually generate the quadratic term turn*turn as a new variable and include it in the regression model, Stata's margins command wouldn't be able to recognize the relationship between turn and the quadratic term:
. generate turn2 = turn*turn

. regress mpg i.foreign i.rep78 c.turn i.foreign#c.turn i.rep78#c.turn c.turn2
(output omitted)

. margins foreign, asbalanced atmeans

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78 (asbalanced)
    turn  =  39.7971 (mean)
    turn2 = 1603.246 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
   Domestic  |   21.88644   1.432177    15.28   0.000     19.01745    24.75544
    Foreign  |   14.29791   3.494918     4.09   0.000     7.296746    21.29907
------------------------------------------------------------------------------
Because we created turn2 as a new variable, margins sees turn2 as a regular continuous variable and sets its value equal to its mean, 1603.246, which does not equal the square of the mean of turn (\(39.7971^2 = 1583.8092\)). This yields margins that are numerically equivalent to the predictive marginal means we calculated by omitting atmeans.
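The size of the gap between the two kinds of margin can be checked by hand. The sketch below assumes that, with foreign fixed and rep78 balanced, every term in the prediction other than the quadratic one is linear in turn and therefore contributes equally to both margins; \(b_{\text{turn}^2}\) is shorthand introduced here for the coefficient on c.turn#c.turn.

```latex
\[
\begin{aligned}
\text{predictive} - \text{adjusted}
  &= b_{\text{turn}^2}
     \left(\overline{\text{turn}^2} - \overline{\text{turn}}^{\,2}\right) \\
  &= -0.0073711 \times (1603.246 - 1583.8092) \\
  &= -0.0073711 \times 19.4368 \\
  &\approx -0.1433
\end{aligned}
\]
```

This agrees, up to rounding, with the reported margins: \(21.88644 - 22.02972 = -0.14328\) for domestic cars and \(14.29791 - 14.44118 = -0.14327\) for foreign cars.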
Similarly, when it comes to interactions between two categorical variables, Stata knows how to determine the appropriate proportion for each level of the interaction when asbalanced is specified. For example, consider a linear regression of mpg on categorical predictors foreign and rep78, as well as their interaction. We include the condition if rep78 > 2 to restrict our analysis to the subset of data where rep78 is greater than two.
. regress mpg i.foreign i.rep78 i.foreign#i.rep78 if rep78 > 2

      Source |       SS           df       MS      Number of obs   =        59
-------------+----------------------------------   F(5, 53)        =      6.10
       Model |   796.45951         5  159.291902   Prob > F        =    0.0002
    Residual |  1383.77778        53  26.1090147   R-squared       =    0.3653
-------------+----------------------------------   Adj R-squared   =    0.3054
       Total |  2180.23729        58  37.5902981   Root MSE        =    5.1097

-------------------------------------------------------------------------------
          mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
--------------+----------------------------------------------------------------
      foreign |
     Foreign  |   4.333333   3.109663     1.39   0.169    -1.903861    10.57053
              |
        rep78 |
           4  |  -.5555556   1.966724    -0.28   0.779    -4.500304    3.389193
           5  |         13    3.74453     3.47   0.001     5.489423    20.51058
              |
foreign#rep78 |
   Foreign#4  |   2.111111   3.933447     0.54   0.594    -5.778385    10.00061
   Foreign#5  |        -10   5.062165    -1.98   0.053    -20.15342    .1534172
              |
        _cons |         19   .9833619    19.32   0.000     17.02763    20.97237
-------------------------------------------------------------------------------
To calculate the overall (grand) EMM, we specify margins, asbalanced to weight each factor level and interaction evenly.
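. margins, asbalanced

Adjusted predictions                                        Number of obs = 59
Model VCE: OLS

Expression: Linear prediction, predict()
At: foreign (asbalanced)
    rep78   (asbalanced)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |         24   .9343375    25.69   0.000     22.12596    25.87404
------------------------------------------------------------------------------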
Factor variable foreign takes two values (0 and 1), and we have restricted rep78 to three values (3, 4, and 5), so our EMM is calculated with both levels of foreign set at \(\frac{1}{2}\) and each level of rep78 set at \(\frac{1}{3}\). There are six levels of the i.foreign#i.rep78 interaction, so balancing sets each level of the interaction equal to \(\frac{1}{6}\). To calculate the EMM manually, we access the model coefficients using _b[varname] notation.
. scalar EMM = _b[_cons] + 1/2 * _b[0.foreign] + 1/2 * _b[1.foreign] + 1/3 * _b[3.rep78] + 1/3 * _b[4.rep78] + 1/3 * _b[5.rep78] + 1/6 * _b[0.foreign#3.rep78] + 1/6 * _b[1.foreign#3.rep78] + 1/6 * _b[0.foreign#4.rep78] + 1/6 * _b[1.foreign#4.rep78] + 1/6 * _b[0.foreign#5.rep78] + 1/6 * _b[1.foreign#5.rep78]

. display "EMM = " EMM
EMM = 24
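There is also a model-free way to arrive at the same number. Because this model includes the full foreign-by-rep78 interaction, its fitted values are the cell means of mpg, so the balanced EMM should equal the unweighted average of the six cell means. The sketch below is an illustration under that assumption; the variable name cellmean is hypothetical, and !missing(rep78) is needed because Stata treats missing values as larger than any number, so rep78 > 2 alone would keep the cars with no repair record.

```stata
* Sketch only: average the six cell means of mpg directly
sysuse auto, clear

* Keep the estimation sample: rep78 in {3, 4, 5}, excluding missing values
keep if rep78 > 2 & !missing(rep78)

* One row per foreign-by-rep78 cell, holding that cell's mean mpg
collapse (mean) cellmean = mpg, by(foreign rep78)
list, sepby(foreign)

* The unweighted mean of the six cell means should equal the EMM of 24
summarize cellmean
```

Each cell counts once in this average no matter how many cars it contains, which is exactly what treating the factors as balanced means.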
When we include interactions between two categorical variables and the data contain empty cells (combinations of factor levels with no observations), the EMM is not estimable. To demonstrate, we return to the model with a categorical-by-categorical interaction we saw in section 3, but we remove the restriction that rep78 be greater than two.
. regress mpg i.foreign i.rep78 i.foreign#i.rep78
note: 1.foreign#1b.rep78 identifies no observations in the sample.
note: 1.foreign#2.rep78 identifies no observations in the sample.
note: 1.foreign#5.rep78 omitted because of collinearity.

      Source |       SS           df       MS      Number of obs   =        69
-------------+----------------------------------   F(7, 61)        =      4.88
       Model |  839.550121         7  119.935732   Prob > F        =    0.0002
    Residual |  1500.65278        61  24.6008652   R-squared       =    0.3588
-------------+----------------------------------   Adj R-squared   =    0.2852
       Total |   2340.2029        68  34.4147485   Root MSE        =    4.9599

-------------------------------------------------------------------------------
          mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
--------------+----------------------------------------------------------------
      foreign |
     Foreign  |  -5.666667   3.877352    -1.46   0.149    -13.41991    2.086579
              |
        rep78 |
           2  |     -1.875   3.921166    -0.48   0.634    -9.715855    5.965855
           3  |         -2   3.634773    -0.55   0.584    -9.268178    5.268178
           4  |  -2.555556   3.877352    -0.66   0.512     -10.3088     5.19769
           5  |         11   4.959926     2.22   0.030     1.082015    20.91798
              |
foreign#rep78 |
   Foreign#1  |          0  (empty)
   Foreign#2  |          0  (empty)
   Foreign#3  |         10   4.913786     2.04   0.046     .1742775    19.82572
   Foreign#4  |   12.11111   4.527772     2.67   0.010     3.057271    21.16495
   Foreign#5  |          0  (omitted)
              |
        _cons |         21   3.507197     5.99   0.000     13.98693    28.01307
-------------------------------------------------------------------------------
The note at the top of the output informs us that two of the combinations of rep78 and foreign identify no observations in the sample: 1.foreign#1.rep78 and 1.foreign#2.rep78 are empty cells. Nevertheless, we attempt to calculate the EMM, which assumes equal numbers of observations at each level of foreign and rep78 and at each combination of i.foreign#i.rep78.
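. margins, asbalanced

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: foreign (asbalanced)
    rep78   (asbalanced)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |          .  (not estimable)
------------------------------------------------------------------------------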
In this case, it is impossible to calculate the EMM because the regression model has no parameter estimates for the empty cells. To get around this limitation, Searle, Speed, and Milliken (1980) introduced the concept of modified EMMs, where margins involving empty cells are redefined so that they become estimable. You can use the emptycells(reweight) option to implement this method, which adjusts the weights of the observed cells to account for the empty cells.
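. margins, asbalanced emptycells(reweight)

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
Empty cells: reweight
At: foreign (asbalanced)
    rep78   (asbalanced)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |   23.01563   .8384108    27.45   0.000     21.33912    24.69213
------------------------------------------------------------------------------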
To identify the empty cells, we create a frequency table for foreign and rep78:
. * drop value label "origin" to see that -foreign- takes values 0 and 1
. label drop origin

. table foreign rep78, nototals zerocounts

-------------------------------------
           |    Repair record 1978
           |   1    2    3    4    5
-----------+-------------------------
Car origin |
         0 |   2    8   27    9    2
         1 |   0    0    3    9    9
-------------------------------------
The two levels of foreign and five levels of rep78 combine to create 10 interactions or “cells”, two of which are empty because there are no observations with 1.foreign#1.rep78 or 1.foreign#2.rep78. To calculate the modified EMM, margins ignores the two empty cells and constructs weights proportional to the number of nonempty cells. For example, foreign equals 0 in five of the eight nonempty cells, and it equals 1 in the other three nonempty cells. Option emptycells(reweight) tells margins to apply a weight of \(\frac{5}{8}\) to 0.foreign and a weight of \(\frac{3}{8}\) to 1.foreign. To calculate the modified EMM manually, we weight each factor and interaction proportional to the number of nonempty cells.
. scalar mEMM = _b[_cons] + 5/8 * _b[0.foreign] + 3/8 * _b[1.foreign] + 1/8 * _b[1.rep78] + 1/8 * _b[2.rep78] + 2/8 * _b[3.rep78] + 2/8 * _b[4.rep78] + 2/8 * _b[5.rep78] + 1/8 * _b[0.foreign#1.rep78] + 1/8 * _b[0.foreign#2.rep78] + 1/8 * _b[0.foreign#3.rep78] + 1/8 * _b[1.foreign#3.rep78] + 1/8 * _b[0.foreign#4.rep78] + 1/8 * _b[1.foreign#4.rep78] + 1/8 * _b[0.foreign#5.rep78] + 1/8 * _b[1.foreign#5.rep78]

. display "Modified EMM = " mEMM
Modified EMM = 23.015625
This result is consistent with the modified EMM calculated by margins with options asbalanced and emptycells(reweight).
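Because every nonempty cell receives a weight of \(\frac{1}{8}\), the modified EMM can also be read as the simple average of the eight nonempty cell predictions. The sketch below rebuilds each cell prediction from the coefficients of the full interaction model above (base levels and the empty or omitted interaction terms contribute zero); the notation \(\hat{\mu}_{f,r}\) for the prediction at foreign = \(f\) and rep78 = \(r\) is introduced here for illustration.

```latex
\[
\begin{aligned}
\hat{\mu}_{0,1} &= 21 &
\hat{\mu}_{0,2} &= 21 - 1.875 = 19.125 \\
\hat{\mu}_{0,3} &= 21 - 2 = 19 &
\hat{\mu}_{0,4} &= 21 - 2.555556 \approx 18.4444 \\
\hat{\mu}_{0,5} &= 21 + 11 = 32 &
\hat{\mu}_{1,3} &= 21 - 5.666667 - 2 + 10 \approx 23.3333 \\
\hat{\mu}_{1,4} &= 21 - 5.666667 - 2.555556 + 12.11111 \approx 24.8889 &
\hat{\mu}_{1,5} &= 21 - 5.666667 + 11 \approx 26.3333
\end{aligned}
\]
\[
\text{modified EMM}
  = \frac{1}{8}\sum_{\text{nonempty cells}} \hat{\mu}_{f,r}
  \approx \frac{184.125}{8}
  = 23.015625
\]
```

The two empty cells simply drop out of the average, which is why the weights on foreign become \(\frac{5}{8}\) and \(\frac{3}{8}\) rather than \(\frac{1}{2}\) each.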
Searle, S. R., F. M. Speed, and G. A. Milliken. 1980. Population marginal means in the linear model: An alternative to least squares means. American Statistician 34: 216–221. https://doi.org/10.2307/2684063.