FAQ: How do I calculate least-squares means in Stata?

How do I calculate least-squares means in Stata?

Title		Compute least-squares means using margins, asbalanced
Authors		Alex Asher, StataCorp; Miguel Dorta, StataCorp; Mia Lv, StataCorp

This FAQ demonstrates how to calculate least-squares means in Stata. A least-squares mean is a population mean that is calculated using parameter estimates from a statistical model. But where does the name “least-squares mean” come from? The first use of the expression “least-squares mean” involved linear models, where parameter estimates were obtained using the method of least squares. In that context, the term “least-squares” served to emphasize that the mean was calculated using parameter estimates. This is in contrast to the arithmetic mean, which is calculated using the observed data.

The concept of a least-squares mean makes no requirement that the least-squares method be used to calculate parameter estimates, so Searle, Speed, and Milliken (1980) introduced the more descriptive name “estimated marginal mean” (EMM). We treat the two terms as synonyms: both describe a marginal mean that is calculated under the assumption that each factor variable, and each interaction between factor variables, is balanced (that is, every level receives the same proportion).

You can obtain these estimates using the margins command with the asbalanced option after fitting a regression model. This FAQ demonstrates how to calculate and interpret EMMs in several scenarios.

This FAQ is organized as follows:

1. EMMs after a model including only categorical covariates
2. EMMs after a model including both categorical and continuous covariates
3. EMMs after a model including categorical and continuous covariates, as well as interactions
4. EMMs after a model with empty cells

1. EMMs after a model including only categorical covariates

We use the classic auto dataset, and we regress the continuous outcome variable mpg on two categorical predictor variables: foreign and rep78. Factor-variable notation i.foreign and i.rep78 tells Stata that foreign and rep78 are categorical (factor) variables.

. sysuse auto, clear
(1978 automobile data)

. regress mpg i.foreign i.rep78

      Source |       SS           df       MS      Number of obs   =        69
-------------+----------------------------------   F(5, 63)        =      4.96
       Model |  661.189524         5  132.237905   Prob > F        =    0.0007
    Residual |  1679.01337        63  26.6510059   R-squared       =    0.2825
-------------+----------------------------------   Adj R-squared   =    0.2256
       Total |   2340.2029        68  34.4147485   Root MSE        =    5.1625

         mpg   Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
   
     foreign                                                                  
    Foreign      3.556584   1.736681     2.05   0.045     .0861046    7.027064
                                                                              
       rep78                                                                  
          2        -1.875   4.081284    -0.46   0.648     -10.0308    6.280795
          3     -1.922325   3.774126    -0.51   0.612    -9.464315    5.619665
          4     -1.111626   3.944633    -0.28   0.779    -8.994346    6.771095
          5      3.453704   4.215132     0.82   0.416    -4.969565    11.87697
                                                                              
       _cons           21   3.650411     5.75   0.000     13.70524    28.29476

After fitting the linear regression model, we want to calculate the grand (overall) EMM for the model, so we execute the margins command without a marginslist. We specify option asbalanced to calculate the mean mpg that we would expect from a population with equal numbers of foreign and domestic automobiles and equal numbers of cars at each level of rep78.

. margins, asbalanced

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: foreign   (asbalanced)
    rep78     (asbalanced)



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
       _cons     22.48724   .9995692    22.50   0.000     20.48976    24.48472

The EMM of 22.487 indicates that if both foreign and rep78 were balanced, we would expect the average car to get nearly 22.5 miles per gallon.

But what if we wanted to calculate the EMMs for both foreign and domestic vehicles, while treating rep78 as balanced? We would put foreign in our marginslist like so:

. margins foreign, asbalanced

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78     (asbalanced)



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
     foreign                                                                  
   Domestic      20.70895   1.049089    19.74   0.000     18.61251    22.80539
    Foreign      24.26553   1.551038    15.64   0.000     21.16603    27.36504

Here we treat foreign as the marginslist, assuming that all other categorical variables (in this case, only rep78) are balanced.
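As a small follow-up sketch (not part of the original example), the difference between these two EMMs can be requested with the pwcompare option of margins. Because this model contains no interactions, that difference should simply equal the coefficient on 1.foreign reported by regress (about 3.557).

```stata
* Sketch: compare the foreign and domestic EMMs from the additive model
sysuse auto, clear
quietly regress mpg i.foreign i.rep78

* EMMs for each level of foreign, plus their pairwise difference
margins foreign, asbalanced pwcompare

* With no interactions in the model, the difference in EMMs is just
* the coefficient on 1.foreign (margins without -post- leaves e() intact)
display "Foreign - Domestic = " _b[1.foreign]
```

The same comparison in a model with interactions would generally not reduce to a single coefficient, which is where pwcompare becomes more useful.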

To better understand what margins is doing, we can reproduce the above margins results by manually specifying the values of foreign and rep78 using the at() option. Variable foreign takes two values and rep78 takes five, so to treat them as balanced, we calculate our margin with both levels of foreign at \(\frac{1}{2}\) and each level of rep78 at \(\frac{1}{5}\).

. * reproduce the first example: -margins, asbalanced-

. margins, at(0.foreign=0.5 1.foreign=0.5 1.rep78=0.2 2.rep78=0.2 3.rep78=0.2 
     4.rep78=0.2 5.rep78=0.2)

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: 0.foreign = .5
    1.foreign = .5
    1.rep78   = .2
    2.rep78   = .2
    3.rep78   = .2
    4.rep78   = .2
    5.rep78   = .2



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
       _cons     22.48724   .9995691    22.50   0.000     20.48976    24.48472

To reproduce the second example, we repeat the at() option, setting foreign to 0 in the first instance and setting it to 1 in the second instance. In both cases, we set each level of rep78 to \(\frac{1}{5}\).

. * reproduce the second example: -margins foreign, asbalanced-

. margins, at(0.foreign=1 1.rep78=0.2 2.rep78=0.2 3.rep78=0.2 4.rep78=0.2 5.rep78=0.2) 
            at(1.foreign=1 1.rep78=0.2 2.rep78=0.2 3.rep78=0.2 4.rep78=0.2 5.rep78=0.2)

Predictive margins                                          Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
1._at: foreign =  0
       1.rep78 = .2
       2.rep78 = .2
       3.rep78 = .2
       4.rep78 = .2
       5.rep78 = .2
2._at: foreign =  1
       1.rep78 = .2
       2.rep78 = .2
       3.rep78 = .2
       4.rep78 = .2
       5.rep78 = .2



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
         _at                                                                  
          1      20.70895   1.049089    19.74   0.000     18.61251    22.80539
          2      24.26553   1.551038    15.64   0.000     21.16603    27.36504
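The balanced weights used in the at() options above can also be applied directly to the coefficients. The following is a minimal sketch, assuming the same model as in this section, that uses lincom to form the grand EMM as a linear combination; it should reproduce the margin of 22.48724 from margins, asbalanced, along with its standard error.

```stata
* Sketch: grand EMM as a linear combination of the regression coefficients
sysuse auto, clear
quietly regress mpg i.foreign i.rep78

* Base levels (0.foreign and 1.rep78) have coefficients of 0, so only the
* nonbase levels appear: weight 1/2 on foreign, 1/5 on each level of rep78
lincom _b[_cons] + 0.5*_b[1.foreign]          ///
    + 0.2*_b[2.rep78] + 0.2*_b[3.rep78]       ///
    + 0.2*_b[4.rep78] + 0.2*_b[5.rep78]

display "EMM = " r(estimate)
```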

2. EMMs after a model including both categorical and continuous covariates

When calculating EMMs in the presence of continuous covariates, some authors recommend setting the continuous covariates to their mean values. This approach yields what Searle, Speed, and Milliken (1980) refer to as adjusted treatment means. Adjusted treatment means can be obtained using the asbalanced and atmeans options together or, equivalently, by specifying option at((asbalanced) _factor (mean) _continuous). Users coming from a SAS background will recognize the adjusted treatment mean as the default margin calculated by SAS’s LSMEANS statement.

However, if you prefer not to set continuous covariates to their means, you can omit the atmeans option. In this case, the margins command will use the observed values of the continuous covariates rather than assuming specific values. This is known as a predictive margin.

To demonstrate, we add continuous predictor variables price and weight to the model.

. regress mpg i.foreign i.rep78 price weight

      Source |       SS           df       MS      Number of obs   =        69
-------------+----------------------------------   F(7, 61)        =     19.90
       Model |  1627.48579         7   232.49797   Prob > F        =    0.0000
    Residual |  712.717109        61   11.683887   R-squared       =    0.6954
-------------+----------------------------------   Adj R-squared   =    0.6605
       Total |   2340.2029        68  34.4147485   Root MSE        =    3.4182

         mpg   Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
   
     foreign                                                                  
    Foreign      -3.13669    1.52857    -2.05   0.044    -6.193255    -.080126
                                                                              
       rep78                                                                  
          2     -.3114325   2.710206    -0.11   0.909    -5.730825     5.10796
          3     -.0729958   2.513312    -0.03   0.977    -5.098675    4.952683
          4      .6498104    2.62137     0.25   0.805    -4.591943    5.891564
          5      3.799259   2.800019     1.36   0.180    -1.799725    9.398244
                                                                              
       price     .0000604   .0002015     0.30   0.765    -.0003424    .0004633
      weight    -.0064961   .0009727    -6.68   0.000    -.0084411   -.0045511
       _cons       40.862   3.447222    11.85   0.000     33.96885    47.75514

First, let's calculate the adjusted treatment means for foreign and domestic vehicles. In addition to option asbalanced, we add option atmeans, which instructs margins to set continuous variables price and weight to their means.

. margins foreign, asbalanced atmeans

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78     (asbalanced)
    price   = 6146.043 (mean)
    weight  = 3032.029 (mean)



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
     foreign                                                                  
   Domestic      22.35008   .7562743    29.55   0.000     20.83781    23.86234
    Foreign      19.21339   1.250939    15.36   0.000     16.71198    21.71479

In the output, price is set to its mean value of 6146.043, and weight is set to its mean value of 3032.029.

Adjusted treatment means can also be specified using the at() option. Here we use at() to treat factor variables as balanced and set continuous variables equal to their means.

. margins foreign, at((asbalanced) _factor  (mean) _continuous)

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78     (asbalanced)
    price   = 6146.043 (mean)
    weight  = 3032.029 (mean)



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
     foreign                                                                  
   Domestic      22.35008   .7562743    29.55   0.000     20.83781    23.86234
    Foreign      19.21339   1.250939    15.36   0.000     16.71198    21.71479

We can even specify the means of continuous variables price and weight directly in the at() option:

. margins foreign, asbalanced at(price=6146.043 weight=3032.029)

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78     (asbalanced)
    price   = 6146.043
    weight  = 3032.029



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
     foreign                                                                  
   Domestic      22.35008   .7562742    29.55   0.000     20.83781    23.86234
    Foreign      19.21339   1.250939    15.36   0.000     16.71198    21.71479

Each of these equivalent syntaxes for calculating the adjusted treatment mean yields the same inference.

Next, let's consider a scenario where we choose not to set continuous variables to their mean values. In this case, margins treats the continuous variables “as observed”, calculating a predicted outcome for each observation and averaging these predictions to yield an EMM known as the predictive marginal mean.

. margins foreign, asbalanced

Predictive margins                                          Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78     (asbalanced)



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
     foreign                                                                  
   Domestic      22.35008   .7562743    29.55   0.000     20.83781    23.86234
    Foreign      19.21339   1.250939    15.36   0.000     16.71198     21.7148

In this case, the categorical variable rep78 is treated as balanced, but continuous variables price and weight are left as observed. Interestingly, the predictive marginal means for foreign and domestic cars are the same as the adjusted treatment means we calculated using the atmeans option. This is because we used a linear regression model that doesn’t contain any polynomial terms, so the average of the predictions is the same as the prediction at the average values of price and weight.

For an example where atmeans does have an effect, please see section 3 below, where the model contains a quadratic effect. The atmeans option also affects margins after logistic regression and other generalized linear models.
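The equality of the two sets of margins in this linear model can be checked programmatically rather than by eye. This sketch stores the margins from both commands (margins returns them in r(b)) and compares them with the matrix function mreldif(); the matrix names b_atmeans and b_observed are arbitrary choices made for illustration.

```stata
* Sketch: confirm that atmeans makes no difference in this linear model
sysuse auto, clear
quietly regress mpg i.foreign i.rep78 price weight

* Adjusted treatment means: continuous covariates fixed at their means
quietly margins foreign, asbalanced atmeans
matrix b_atmeans = r(b)

* Predictive marginal means: continuous covariates left as observed
quietly margins foreign, asbalanced
matrix b_observed = r(b)

* mreldif() returns the largest relative difference between two matrices;
* a value near zero means the two sets of margins agree
display "max relative difference = " mreldif(b_atmeans, b_observed)
```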

The asbalanced option only affects factor variables, so the previous command

. margins foreign, asbalanced

did not specify how continuous variables price and weight were to be treated; by default, they are left as observed. We get the same predictive marginal means by explicitly setting the continuous variables to asobserved.

. margins foreign, asbalanced at((asobserved) _continuous)

(output omitted)

3. EMMs after a model including categorical and continuous covariates, as well as interactions

Now we consider a more complicated model with categorical predictors foreign and rep78, continuous predictor turn, and interactions between the predictors. We use the notation c.turn to tell Stata that turn is a continuous variable, and we use # to indicate interactions between variables. The interaction c.turn#c.turn is the quadratic effect of variable turn (that is, the effect of turn²).

. regress mpg i.foreign i.rep78 c.turn i.foreign#c.turn i.rep78#c.turn c.turn#c.turn

      Source |       SS           df       MS      Number of obs   =        69
-------------+----------------------------------   F(12, 56)       =      9.18
       Model |  1551.32495        12  129.277079   Prob > F        =    0.0000
    Residual |  788.877953        56  14.0871063   R-squared       =    0.6629
-------------+----------------------------------   Adj R-squared   =    0.5907
       Total |   2340.2029        68  34.4147485   Root MSE        =    3.7533

         mpg    Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
   
      foreign                                                                  
     Foreign      41.02793   30.56578     1.34   0.185    -20.20268    102.2585
                                                                               
        rep78                                                                  
           2     -76.43955   111.6836    -0.68   0.497    -300.1687    147.2896
           3     -90.47575   109.0871    -0.83   0.410    -309.0034    128.0519
           4     -79.38446   109.3388    -0.73   0.471    -298.4164    139.6475
           5     -34.72543   115.9266    -0.30   0.766    -266.9544    197.5035
                                                                               
         turn    -2.395572   3.315296    -0.72   0.473    -9.036909    4.245764
                                                                               
      foreign#                                                                  
       c.turn                                                                  
     Foreign     -1.221608   .8543636    -1.43   0.158    -2.933104    .4898878
                                                                               
 rep78#c.turn                                                                  
           2      1.885053   2.715978     0.69   0.491    -3.555705    7.325811
           3       2.17802   2.659639     0.82   0.416    -3.149878    7.505918
           4      1.911901   2.665971     0.72   0.476    -3.428681    7.252482
           5      .7628724   2.878065     0.27   0.792    -5.002584    6.528328
                                                                               
c.turn#c.turn    -.0073711   .0242301    -0.30   0.762    -.0559097    .0411676
                                                                               
        _cons     131.6166   116.2079     1.13   0.262    -101.1758     364.409

To calculate predictive marginal means for foreign and domestic automobiles, we leave continuous variables as observed.

. margins foreign, asbalanced

Predictive margins                                          Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78     (asbalanced)



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
     foreign                                                                  
   Domestic      21.88644   1.432176    15.28   0.000     19.01745    24.75544
    Foreign      14.29791   3.494918     4.09   0.000     7.296743    21.29907

For the adjusted treatment means, we add option atmeans.

. margins foreign, asbalanced atmeans

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78     (asbalanced)
    turn    = 39.7971 (mean)



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
     foreign                                                                  
   Domestic      22.02972   1.406687    15.66   0.000     19.21178    24.84765
    Foreign      14.44118   3.376853     4.28   0.000     7.676532    21.20583

Now option atmeans has a visible impact: the adjusted treatment means are different from the predictive marginal means. But why?

As we saw in section 2, when calculating margins based on a linear regression model, leaving a predictor variable “as observed” yields the same marginal mean as setting that predictor variable equal to its mean. When we specify atmeans, we are setting turn equal to its mean of 39.7971. The difference comes from the quadratic effect of turn. Because we specified c.turn#c.turn, Stata knows to take the mean of turn before squaring, setting the interaction equal to 39.7971² = 1583.8092. When we omit atmeans and treat continuous predictors as observed, turn is squared for each observation before the predictions are averaged. The square of the average is not the same as the average of the squares, so the margins are different.

This is one of the advantages of using factor-variable notation. If you were to manually generate the quadratic term turn*turn as a new variable and include it in the regression model, Stata's margins command wouldn't be able to recognize the relationship between turn and the quadratic term:

. generate turn2 = turn*turn

. regress mpg i.foreign i.rep78 c.turn i.foreign#c.turn i.rep78#c.turn c.turn2

(output omitted)

. margins foreign, asbalanced atmeans

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: rep78     (asbalanced)
    turn    =  39.7971 (mean)
    turn2   = 1603.246 (mean)



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
     foreign                                                                  
   Domestic      21.88644   1.432177    15.28   0.000     19.01745    24.75544
    Foreign      14.29791   3.494918     4.09   0.000     7.296746    21.29907

Because we created turn2 as a new variable, margins sees turn2 as a regular continuous variable and sets its value equal to its mean, 1603.246, which does not equal the square of the mean of turn (39.7971² = 1583.8092). This yields margins that are numerically equivalent to the predictive marginal means we calculated by omitting atmeans.
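The arithmetic behind this difference can be checked directly. The sketch below, which assumes the factor-variable version of the model, computes the square of the mean of turn and the mean of the squares of turn over the estimation sample, then multiplies their difference by the quadratic coefficient. The variable name turn_sq_demo and the scalar names are hypothetical, chosen only for this illustration. The product should be roughly -0.143, which is the gap between the predictive marginal means and the adjusted treatment means shown above (for example, 21.88644 versus 22.02972 for domestic cars).

```stata
* Sketch: square of the mean versus mean of the squares
sysuse auto, clear
quietly regress mpg i.foreign i.rep78 c.turn i.foreign#c.turn ///
    i.rep78#c.turn c.turn#c.turn

* Square of the mean of turn over the estimation sample
summarize turn if e(sample), meanonly
scalar sq_of_mean = r(mean)^2

* Mean of the squares of turn over the same sample
generate double turn_sq_demo = turn^2 if e(sample)
summarize turn_sq_demo, meanonly
scalar mean_of_sq = r(mean)

display "square of mean  = " sq_of_mean
display "mean of squares = " mean_of_sq

* The other terms are linear in turn, so only the quadratic term
* contributes to the gap between the two kinds of margins
display "implied gap     = " _b[c.turn#c.turn] * (mean_of_sq - sq_of_mean)
```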

Similarly, when it comes to interactions between two categorical variables, Stata knows how to determine the appropriate proportion for each level of the interaction when asbalanced is specified. For example, consider a linear regression of mpg on categorical predictors foreign and rep78, as well as their interaction. We include the condition if rep78 > 2 to restrict our analysis to the subset of data where rep78 is greater than two.

. regress mpg i.foreign i.rep78 i.foreign#i.rep78 if rep78 > 2

      Source |       SS           df       MS      Number of obs   =        59
-------------+----------------------------------   F(5, 53)        =      6.10
       Model |   796.45951         5  159.291902   Prob > F        =    0.0002
    Residual |  1383.77778        53  26.1090147   R-squared       =    0.3653
-------------+----------------------------------   Adj R-squared   =    0.3054
       Total |  2180.23729        58  37.5902981   Root MSE        =    5.1097

         mpg    Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
   
      foreign                                                                  
     Foreign      4.333333   3.109663     1.39   0.169    -1.903861    10.57053
                                                                               
        rep78                                                                  
           4     -.5555556   1.966724    -0.28   0.779    -4.500304    3.389193
           5            13    3.74453     3.47   0.001     5.489423    20.51058
                                                                               
foreign#rep78                                                                  
   Foreign#4            13    3.74453     3.47   0.001     5.489423    20.51058
   Foreign#5           -10   5.062165    -1.98   0.053    -20.15342    .1534172
                                                                               
        _cons           19   .9833619    19.32   0.000     17.02763    20.97237

To calculate the overall (grand) EMM, we specify margins, asbalanced to weight each factor level and interaction evenly.

. margins, asbalanced

Adjusted predictions                                        Number of obs = 59
Model VCE: OLS

Expression: Linear prediction, predict()
At: foreign   (asbalanced)
    rep78     (asbalanced)



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
       _cons           24   .9343375    25.69   0.000     22.12596    25.87404

Factor variable foreign takes two values (0 and 1), and we have restricted rep78 to three values (3, 4, and 5), so our EMM is calculated with both levels of foreign set at \(\frac{1}{2}\) and each level of rep78 set at \(\frac{1}{3}\). There are six levels of the i.foreign#i.rep78 interaction, so balancing sets each level of the interaction equal to \(\frac{1}{6}\). To calculate the EMM manually, we access the model coefficients using _b[varname] notation.

. scalar EMM = _b[_cons] + 1/2 * _b[0.foreign] + 1/2 * _b[1.foreign] 
              + 1/3 * _b[3.rep78] + 1/3 * _b[4.rep78] + 1/3 * _b[5.rep78] 
              + 1/6 * _b[0.foreign#3.rep78] + 1/6 * _b[1.foreign#3.rep78] 
              + 1/6 * _b[0.foreign#4.rep78] + 1/6 * _b[1.foreign#4.rep78] 
              + 1/6 * _b[0.foreign#5.rep78] + 1/6 * _b[1.foreign#5.rep78]

. display "EMM = " EMM
EMM = 24
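Because this model includes both main effects and their full interaction, each of the six foreign-by-rep78 cells has its own fitted mean, and the balanced grand EMM is just the unweighted average of those six cell means. The following sketch, which works from the raw data rather than the coefficients, illustrates this under that assumption.

```stata
* Sketch: grand EMM as the unweighted average of the six cell means
sysuse auto, clear

* rep78 > 2 alone would also keep missing values, so exclude them explicitly
keep if rep78 > 2 & !missing(rep78)

* One row per foreign-by-rep78 cell, holding that cell's mean mpg
collapse (mean) mpg, by(foreign rep78)
list, sepby(foreign)

* Unweighted average of the six cell means
summarize mpg, meanonly
display "EMM = " r(mean)
```

This shortcut relies on the model being fully interacted; in the additive models of sections 1 and 2 the fitted cell means differ from the observed ones, so the coefficient-based calculation is needed.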

4. EMMs after a model with empty cells

When we include interactions between two categorical variables and the data contain empty cells (combinations of factor levels with no observations), the EMM is not estimable. To demonstrate, we return to the model with a categorical-by-categorical interaction we saw in section 3, but we remove the restriction that rep78 be greater than two.

. regress mpg i.foreign i.rep78 i.foreign#i.rep78
note: 1.foreign#1b.rep78 identifies no observations in the sample.
note: 1.foreign#2.rep78 identifies no observations in the sample.
note: 1.foreign#5.rep78 omitted because of collinearity.

      Source |       SS           df       MS      Number of obs   =        69
-------------+----------------------------------   F(7, 61)        =      4.88
       Model |  839.550121         7  119.935732   Prob > F        =    0.0002
    Residual |  1500.65278        61  24.6008652   R-squared       =    0.3588
-------------+----------------------------------   Adj R-squared   =    0.2852
       Total |   2340.2029        68  34.4147485   Root MSE        =    4.9599

         mpg    Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
   
      foreign                                                                  
     Foreign     -5.666667   3.877352    -1.46   0.149    -13.41991    2.086579
                                                                               
        rep78                                                                  
           2        -1.875   3.921166    -0.48   0.634    -9.715855    5.965855
           3            -2   3.634773    -0.55   0.584    -9.268178    5.268178
           4     -2.555556   3.877352    -0.66   0.512     -10.3088     5.19769
           5            11   4.959926     2.22   0.030     1.082015    20.91798
                                                                               
foreign#rep78                                                                  
   Foreign#1             0  (empty)                                            
   Foreign#2             0  (empty)                                            
   Foreign#3            10   4.913786     2.04   0.046     .1742775    19.82572
   Foreign#4      12.11111   4.527772     2.67   0.010     3.057271    21.16495
   Foreign#5             0  (omitted)                                          
                                                                               
        _cons           21   3.507197     5.99   0.000     13.98693    28.01307

The first two notes at the top of the output inform us that two of the combinations of rep78 and foreign identify no observations in the sample: 1.foreign#1.rep78 and 1.foreign#2.rep78 are empty cells. Nevertheless, we attempt to calculate the EMM, which assumes equal numbers of observations at each level of foreign and rep78 and at each combination of i.foreign#i.rep78.

. margins, asbalanced

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression: Linear prediction, predict()
At: foreign   (asbalanced)
    rep78     (asbalanced)



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
       _cons            .  (not estimable)

In this case, it is impossible to calculate the EMM because the regression model has no parameter estimates for the empty cells. To get around this limitation, Searle, Speed, and Milliken (1980) introduced the concept of modified EMMs, where margins involving empty cells are redefined so that they become estimable. You can use the emptycells(reweight) option to implement this method, which adjusts the weights of the observed cells to account for the empty cells.

. margins, asbalanced emptycells(reweight)

Adjusted predictions                                        Number of obs = 69
Model VCE: OLS

Expression:  Linear prediction, predict()
Empty cells: reweight
At: foreign   (asbalanced)
    rep78     (asbalanced)



                          Delta-method                                        
                   Margin   std. err.      t    P>|t|     [95% conf. interval]
   
       _cons     23.01563   .8384108    27.45   0.000     21.33912    24.69213

To identify the empty cells, we create a frequency table for foreign and rep78:

. * drop value label "origin" to see that -foreign- takes values 0 and 1

. label drop origin

. table foreign rep78, nototals zerocounts



                  Repair record 1978    
                1     2      3    4    5
   
Car origin                              
  0             2     8     27    9    2
  1             0     0      3    9    9

The two levels of foreign and five levels of rep78 combine to create 10 interactions or “cells”, two of which are empty because there are no observations with 1.foreign#1.rep78 or 1.foreign#2.rep78. To calculate the modified EMM, margins ignores the two empty cells and constructs weights proportional to the number of nonempty cells. For example, foreign equals 0 in five of the eight nonempty cells, and it equals 1 in the other three nonempty cells. Option emptycells(reweight) tells margins to apply a weight of \(\frac{5}{8}\) to 0.foreign and a weight of \(\frac{3}{8}\) to 1.foreign. To calculate the modified EMM manually, we weight each factor and interaction proportional to the number of nonempty cells.

. scalar mEMM = _b[_cons] + 5/8 * _b[0.foreign] + 3/8 * _b[1.foreign]
               + 1/8 * _b[1.rep78] + 1/8 * _b[2.rep78] + 2/8 * _b[3.rep78] 
               + 2/8 * _b[4.rep78] + 2/8 * _b[5.rep78]                     
               + 1/8 * _b[0.foreign#1.rep78]                               
               + 1/8 * _b[0.foreign#2.rep78]                               
               + 1/8 * _b[0.foreign#3.rep78] + 1/8 * _b[1.foreign#3.rep78] 
               + 1/8 * _b[0.foreign#4.rep78] + 1/8 * _b[1.foreign#4.rep78] 
               + 1/8 * _b[0.foreign#5.rep78] + 1/8 * _b[1.foreign#5.rep78]

. display "Modified EMM = " mEMM
Modified EMM = 23.015625

This result is consistent with the modified EMM calculated by margins with options asbalanced and emptycells(reweight).
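The cell-counting logic behind these weights can also be seen from the data. This sketch reduces the dataset to one row per nonempty cell, tabulates how many nonempty cells fall at each level of foreign (five and three), and averages the eight cell means with equal weight. The names cellmean and ncars are hypothetical, introduced only for this illustration.

```stata
* Sketch: modified EMM from the eight nonempty cells
sysuse auto, clear
drop if missing(rep78)

* One row per nonempty cell: its mean mpg and its number of cars
collapse (mean) cellmean = mpg (count) ncars = mpg, by(foreign rep78)

* Nonempty cells at each level of foreign; these counts give the
* 5/8 and 3/8 weights applied to 0.foreign and 1.foreign
tabulate foreign

* Equal weight of 1/8 on each nonempty cell mean
summarize cellmean, meanonly
display "Modified EMM = " r(mean)
```

Note that the weights depend only on which cells are nonempty, not on how many cars each cell contains, so ncars plays no role in the average.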

Reference

Searle, S. R., F. M. Speed, and G. A. Milliken. 1980. Population marginal means in the linear model: An alternative to least squares means. American Statistician 34: 216–221. https://doi.org/10.2307/2684063.
