Denominator degrees of freedom for mixed models

Highlights

Hypothesis tests and confidence intervals using t and F distributions
Five denominator-degrees-of-freedom (DDF) adjustments:

    Kenward–Roger
    Satterthwaite
    ANOVA
    Repeated-measures ANOVA
    Residual

Small-sample inference for linear combinations
Small-sample inference for linear hypothesis tests
Small-sample inference for contrasts

In small samples, the sampling distributions of test statistics are known to be t and F in simple cases, and those distributions can be good approximations in other cases. Stata's mixed command provides five methods for small-sample inference, also known as denominator-degrees-of-freedom (DDF) adjustments, including Satterthwaite and Kenward–Roger. These methods adjust the confidence intervals and significance tests reported by mixed itself, and the small-sample statistics are also available for subsequent estimation of linear combinations and linear hypothesis tests of fixed effects.
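As a rough sketch of how a method is requested: each adjustment is selected through the dfmethod() option of mixed, and estat df can tabulate the degrees of freedom from several methods after a single fit. The variable names y, time, and id are borrowed from the example that follows and assume that dataset is already in memory; the particular methods listed are only an illustration.

```stata
* Sketch only: assumes the example data (y, time, id) are already loaded.
* The Satterthwaite and Kenward-Roger methods require REML estimation.
mixed y time || id: time, reml covariance(unstructured) dfmethod(satterthwaite)

* Tabulate the DDF that several methods produce for the same fitted model.
estat df, method(kroger satterthwaite residual)
```

Comparing methods this way can show how sensitive the inference is to the choice of adjustment before settling on one.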

Let's see it work

Consider a simple random-coefficient model for longitudinal data from Kenward and Roger (1997). There are 24 subjects, identified by the variable id. The subjects can be measured at any of nine time periods, but the outcome y is recorded at only three time periods for each subject, meaning that the subjects are not all seen at the same times.

To study both fixed and random effects of time, we fit the following mixed model by restricted maximum likelihood (REML) with an unstructured covariance between the random effects:

. mixed y time || id: time, reml covariance(unstructured)

Performing EM optimization ...

Performing gradient-based optimization:
Iteration 0:  Log restricted-likelihood = -109.44372
Iteration 1:  Log restricted-likelihood = -109.39161
Iteration 2:  Log restricted-likelihood = -109.39153
Iteration 3:  Log restricted-likelihood = -109.39153

Computing standard errors ...

Mixed-effects REML regression                        Number of obs    =     72
Group variable: id                                   Number of groups =     24
                                                     Obs per group:
                                                                  min =      3
                                                                  avg =    3.0
                                                                  max =      3
                                                     Wald chi2(1)     =   4.34
Log restricted-likelihood = -109.39153               Prob > chi2      = 0.0372
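------------------------------------------------------------------------------
           y | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        time |   .2765987   .1327319     2.08   0.037     .0164489    .5367485
       _cons |   1.045034   .2504823     4.17   0.000     .5540973     1.53597
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
id: Unstructured             |
                   var(time) |   .3259698   .1356851      .1441665     .737039
                  var(_cons) |   .4172514   .3432177      .0832198    2.092036
             cov(time,_cons) |  -.1491218   .1736941      -.489556    .1913124
-----------------------------+------------------------------------------------
               var(Residual) |   .3407946   .0844243      .2097135    .5538077
------------------------------------------------------------------------------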
LR test vs. linear model: chi2(3) = 84.07                 Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

The default large-sample inference for time suggests that the fixed time effect is significant at the 5% level (p-value = 0.037). Empirical evidence suggests, however, that in small samples the normal and chi-squared distributions may approximate the unknown distributions of the test statistics poorly and may lead to anticonservative results.
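The large-sample figures in the table can be checked by hand with Stata's display command. The sketch below copies the coefficient and standard error for time from the output above; because those inputs are rounded as displayed, the results should agree with the reported z, Wald chi2(1), and p-value only to about the digits shown.

```stata
* z statistic for time: coefficient divided by its standard error
display .2765987/.1327319

* with a single constraint, the Wald chi-squared statistic is the square of z
display (.2765987/.1327319)^2

* two-sided p-value from the standard normal distribution
display 2*normal(-abs(.2765987/.1327319))

* the same p-value from the chi-squared distribution with 1 degree of freedom
display chi2tail(1, (.2765987/.1327319)^2)
```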

In Stata, we can account for small samples by specifying one of the five DDF methods. We use the Kenward–Roger method in this example.

. mixed y time || id: time, reml covariance(unstructured) dfmethod(kroger)

Performing EM optimization ...

Performing gradient-based optimization:
Iteration 0:  Log restricted-likelihood = -109.44372
Iteration 1:  Log restricted-likelihood = -109.39161
Iteration 2:  Log restricted-likelihood = -109.39153
Iteration 3:  Log restricted-likelihood = -109.39153

Computing standard errors ...

Computing degrees of freedom ...

Mixed-effects REML regression                        Number of obs    =     72
Group variable: id                                   Number of groups =     24
                                                     Obs per group:
                                                                  min =      3
                                                                  avg =    3.0
                                                                  max =      3
DF method: Kenward-Roger                             DF:          min =  11.68
                                                                  avg =  17.19
                                                                  max =  22.69
                                                     F(1, 22.69)      =   4.24
Log restricted-likelihood = -109.39153               Prob > F         = 0.0512
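------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        time |   .2765987     .13434     2.06   0.051    -.0015158    .5547132
       _cons |   1.045034   .2700712     3.87   0.002     .4548251    1.635242
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
id: Unstructured             |
                   var(time) |   .3259698   .1356851      .1441665     .737039
                  var(_cons) |   .4172514   .3432177      .0832198    2.092036
             cov(time,_cons) |  -.1491218   .1736941      -.489556    .1913124
-----------------------------+------------------------------------------------
               var(Residual) |   .3407946   .0844243      .2097135    .5538077
------------------------------------------------------------------------------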

LR test vs. linear model: chi2(3) = 84.07                 Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

After adjusting for a small sample, we do not have sufficient evidence to reject the null hypothesis of no time effect, at least at a 5% significance level.
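The adjusted results can be checked the same way, with the t and F distributions taking the place of the normal and chi-squared. This sketch uses the Kenward–Roger standard error for time and a DDF of 22.69, which we take to be the DDF for time because the model F test of that single coefficient is reported as F(1, 22.69). The displayed inputs are rounded, so expect agreement with the table only to about the digits shown.

```stata
* t statistic for time using the Kenward-Roger standard error
display .2765987/.13434

* two-sided p-value from a t distribution with 22.69 degrees of freedom
display 2*ttail(22.69, abs(.2765987/.13434))

* 95% confidence limits: coefficient -/+ t critical value times standard error
display .2765987 - invttail(22.69, .025)*.13434
display .2765987 + invttail(22.69, .025)*.13434

* the model F statistic is the square of t; its p-value comes from F(1, 22.69)
display Ftail(1, 22.69, (.2765987/.13434)^2)
```

Both the slightly larger standard error and the heavier tails of the t distribution push the p-value up, which is why it moves from 0.037 to 0.051.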

Our follow-up analyses can also account for small samples, for example, when computing linear combinations,

. lincom _b[_cons] + _b[time], small

 ( 1)  [y]time + [y]_cons = 0

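------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |   1.321632   .2292508     5.77   0.000     .8235855    1.819679
------------------------------------------------------------------------------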

and when performing linear hypothesis tests,

. test (_b[_cons]=1) (_b[time]==0), small

 ( 1)  [y]_cons = 1
 ( 2)  [y]time = 0

       F(  2, 15.60) =    3.05
            Prob > F =    0.0764
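Contrasts can be handled along the same lines. The sketch below is an assumption-laden illustration, not part of the analysis above: it refits a hypothetical model that treats time as a factor variable with a random intercept only, supposes that time is coded as nonnegative integers, and supposes the model can be fit on these data. The small option on contrast then requests the DDF-adjusted test.

```stata
* Hypothetical refit: time entered as a factor variable (assumes integer coding).
mixed y i.time || id:, reml dfmethod(kroger)

* Joint test of the time effect using the small-sample (DDF-adjusted) F test.
contrast time, small
```

As with lincom and test, the small option relies on a DDF method having been specified when the model was fit.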

Reference

Kenward, M.G., and J.H. Roger. 1997. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics 53: 983-997.

Tell me more

Read more about small-sample adjustments in the Stata Multilevel Mixed-Effects Reference Manual; see [ME] mixed.
