Latent class model-comparison statistics


Highlights

Latent class model-comparison statistics
  Lo–Mendell–Rubin (LMR) adjusted likelihood-ratio test
  Vuong–Lo–Mendell–Rubin (VLMR) likelihood-ratio test
  Information criteria: AIC, BIC, AICc, and CAIC

Latent class descriptive statistics
  Number of classes
  Log likelihood
  Rank
  Entropy

Customizable tables comparing latent class models

With the new lcstats postestimation command, easily compare latent class models with varying numbers of latent classes. Decide on the number of classes by using model-comparison and descriptive statistics. Construct and export publication-quality tables comparing models.

A central step in latent class analysis is choosing the number of latent classes that best fits your data. Because no single statistic identifies the best model in every situation, lcstats reports a set of statistics for each model.

lcstats computes model-comparison statistics for latent class models fit with fmm or gsem. The reported table shows the number of classes, estimation sample size, log likelihood, rank, entropy, and the Lo–Mendell–Rubin (LMR) adjusted likelihood-ratio test. You can additionally request any or all of the following: the Vuong–Lo–Mendell–Rubin (VLMR) likelihood-ratio test, AIC, BIC, corrected AIC (AICc), and consistent AIC (CAIC).

Let's see it work


Latent class model-comparison statistics

Below, we analyze data from Stouffer and Toby (1951) on individuals' responses to situations that require either siding with a friend (particularistic choice) or doing what is right for society (universalistic choice). There were four scenarios that people had to respond to: whether they would testify against a friend in an accident case (accident), whether they would give a negative review of a friend's play (play), whether they would disclose health concerns to a friend's insurance company (insurance), and whether they would keep a company secret that affects stock prices from a friend (stock).

We would like to identify and understand groups in the population with different patterns of responses to these questions.

Below, we fit latent class models with one, two, and three latent classes. After fitting the models, we will use lcstats to ascertain the best-fitting model. We type

. webuse gsem_lca1
(Latent class analysis)

. quietly gsem (accident play insurance stock <- ), logit lclass(C 1)

. estimates store lc1

. quietly gsem (accident play insurance stock <- ), logit lclass(C 2)

. estimates store lc2

. quietly gsem (accident play insurance stock <- ), logit lclass(C 3)

. estimates store lc3

. lcstats lc1 lc2 lc3

Latent class statistics



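       Classes     N        ll   Rank   Entropy   df     LMR    P>LMR
---------------------------------------------------------------------
lc1          1   216   -543.65      4
lc2          2   216   -504.47      9    0.7193    5   75.55   <0.001
lc3          3   216   -503.30     14    0.6110    5    2.25    0.687
---------------------------------------------------------------------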

LMR is the Lo–Mendell–Rubin-adjusted likelihood-ratio test statistic.
Likelihood-ratio tests compare the given model versus the same model
with one less latent class.
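
The three fits above follow one pattern, so in a do-file they could be written as a loop. This is only a sketch of the same commands shown in the session; the loop index k is an illustrative name, and the stored names lc1 to lc3 match those used above.

```stata
* Fit latent class models with 1, 2, and 3 classes and store each result
webuse gsem_lca1, clear
forvalues k = 1/3 {
    quietly gsem (accident play insurance stock <- ), logit lclass(C `k')
    estimates store lc`k'
}

* Compare the stored models
lcstats lc1 lc2 lc3
```

Extending the range in forvalues would add models with more classes, although models with many classes may be slow or may fail to converge on a dataset this small.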

Above, we stored the estimates from each model and then computed statistics for all three. The LMR-adjusted likelihood-ratio test in the second row compares the two-class model with the one-class model; its 5 degrees of freedom match the difference in rank between the two models (9 versus 4). The LMR statistic of 75.55 and its small p-value provide evidence that the two-class model fits better than the one-class model. The test in the third row (LMR = 2.25, p = 0.687) provides no evidence that the three-class model fits better than the two-class model.
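
To see where the test statistic in the second row comes from, the arithmetic can be checked by hand from the rounded values printed in the table. The sketch below first computes the unadjusted likelihood-ratio statistic and then applies one commonly cited form of the Lo–Mendell–Rubin adjustment. That adjustment formula is an assumption here: this page does not state the formula lcstats uses, although the result agrees with the reported 75.55.

```stata
* Unadjusted likelihood-ratio statistic for lc2 versus lc1:
* 2 * (ll of larger model - ll of smaller model), about 78.36
display %6.2f 2*(-504.47 - (-543.65))

* ASSUMPTION: adjusted statistic = LR / (1 + 1/(df * ln(N))),
* with df = 9 - 4 (difference in rank) and N = 216; about 75.55
display %6.2f 2*(-504.47 - (-543.65)) / (1 + 1/((9 - 4)*ln(216)))
```

The adjusted statistic is slightly smaller than the unadjusted one, so the adjusted test is the more conservative of the two.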

We could explore the models using all the information criteria allowed by lcstats by specifying the allic option:

. lcstats lc1 lc2 lc3, allic

This will expand the table to include four new columns. Alternatively, we could split the table into two parts: one table displaying the entropy and information criteria and the other displaying the likelihood-ratio test statistics.

. lcstats lc1 lc2 lc3, allic split

Latent class statistics



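         N   Rank        AIC        BIC       AICc       CAIC   Entropy
-----------------------------------------------------------------------
lc1    216      4   1,095.30   1,108.80   1,095.49   1,112.80
lc2    216      9   1,026.94   1,057.31   1,027.81   1,066.31    0.7193
lc3    216     14   1,034.60   1,081.86   1,036.69   1,095.86    0.6110
-----------------------------------------------------------------------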

AIC is the Akaike information criterion.
BIC is the Bayesian information criterion.
AICc is the corrected Akaike information criterion.
CAIC is the consistent Akaike information criterion.
BIC, AICc, and CAIC use N = number of observations.



       Classes        ll   df     LMR    P>LMR
----------------------------------------------
lc1          1   -543.65
lc2          2   -504.47    5   75.55   <0.001
lc3          3   -503.30    5    2.25    0.687
----------------------------------------------

LMR is the Lo–Mendell–Rubin-adjusted
likelihood-ratio test statistic.
Likelihood-ratio tests compare the given model
versus the same model with one less latent
class.

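The LMR test and all four information criteria point to the same conclusion: a model with two latent classes is the best fit for our data. AIC, BIC, AICc, and CAIC are each smallest for lc2, and the LMR test finds no evidence that a third class improves the fit.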
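
The information criteria can also be checked by hand from the stored results of a model. The sketch below assumes the textbook definitions, with k taken from e(rank) and N from e(N). This page does not state the formulas lcstats uses, so treat the definitions as assumptions; with the rounded log likelihood from the table they reproduce the lc2 row to within 0.01.

```stata
* Hand check of the information criteria for the two-class model.
* ASSUMPTION: textbook definitions; the local names ll, k, and N are illustrative.
estimates restore lc2
local ll = e(ll)      // log likelihood
local k  = e(rank)    // number of free parameters
local N  = e(N)       // number of observations

display "AIC  = " %9.2f -2*(`ll') + 2*`k'
display "BIC  = " %9.2f -2*(`ll') + `k'*ln(`N')
display "AICc = " %9.2f -2*(`ll') + 2*`k' + 2*`k'*(`k' + 1)/(`N' - `k' - 1)
display "CAIC = " %9.2f -2*(`ll') + `k'*(ln(`N') + 1)
```

Every criterion adds a penalty that grows with k to the same term, -2 times the log likelihood. This is why the three-class model, despite its slightly higher log likelihood, scores worse than the two-class model on all four.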

Customizable tables

The lcstats command automatically stores its results in a collection, making it easy to customize and export tables with these results. For instance, say we want to export the first table we created above. After typing

. lcstats lc1 lc2 lc3

a collection named LCStats is now available.

. collect dir

Collections in memory
Current: LCStats

 
   Name   No. items
 
LCStats   40

If we are happy with the layout and formatting of the table produced by our lcstats command, we can use collect export to export the table to Word, Excel, LaTeX, and various other formats. For instance,

. collect export lcstats.docx

If we instead wanted to make any changes to the layout, labels, and formatting of the table before exporting, we could use the collect suite of commands.
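
As a minimal sketch of such a change, the commands below inspect the LCStats collection, change the display format of one statistic, and export the result. The dimension name result and the level name lmr are the ones used in the commands later in this section; the format %9.1f and the file name lcstats.xlsx are illustrative choices.

```stata
* Inspect the collection created by lcstats
collect dims                    // list the dimensions in the current collection
collect levelsof result         // list the statistics stored in dimension result

* Example change: show the LMR statistic with one decimal place
collect style cell result[lmr], nformat(%9.1f)
collect preview                 // redisplay the table with the new style

* Export to Excel; replace overwrites an existing file of the same name
collect export lcstats.xlsx, replace
```

Style changes apply to the collection in memory, so collect preview is a quick way to confirm the table looks right before exporting it.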

We can also use the collect commands to create tables that combine results from lcstats with results from other commands. Below, we illustrate how to construct a table that has latent class probabilities of our fitted models as well as the information from lcstats. Additionally, we customize our table to show only a subset of the statistics and to not show the names we used for our estimates. First, we will show the complete table, and then we will show how to create it.



Classes        BIC      LMR (P>LMR)       Class marginal probabilities (SE)

      1   1,108.80                    1.00 (0.00)
      2   1,057.31   75.55 (<0.001)   0.72 (0.06)    0.28 (0.06)
      3   1,081.86     2.25 (0.687)   0.16 (15.31)   0.63 (11.94)   0.21 (3.37)

Above, we see the BIC and LMR statistics reported by lcstats. The table also shows, for each model, the estimated marginal probability of belonging to each latent class, with its standard error in parentheses.

To create the table above, we first type

. lcstats lc1 lc2 lc3, results(k_classes bic lmr p_lmr entropy) noshownames

Then we collect marginal probabilities and standard errors from each model by using estat lcprob.

. estimates restore lc1
. estat lcprob, post
. collect get _r_b _r_se, tag(estimates[lc1])
. estimates restore lc2
. estat lcprob, post
. collect get _r_b _r_se, tag(estimates[lc2])
. estimates restore lc3
. estat lcprob, post
. collect get _r_b _r_se, tag(estimates[lc3])
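
Because the same three commands are repeated for each model, in a do-file they could be written as a loop. This sketch uses only the commands shown above; the loop variable m is an illustrative name.

```stata
* Collect class marginal probabilities and standard errors from each stored
* model, tagging each set of results with the name of its model
foreach m in lc1 lc2 lc3 {
    estimates restore `m'
    estat lcprob, post
    collect get _r_b _r_se, tag(estimates[`m'])
}
```

Tagging each set of results with estimates[`m'] is what lets the probabilities line up with the matching lcstats row in the final table.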

We style and format our table.

. collect composite define lmrp = lmr p_lmr
. collect composite define pse = _r_b _r_se, replace
. collect label levels result pse "Class marginal probabilities (SE)", modify
. collect style cell result[pse], halign(left) nformat(%4.2f)
. collect style cell result[p_lmr _r_se], sformat("(%s)")
. collect style column, dups(center)
. collect style header colname, level(hide)
. collect layout (estimates) (result[k_classes bic lmrp] colname#result[pse])

Finally, we export this table to a Microsoft Word file.

. collect export lctab.docx

Reference

Stouffer, S. A., and J. Toby. 1951. Role conflict and personality. American Journal of Sociology 56: 395–406.

Tell me more

Read more about the new lcstats command in [SEM] gsem postestimation in the Stata Structural Equation Modeling Reference Manual and in [FMM] fmm postestimation in the Stata Finite Mixture Models Reference Manual.

See examples of the lcstats command in [FMM] Example 1a, [FMM] Example 1b, [FMM] Example 1d, [SEM] Example 51g, and [SEM] Example 52g.

