Highlights
Latent class model-comparison statistics
  Lo–Mendell–Rubin (LMR) adjusted likelihood-ratio test
  Vuong–Lo–Mendell–Rubin (VLMR) likelihood-ratio test
  Information criteria: AIC, BIC, AICc, and CAIC
Latent class descriptive statistics
  Number of classes
  Log likelihood
  Rank
  Entropy
Customizable tables comparing latent class models
With the new lcstats postestimation command, easily compare latent class models with varying numbers of latent classes. Decide on the number of classes by using model-comparison and descriptive statistics. Construct and export publication-quality tables comparing models.
When performing latent class analysis, it is fundamental to determine the number of latent classes that best fits your data. Because no single statistic identifies the best model in every situation, lcstats reports a set of statistics for each model.
lcstats calculates model-comparison statistics for latent class models fit using fmm or gsem. The reported table shows the number of classes, estimation sample size, log likelihood, rank, entropy, and the Lo–Mendell–Rubin (LMR) adjusted likelihood-ratio test. Additionally, you may report any or all of the following: the Vuong–Lo–Mendell–Rubin (VLMR) likelihood-ratio test, AIC, BIC, corrected AIC (AICc), and consistent AIC (CAIC).
Below, we analyze data from Stouffer and Toby (1951) on individuals' responses to situations that require either siding with a friend (particularistic choice) or doing what is right for society (universalistic choice). There were four scenarios that people had to respond to: whether they would testify against a friend in an accident case (accident), whether they would give a negative review of a friend's play (play), whether they would disclose health concerns to a friend's insurance company (insurance), and whether they would keep a company secret that affects stock prices from a friend (stock).
We would like to identify and understand groups in the population with different patterns of responses to these questions.
Below, we fit latent class models with one, two, and three latent classes. After fitting the models, we will use lcstats to ascertain the best-fitting model. We type
. webuse gsem_lca1
(Latent class analysis)

. quietly gsem (accident play insurance stock <- ), logit lclass(C 1)
. estimates store lc1
. quietly gsem (accident play insurance stock <- ), logit lclass(C 2)
. estimates store lc2
. quietly gsem (accident play insurance stock <- ), logit lclass(C 3)
. estimates store lc3
. lcstats lc1 lc2 lc3

Latent class statistics

     | Classes     N        ll   Rank   Entropy   df     LMR    P>LMR
-----+----------------------------------------------------------------
 lc1 |       1   216   -543.65      4
 lc2 |       2   216   -504.47      9    0.7193    5   75.55   <0.001
 lc3 |       3   216   -503.30     14    0.6110    5    2.25    0.687
Above, we stored the estimates of each model and then computed statistics for each of them. The LMR adjusted likelihood-ratio test in the second row tests whether the one-class model fits as well as the two-class model. The LMR statistic of 75.55 and its small p-value (<0.001) provide evidence that the two-class model fits better than the one-class model. The test in the third row (LMR = 2.25, p = 0.687) does not provide evidence that the three-class model fits better than the two-class model.
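As a rough check on where the LMR statistic comes from, the sketch below recomputes it for the two-class versus one-class comparison from the log likelihoods, ranks, and sample size in the table above. It assumes the commonly cited form of the adjustment, in which the ordinary likelihood-ratio statistic is divided by 1 + 1/(df × ln N); this form is an assumption rather than something stated by lcstats, and the scalar names are made up for illustration. The p-value is not reproduced here.

```stata
* Values copied from the lcstats table above (lc1 and lc2 rows).
scalar s_ll1 = -543.65
scalar s_ll2 = -504.47
scalar s_n   = 216

* df is the difference in rank between the two models: 9 - 4.
scalar s_df  = 9 - 4

* Ordinary (unadjusted) likelihood-ratio statistic.
scalar s_lr  = 2*(s_ll2 - s_ll1)

* Assumed LMR adjustment: divide by 1 + 1/(df * ln(N)).
scalar s_lmr = s_lr/(1 + 1/(s_df*ln(s_n)))

display "LR  = " %6.2f s_lr
display "LMR = " %6.2f s_lmr
```

Under this assumption, the adjusted value should agree with the LMR column up to rounding of the log likelihoods.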
We could compare the models using all the information criteria supported by lcstats by specifying the allic option:
. lcstats lc1 lc2 lc3, allic
This will expand the table to include four new columns. Alternatively, we could split the table into two parts: one table displaying the entropy and information criteria and the other displaying the likelihood-ratio test statistics.
. lcstats lc1 lc2 lc3, allic split

Latent class statistics

     |     N   Rank        AIC        BIC       AICc       CAIC   Entropy
-----+--------------------------------------------------------------------
 lc1 |   216      4   1,095.30   1,108.80   1,095.49   1,112.80
 lc2 |   216      9   1,026.94   1,057.31   1,027.81   1,066.31    0.7193
 lc3 |   216     14   1,034.60   1,081.86   1,036.69   1,095.86    0.6110

     | Classes        ll   df     LMR    P>LMR
-----+-----------------------------------------
 lc1 |       1   -543.65
 lc2 |       2   -504.47    5   75.55   <0.001
 lc3 |       3   -503.30    5    2.25    0.687
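All our statistics point to the two-class model as the best fit for our data: AIC, BIC, AICc, and CAIC are all smallest for lc2, and the LMR test favors two classes over one but not three over two.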
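The information criteria in the table can be checked by hand from the log likelihood, rank, and sample size. The sketch below does this for the two-class model, assuming the standard textbook definitions and treating the rank as the number of free parameters. The formulas are an assumption rather than a quotation from the lcstats documentation, and the scalar names are made up for illustration.

```stata
* Values copied from the lc2 row of the lcstats output above.
scalar s_ll = -504.47
scalar s_k  = 9
scalar s_n  = 216

* Assumed standard definitions of the four criteria.
scalar s_aic  = -2*s_ll + 2*s_k
scalar s_bic  = -2*s_ll + s_k*ln(s_n)
scalar s_aicc = s_aic + 2*s_k*(s_k + 1)/(s_n - s_k - 1)
scalar s_caic = -2*s_ll + s_k*(ln(s_n) + 1)

display "AIC  = " %9.2f s_aic
display "BIC  = " %9.2f s_bic
display "AICc = " %9.2f s_aicc
display "CAIC = " %9.2f s_caic
```

Because the log likelihood is entered rounded to two decimals, the results may differ from the table in the last digit. All four criteria penalize the extra parameters of the three-class model by more than its small gain in log likelihood, which is why each is smallest for lc2.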
The lcstats command automatically stores its results in a collection, making it easy to customize and export tables with these results. For instance, say we want to export the first table we created above. After typing
. lcstats lc1 lc2 lc3
a collection named LCStats is now available.
. collect dir

Collections in memory
Current: LCStats

    Name      No. items
    LCStats   40
If we are happy with the layout and formatting of the table produced by our lcstats command, we can use collect export to export the table to Word, Excel, LaTeX, and various other formats. For instance,
. collect export lcstats.docx
If we instead wanted to make any changes to the layout, labels, and formatting of the table before exporting, we could use the collect suite of commands.
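A minimal sketch of such changes follows. It assumes the LCStats collection is current and that the statistics are stored in the result dimension under the names that appear in the results() option used later in this section (entropy, lmr); collect levelsof lists the names actually present. The label text, numeric format, and output filename are hypothetical choices.

```stata
* List the statistic names stored in the result dimension.
collect levelsof result

* Hypothetical tweaks: relabel the LMR column and show entropy
* with two decimal places.
collect label levels result lmr "LMR statistic", modify
collect style cell result[entropy], nformat(%5.2f)

* Check the table on screen, then export it to Excel.
collect preview
collect export lcstats.xlsx, replace
```

Running collect preview before exporting makes it easy to confirm each change without opening the exported file.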
We can also use the collect commands to create tables that combine results from lcstats with results from other commands. Below, we illustrate how to construct a table that has latent class probabilities of our fitted models as well as the information from lcstats. Additionally, we customize our table to show only a subset of the statistics and to not show the names we used for our estimates. First, we will show the complete table, and then we will show how to create it.
Classes         BIC      LMR (P>LMR)   Class marginal probabilities (SE)
      1    1,108.80                    1.00 (0.00)
      2    1,057.31   75.55 (<0.001)   0.72 (0.06)    0.28 (0.06)
      3    1,081.86    2.25 (0.687)    0.16 (15.31)   0.63 (11.94)   0.21 (3.37)
Above, we see the BIC and LMR statistics that were reported by lcstats. The table also shows, for each model, the estimated marginal probability of each latent class with its standard error in parentheses.
To create the table above, we first type
. lcstats lc1 lc2 lc3, results(k_classes bic lmr p_lmr entropy) noshownames
Then we collect marginal probabilities and standard errors from each model by using estat lcprob.
. estimates restore lc1
. estat lcprob, post
. collect get _r_b _r_se, tag(estimates[lc1])
. estimates restore lc2
. estat lcprob, post
. collect get _r_b _r_se, tag(estimates[lc2])
. estimates restore lc3
. estat lcprob, post
. collect get _r_b _r_se, tag(estimates[lc3])
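The same three steps are repeated for each stored model, so they could also be written as a loop. The sketch below is an alternative to the commands above, not an addition to them; running both would collect the results twice. The macro name m is arbitrary.

```stata
* Loop over the stored estimates instead of repeating the commands.
foreach m in lc1 lc2 lc3 {
    * Make the stored model the active estimation results.
    estimates restore `m'

    * Post the class marginal probabilities and standard errors.
    estat lcprob, post

    * Add them to the current collection, tagged with the model name.
    collect get _r_b _r_se, tag(estimates[`m'])
}
```

A loop like this scales more easily if models with more classes are added to the comparison.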
We style and format our table.
. collect composite define lmrp = lmr p_lmr
. collect composite define pse = _r_b _r_se, replace
. collect label levels result pse "Class marginal probabilities (SE)", modify
. collect style cell result[pse], halign(left) nformat(%4.2f)
. collect style cell result[p_lmr _r_se], sformat("(%s)")
. collect style column, dups(center)
. collect style header colname, level(hide)
. collect layout (estimates) (result[k_classes bic lmrp] colname#result[pse])
Finally, we export this table to a Microsoft Word file.
. collect export lctab.docx
Stouffer, S. A., and J. Toby. 1951. Role conflict and personality. American Journal of Sociology 56: 395–406.
Read more about the new lcstats command in [SEM] gsem postestimation in the Stata Structural Equation Modeling Reference Manual and in [FMM] fmm postestimation in the Stata Finite Mixture Models Reference Manual.
See examples of the lcstats command in [FMM] Example 1a, [FMM] Example 1b, [FMM] Example 1d, [SEM] Example 51g, and [SEM] Example 52g.
Learn more about Stata's latent class analysis features.
View all the new features in Stata 19.