Meta-analysis for prevalence

Highlights

  • Effect size

    • Freeman–Tukey transformed proportion

    • Logit-transformed proportion

    • Raw (untransformed) proportion

  • Five types of study confidence intervals

  • Multiple methods to back-transform results into proportions

  • Scaled proportions

  • Full support of meta-analysis features

You asked, we listened! The meta suite now supports meta-analysis (MA) of one proportion, or prevalence. Multiple types of effect sizes, confidence intervals, and back-transformations are supported. All standard meta-analysis features such as forest plots and subgroup analysis are supported.

The traditional MA deals with two-sample binary or continuous data where the outcome of interest is measured across two groups typically labeled as the treatment and control groups. For example, an MA may compare the risk of contracting a disease (binary outcome) across two groups: the vaccinated and unvaccinated. Or maybe we want to contrast weight loss (continuous outcome) between two groups of subjects that followed different diets, say, keto versus intermittent fasting.

This two-group setting, however, is not always present in an MA. For example, the United Nations may conduct an MA to evaluate the prevalence of a certain disease across countries to allocate the proper resources to combat it. Or maybe the Department of Education performs an MA to assess the proportion of high school dropouts and uses its results to guide the budget for K–12 education. In both examples, we have one-sample binary data, in which the subjects belong to a single group and the interest lies in the proportion of individuals that experienced a certain event (contracting the disease in the first example and dropping out of high school in the second). In this setting, effect sizes such as Freeman–Tukey transformed proportions or logit-transformed proportions are typically used in the MA.

Let's see it work

Example dataset: Proportions of vegetarians across the seven regions of the U.S.

Meeting your future in-laws for the first time can be nerve-racking. You decided to impress your future mother-in-law, who plans on opening an online restaurant that delivers food across the United States. Being the statistician in the family, you suggested conducting an MA to assess the overall proportion of vegetarians (and vegans) across the seven regions of the U.S. Guided by the results of the MA, you hope to help the restaurant tailor more vegetarian-friendly recipes to specific regions of the U.S. For simplicity, assume you identified one study in each region to include in the MA.

. describe

Contains data from vegetprop.dta
 Observations:             7                  Fictional data of proportions of
                                                vegetarians across the 7 regions
                                                of the U.S.
    Variables:             6                  24 Apr 2023 10:43
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
studylbl        str21   %21s                  Study label
region          str15   %15s                  U.S. Region
poptotal        float   %9.0g                 total population (in millions)
ntotal          float   %9.0g                 Within-study sample size
nveget          int     %9.0g                 Number of vegetarians
restaurant      byte    %9.0g                 No. of vegan and vegetarian
                                                restaurants per million people
-------------------------------------------------------------------------------

Meta-analysis of one-sample binary data

Variables nveget and ntotal represent, respectively, the number of vegetarians and the total number of subjects in each study. By default, meta esize computes the Freeman–Tukey double-arcsine-transformed proportion for each study. This variance-stabilizing transformation is particularly useful when the proportions are close to 0 or 1.

Declare your data as meta data via meta esize
. meta esize nveget ntotal, studylabel(studylbl)

Meta-analysis setting information

 Study information
    No. of studies: 7
       Study label: studylbl
        Study size: _meta_studysize
      Summary data: nveget ntotal

       Effect size
              Type: ftukeyprop
             Label: Freeman–Tukey's p
          Variable: _meta_es

         Precision
         Std. err.: _meta_se
                CI: [_meta_cil, _meta_ciu]
          CI level: 95%

  Model and method
             Model: Random effects
            Method: REML
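To see what the default effect size amounts to, the transformation can be recomputed by hand. The lines below are a minimal sketch, meant to be run from a do-file after the meta esize call above. They assume the common double-arcsine form (no factor of 1/2) with sampling variance 1/(n + 0.5); whether meta esize uses exactly this scaling is an assumption, and the comparison at the end is there to check it. The variable names ft_es and ft_se are hypothetical and are not created by meta esize.

```stata
* Sketch only: recompute the Freeman-Tukey double-arcsine effect size by hand.
* ft_es and ft_se are hypothetical names chosen for this illustration.
generate double ft_es = asin(sqrt(nveget/(ntotal + 1))) + ///
                        asin(sqrt((nveget + 1)/(ntotal + 1)))

* Assumed sampling variance 1/(n + 0.5): it depends on n only, not on p,
* which is what "variance-stabilizing" refers to.
generate double ft_se = sqrt(1/(ntotal + 0.5))

* Compare with the system variables created by meta esize
list studylbl ft_es _meta_es ft_se _meta_se, noobs
```

If the hand-computed columns differ from _meta_es and _meta_se by a constant factor, the scaling convention assumed here differs from the one meta esize uses.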

You may specify the logit-transformed proportion as the effect size using option esize(logitprop). Because the variance of the logit-transformed proportion depends on the proportion itself and grows as the proportion approaches 0 or 1, the MA of this effect size tends to assign artificially low weights to studies with proportions close to 0 or 1.

. meta update, esize(logitprop) 
-> meta esize nveget ntotal , esize(logitprop) studylabel(studylbl)

Meta-analysis setting information from meta esize

 Study information
    No. of studies: 7
       Study label: studylbl
        Study size: _meta_studysize
      Summary data: nveget ntotal

       Effect size
              Type: logitprop
             Label: Logit proportion
          Variable: _meta_es
   Zero-cells adj.: None; no zero cells

         Precision
         Std. err.: _meta_se
                CI: [_meta_cil, _meta_ciu]
          CI level: 95%

  Model and method
             Model: Random effects
            Method: REML
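The dependence of the variance on the proportion is easy to inspect directly. The following is a minimal sketch, assuming the data are currently declared with esize(logitprop) and that no zero-cell adjustment was applied (as the output above reports). It uses the standard delta-method variance of a logit proportion, 1/e + 1/(n - e), where e is the number of events. The names lg_es and lg_se are hypothetical.

```stata
* Sketch only: logit-transformed proportion and its delta-method standard error.
* lg_es and lg_se are hypothetical names chosen for this illustration.
generate double lg_es = logit(nveget/ntotal)        // ln(p/(1-p))

* Variance 1/e + 1/(n-e) = 1/(n*p*(1-p)): large when p is near 0 or 1
generate double lg_se = sqrt(1/nveget + 1/(ntotal - nveget))

* Compare with the system variables created by meta esize
list studylbl lg_es _meta_es lg_se _meta_se, noobs
```

Because inverse-variance weights shrink as lg_se grows, a study with few events relative to its size receives less weight here than it would under the Freeman–Tukey transformation.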

You may instead wish to analyze the untransformed proportions; however, this is recommended only if all the proportions reported by the studies are close to 0.5, which is uncommon.

. meta update, esize(proportion)
(Output omitted)

Forest plots and other meta-analysis techniques

Continuing with the first specification of meta esize, after computing the effect size of interest and declaring your data as meta data, you may use any MA technique. For example, to construct a forest plot, we type

. meta forest, proportion

The proportion option specifies that the results be reported as proportions instead of the default Freeman–Tukey transformed proportions. This is equivalent to applying the inverse Freeman–Tukey transformation using option transform(invftukey). The overall (mean) proportion of vegetarians is 0.06 with a 95% CI of [0.04, 0.08].
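For intuition about what back-transforming does, a rough large-sample inverse of the double-arcsine transformation can be applied to the study-level effect sizes. This is only a sketch and is not the method behind transform(invftukey), which this page does not spell out; exact inverses typically include a sample-size correction that is ignored here. It assumes _meta_es currently holds Freeman–Tukey effect sizes on the double-arcsine scale without a factor of 1/2. The names p_raw and p_back are hypothetical.

```stata
* Sketch only: approximate inverse of the double-arcsine transform.
* Assumes the data are declared with the default Freeman-Tukey effect size.
* p_raw and p_back are hypothetical names chosen for this illustration.
generate double p_raw  = nveget/ntotal          // observed proportion
generate double p_back = sin(_meta_es/2)^2      // approximate back-transform

* The two columns should be close when study sizes are reasonably large
list studylbl p_raw p_back, noobs
```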

You may also report your results as the number of vegetarians per 1,000 persons, say, using suboption scale() of option transform(). On this scale, the overall proportion of 0.06 corresponds to about 60 vegetarians per 1,000 persons. We will also show the corresponding region (variable region) of each study on the forest plot.

. meta forest _id _data region _plot _esci _weight, transform(invftukey, scale(1000)) esrefline insidemarker

The above forest plots reveal substantive differences among the proportions of vegetarians, with higher prevalence of vegetarians in the Pacific Coastal, New England, and Mid-Atlantic regions compared with the rest of the U.S. regions.

The meeting with the in-laws is around the corner. Luckily for you, backed by the above forest plot, you can advise your future mother-in-law to incorporate more vegetarian recipes on her menu for the aforementioned regions and color her impressed!

Tell me more

Read more about meta-analysis in the Stata Meta-Analysis Reference Manual; see [META] meta.
