By category
Linear models
regression •
censored outcomes •
endogenous regressors •
bootstrap,
jackknife,
and robust and cluster–robust variance •
wild cluster bootstrap •
instrumental variables •
three-stage least squares •
constraints •
quantile regression •
GLS •
DID •
HDFE •
more |
Panel/longitudinal data
random and fixed effects with robust standard errors •
HDFE •
linear mixed models •
random-effects probit •
GEE •
random- and fixed-effects Poisson •
dynamic panel-data models •
instrumental variables •
DID •
panel unit-root tests •
more |
Multilevel mixed-effects models
continuous, binary, count, and survival outcomes •
two-, three-, and higher-level models •
generalized linear models •
nonlinear models •
random intercepts •
random slopes •
crossed random effects •
BLUPs of effects and fitted values •
hierarchical models •
residual error structures •
DDF adjustments •
support for survey data •
more |
Binary, count, and limited outcomes
logistic,
probit,
tobit •
Poisson and negative binomial •
conditional,
multinomial,
nested,
ordered,
rank-ordered,
and stereotype logistic •
multinomial probit •
zero-inflated and left-truncated models •
selection models •
marginal effects •
more |
Choice models
discrete choice •
rank-ordered alternatives •
conditional logit •
multinomial probit •
nested logit •
mixed logit •
panel data •
case-specific and alternative-specific predictors •
interpret results—expected probabilities, covariate effects, comparisons across alternatives •
more |
Extended regression models (ERMs)
endogenous covariates •
sample selection •
nonrandom treatment •
panel data •
account for problems alone or in combination •
continuous, interval-censored, binary, and ordinal outcomes •
more |
Generalized linear models (GLMs)
ten link functions •
user-defined links •
seven distributions •
ML and IRLS estimation •
nine variance estimators •
seven residuals •
more |
Finite mixture models (FMMs)
fmm: prefix for 17 estimators •
mixtures of a single estimator •
mixtures combining multiple estimators or distributions •
continuous, binary, count, ordinal, categorical, censored, truncated, and survival outcomes •
more |
Spatial autoregressive models
spatial lags of dependent variable, independent variables, and autoregressive errors •
fixed and random effects in panel data •
endogenous covariates •
analyze spillover effects •
more |
ANOVA/MANOVA
balanced and unbalanced designs •
factorial, nested, and mixed designs •
repeated measures •
marginal means •
contrasts •
more |
Exact statistics
exact logistic and Poisson regression •
exact case–control statistics •
binomial tests •
Fisher’s exact test for r × c tables •
more |
Epidemiology
standardization of rates •
case–control •
cohort •
matched case–control •
Mantel–Haenszel •
pharmacokinetics •
ROC analysis •
ICD-10 •
additive models of risk •
meta-analysis •
more |
DSGE models
specify models algebraically •
solve models •
estimate parameters •
identification diagnostics •
policy and transition matrices •
IRFs •
dynamic forecasts •
Bayesian •
more |
Tests, predictions, and effects
Wald tests •
LR tests •
linear and nonlinear combinations •
predictions and generalized predictions •
marginal means •
least-squares means •
adjusted means •
marginal and partial effects •
forecast models •
Hausman tests •
more |
Contrasts, pairwise comparisons, and margins
compare means,
intercepts,
or slopes •
compare with reference category,
adjacent category,
grand mean, etc. •
orthogonal polynomials •
multiple-comparison adjustments •
graph estimated means and contrasts •
interaction plots •
more |
Resampling and simulation methods
bootstrap •
jackknife •
Monte Carlo simulation •
permutation tests •
exact p-values •
more |
Multivariate methods
factor analysis •
principal components •
discriminant analysis •
rotation •
multidimensional scaling •
Procrustean analysis •
correspondence analysis •
biplots •
dendrograms •
user-extensible analyses •
more |
Cluster analysis
hierarchical clustering •
kmeans and kmedian nonhierarchical clustering •
dendrograms •
stopping rules •
user-extensible analyses •
more |
Network analysis
nwcommands: import and manipulate networks •
generate networks •
calculate centrality and dissimilarity measures •
visualize networks •
more |
Time series
ARIMA •
ARFIMA •
ARCH/GARCH •
VAR •
SVAR •
IVSVAR •
VEC •
multivariate GARCH •
unobserved-components model •
dynamic factors •
state-space models •
Markov-switching models •
business calendars •
tests for structural breaks •
threshold regression •
forecasts •
impulse–response functions •
local projections •
unit-root tests •
filters and smoothers •
rolling and recursive estimation •
Bayesian •
more |
Survival analysis
Kaplan–Meier and Nelson–Aalen estimators •
Cox regression (frailty) •
parametric models (frailty, random effects) •
competing risks •
hazards •
time-varying covariates •
left-, right-, and interval-censoring •
Weibull,
exponential,
and Gompertz models •
more |
Bayesian analysis
thousands of built-in models •
univariate and multivariate models •
linear and nonlinear models •
panel data •
multilevel models •
VAR •
DSGE •
continuous, binary, ordinal, and count outcomes •
bayes: prefix for over 60 estimation commands •
variable selection •
continuous univariate, multivariate, and discrete priors •
add your own models •
multiple chains •
convergence diagnostics •
posterior summaries •
hypothesis testing •
model fit •
model comparison •
predictions •
dynamic forecast •
impulse–response functions •
more |
Bayesian model averaging
full enumeration •
MC3 and MH sampling •
three model prior classes •
fixed and random g-priors for coefficients •
heredity rules •
PIP for predictors •
model ranking by PMP •
BMA convergence •
variable-inclusion maps •
model-size distribution plots •
jointness measures •
log predictive-score •
predictions •
more |
Meta-analysis
effect sizes •
common, fixed, and random effects •
forest, funnel, and more plots •
subgroup, leave-one-out, and cumulative analysis •
meta-regression •
small-study effects •
publication bias •
multivariate •
multilevel •
more |
Power, precision, and sample size
power •
sample size •
effect size •
minimum detectable effect •
CI width •
means •
proportions •
variances •
correlations •
ANOVA •
regression •
cluster randomized designs •
case–control studies •
cohort studies •
contingency tables •
survival analysis •
balanced or unbalanced designs •
results in tables or graphs •
group sequential designs for clinical trials •
more |
Causal inference/Treatment effects
inverse-probability weighting (IPW) •
doubly robust methods •
propensity-score matching •
regression adjustment •
covariate matching •
DID •
multilevel treatments •
endogenous treatments •
average treatment effects (ATEs) •
ATEs on the treated (ATET) •
potential-outcome means (POMs) •
continuous, binary, count, fractional, and survival outcomes •
panel data •
lasso •
causal mediation analysis •
more |
Lasso
lasso •
elastic net •
model selection •
prediction •
inference •
continuous, binary, count, and survival outcomes •
cross-validation •
adaptive lasso •
double selection •
partialing out •
cross-fit partialing out •
double machine learning •
endogenous covariates •
treatment effects •
more |
SEM (structural equation modeling)
graphical path diagram builder •
standardized and unstandardized estimates •
modification indices •
direct and indirect effects •
continuous, binary, count, ordinal, and survival outcomes •
multilevel models •
random slopes and intercepts •
factor scores, empirical Bayes, and other predictions •
groups and tests of invariance •
goodness of fit •
handles MAR data by FIML •
correlated data •
survey data •
more |
Latent class analysis
binary, ordinal, continuous, count, categorical, fractional, and survival items •
add covariates to model class membership •
combine with SEM path models •
expected class proportions •
goodness of fit •
predictions of class membership •
more |
Multiple imputation
nine univariate imputation methods •
multivariate normal imputation •
chained equations •
explore pattern of missingness •
manage imputed datasets •
fit model and pool results •
transform parameters •
joint tests of parameter estimates •
predictions •
more |
Survey methods
multistage designs •
bootstrap,
BRR,
jackknife,
linearized, and
SDR variance estimation •
poststratification •
raking •
calibration •
DEFF •
predictive margins •
means,
proportions,
ratios,
totals •
summary tables •
almost all estimators supported •
more |
IRT (item response theory)
binary (1PL, 2PL, 3PL), ordinal, and categorical response models •
item characteristic curves •
test characteristic curves •
item information functions •
test information functions •
multiple-group models •
differential item functioning (DIF) •
more |
Data manipulation
data transformations •
data frames •
match-merge •
import/export data •
JDBC •
ODBC •
SQL •
Unicode •
by-group processing •
append files •
sort •
row–column transposition •
labeling •
save results •
more |
Reporting
reproducible reports •
customizable tables •
graphical tables builder •
Word •
Excel •
PDF •
HTML •
dynamic documents •
Markdown •
Stata results and graphs •
SVG •
EPS •
PNG •
TIF •
more |
Graphics
lines •
bars •
areas •
ranges •
contours •
confidence intervals •
interaction plots •
survival plots •
publication quality •
customize anything •
Graph Editor •
more |
Programming features
adding new commands •
scripting •
object-oriented programming •
menu and dialog-box programming •
dynamic documents •
Markdown •
Project Manager •
Python integration •
PyStata •
Jupyter notebook •
Java integration •
Java plugins •
H2O access •
C/C++ plugins •
more |
Mata—Stata's serious programming language
interactive sessions •
large-scale development projects •
optimization •
matrix inversions •
decompositions •
eigenvalues and eigenvectors •
LAPACK engine •
Intel® MKL •
real and complex numbers •
string matrices •
interface to Stata datasets and matrices •
numerical derivatives •
object-oriented programming •
more |
Graphical user interface
menus and dialogs for all features •
Data Editor •
Variables Manager •
Graph Editor •
Project Manager •
Do-file Editor •
multiple preference sets •
more |
Documentation
35 manuals •
18,000+ pages •
seamless navigation •
thousands of worked examples •
quick starts •
methods and formulas •
references •
more |
Basic statistics
summaries •
cross-tabulations •
correlations •
z and t tests •
equality-of-variance tests •
tests of proportions •
confidence intervals •
factor variables •
more |
Nonparametric methods
nonparametric regression •
Wilcoxon–Mann–Whitney,
Wilcoxon signed ranks, and Kruskal–Wallis tests •
Cochran–Armitage and other trend tests •
Spearman and Kendall correlations •
Kolmogorov–Smirnov tests •
exact binomial CIs •
survival data •
ROC analysis •
smoothing •
bootstrapping •
more |
Nonlinear regression, GMM and other systems of equations
generalized method of moments (GMM) •
nonlinear regression •
demand systems •
more |
Simple maximum likelihood
specify likelihood using simple expressions •
no programming required •
survey data •
standard, robust, bootstrap, and jackknife SEs •
matrix estimators •
more |
Programmable maximum likelihood
user-specified functions •
NR, DFP, BFGS, BHHH •
OIM, OPG, robust, bootstrap, and jackknife SEs •
Wald tests •
survey data •
numeric or analytic derivatives •
more |
Other statistical methods
kappa measure of interrater agreement •
Cronbach's alpha •
stepwise regression •
tests of normality •
more |
Functions
statistical •
random-number •
mathematical •
string •
date and time •
regular expressions •
Unicode •
more |
Internet capabilities
search and download thousands of community-contributed features (see below) •
web updating •
web file sharing •
latest Stata news •
more |
Community-contributed features
search and download thousands of free additions •
discover new features in the Stata Journal •
share commands by posting to the SSC •
discuss community-contributed features on Statalist •
more |
Embedded statistical computations
Installation Qualification
IQ report for regulatory agencies such as the FDA •
installation verification |
FDA Compliance
Adherence to FDA regulatory requirements for statistical software |
Accessibility
Section 508 compliance,
accessibility for persons with disabilities |
Sample session
New in Stata 18 —
Bayesian model averaging •
Causal mediation analysis •
Tables of descriptive statistics •
Heterogeneous DID •
Group sequential designs •
Multilevel meta-analysis •
Meta-analysis for prevalence •
Robust inference for linear models •
Wild cluster bootstrap •
Local projections for IRFs •
Flexible demand systems •
TVCs with interval-censored Cox model •
Lasso for Cox model •
RERI •
IV quantile regression •
Alias variables across frames •
All-new graph style •
and more