Stata features

Stata statistical software provides everything you need for data science and inference: data manipulation, exploration, visualization, statistics, reporting, and reproducibility.

By category

Linear models

regression  •  censored outcomes  •  endogenous regressors  •  bootstrap, jackknife, and robust and cluster–robust variance  •  wild cluster bootstrap  •  instrumental variables  •  three-stage least squares  •  constraints  •  quantile regression  •  GLS  •  DID  •  HDFE  •  more

Panel/longitudinal data

random and fixed effects with robust standard errors  •  HDFE  •  linear mixed models  •  random-effects probit  •  GEE  •  random- and fixed-effects Poisson  •  dynamic panel-data models  •  instrumental variables  •  DID  •  panel unit-root tests  •  more

Multilevel mixed-effects models

continuous, binary, count, and survival outcomes  •  two-, three-, and higher-level models  •  generalized linear models  •  nonlinear models  •  random intercepts  •  random slopes  •  crossed random effects  •  BLUPs of effects and fitted values  •  hierarchical models  •  residual error structures  •  DDF adjustments  •  support for survey data  •  more

Binary, count, and limited outcomes

logistic, probit, tobit  •  Poisson and negative binomial  •  conditional, multinomial, nested, ordered, rank-ordered, and stereotype logistic  •  multinomial probit  •  zero-inflated and left-truncated models  •  selection models  •  marginal effects  •  more

Choice models

discrete choice  •  rank-ordered alternatives  •  conditional logit  •  multinomial probit  •  nested logit  •  mixed logit  •  panel data  •  case-specific and alternative-specific predictors  •  interpret results—expected probabilities, covariate effects, comparisons across alternatives  •  more

Extended regression models (ERMs)

endogenous covariates  •  sample selection  •  nonrandom treatment  •  panel data  •  account for problems alone or in combination  •  continuous, interval-censored, binary, and ordinal outcomes  •  more

Generalized linear models (GLMs)

ten link functions  •  user-defined links  •  seven distributions  •  ML and IRLS estimation  •  nine variance estimators  •  seven residuals  •  more

Finite mixture models (FMMs)

fmm: prefix for 17 estimators  •  mixtures of a single estimator  •  mixtures combining multiple estimators or distributions  •  continuous, binary, count, ordinal, categorical, censored, truncated, and survival outcomes  •  more

Spatial autoregressive models

spatial lags of dependent variable, independent variables, and autoregressive errors  •  fixed and random effects in panel data  •  endogenous covariates  •  analyze spillover effects  •  more

ANOVA/MANOVA

balanced and unbalanced designs  •  factorial, nested, and mixed designs  •  repeated measures  •  marginal means  •  contrasts  •  more

Exact statistics

exact logistic and Poisson regression  •  exact case–control statistics  •  binomial tests  •  Fisher’s exact test for r × c tables  •  more

Epidemiology

standardization of rates  •  case–control  •  cohort  •  matched case–control  •  Mantel–Haenszel  •  pharmacokinetics  •  ROC analysis  •  ICD-10  •  additive models of risk  •  meta-analysis  •  more

DSGE models

specify models algebraically  •  solve models  •  estimate parameters  •  identification diagnostics  •  policy and transition matrices  •  IRFs  •  dynamic forecasts  •  Bayesian  •  more

Tests, predictions, and effects

Wald tests  •  LR tests  •  linear and nonlinear combinations  •  predictions and generalized predictions  •  marginal means  •  least-squares means  •  adjusted means  •  marginal and partial effects  •  forecast models  •  Hausman tests  •  more

Contrasts, pairwise comparisons, and margins

compare means, intercepts, or slopes  •  compare with reference category, adjacent category, grand mean, etc.  •  orthogonal polynomials  •  multiple-comparison adjustments  •  graph estimated means and contrasts  •  interaction plots  •  more

Resampling and simulation methods

bootstrap  •  jackknife  •  Monte Carlo simulation  •  permutation tests  •  exact p-values  •  more

Multivariate methods

factor analysis  •  principal components  •  discriminant analysis  •  rotation  •  multidimensional scaling  •  Procrustean analysis  •  correspondence analysis  •  biplots  •  dendrograms  •  user-extensible analyses  •  more

Cluster analysis

hierarchical clustering  •  kmeans and kmedian nonhierarchical clustering  •  dendrograms  •  stopping rules  •  user-extensible analyses  •  more

Network analysis

nwcommands: import and manipulate networks  •  generate networks  •  calculate centrality and dissimilarity measures  •  visualize networks  •  more

Time series

ARIMA  •  ARFIMA  •  ARCH/GARCH  •  VAR  •  SVAR  •  IVSVAR  •  VEC  •  multivariate GARCH  •  unobserved-components model  •  dynamic factors  •  state-space models  •  Markov-switching models  •  business calendars  •  tests for structural breaks  •  threshold regression  •  forecasts  •  impulse–response functions  •  local projections  •  unit-root tests  •  filters and smoothers  •  rolling and recursive estimation  •  Bayesian  •  more

Survival analysis

Kaplan–Meier and Nelson–Aalen estimators  •  Cox regression (frailty)  •  parametric models (frailty, random effects)  •  competing risks  •  hazards  •  time-varying covariates  •  left-, right-, and interval-censoring  •  Weibull, exponential, and Gompertz models  •  more

Bayesian analysis

thousands of built-in models  •  univariate and multivariate models  •  linear and nonlinear models  •  panel data  •  multilevel models  •  VAR  •  DSGE  •  continuous, binary, ordinal, and count outcomes  •  bayes: prefix for over 60 estimation commands  •  variable selection  •  continuous univariate, multivariate, and discrete priors  •  add your own models  •  multiple chains  •  convergence diagnostics  •  posterior summaries  •  hypothesis testing  •  model fit  •  model comparison  •  predictions  •  dynamic forecast  •  impulse–response functions  •  more

Bayesian model averaging

full enumeration  •  MC3 and MH sampling  •  three model prior classes  •  fixed and random g-priors for coefficients  •  heredity rules  •  PIP for predictors  •  model ranking by PMP  •  BMA convergence  •  variable-inclusion maps  •  model-size distribution plots  •  jointness measures  •  log predictive-score  •  predictions  •  more

Meta-analysis

effect sizes  •  common, fixed, and random effects  •  forest, funnel, and other plots  •  subgroup, leave-one-out, and cumulative analysis  •  meta-regression  •  small-study effects  •  publication bias  •  multivariate  •  multilevel  •  more

Power, precision, and sample size

power  •  sample size  •  effect size  •  minimum detectable effect  •  CI width  •  means  •  proportions  •  variances  •  correlations  •  ANOVA  •  regression  •  cluster randomized designs  •  case–control studies  •  cohort studies  •  contingency tables  •  survival analysis  •  balanced or unbalanced designs  •  results in tables or graphs  •  group sequential designs for clinical trials  •  more

Causal inference/Treatment effects

inverse probability weight (IPW)  •  doubly robust methods  •  propensity-score matching  •  regression adjustment  •  covariate matching  •  DID  •  multilevel treatments  •  endogenous treatments  •  average treatment effects (ATEs)  •  ATEs on the treated (ATET)  •  potential-outcome means (POMs)  •  continuous, binary, count, fractional, and survival outcomes  •  panel data  •  lasso  •  causal mediation analysis  •  more

Lasso

lasso  •  elastic net  •  model selection  •  prediction  •  inference  •  continuous, binary, count, and survival outcomes  •  cross-validation  •  adaptive lasso  •  double selection  •  partialing out  •  cross-fit partialing out  •  double machine learning  •  endogenous covariates  •  treatment effects  •  more

SEM (structural equation modeling)

graphical path diagram builder  •  standardized and unstandardized estimates  •  modification indices  •  direct and indirect effects  •  continuous, binary, count, ordinal, and survival outcomes  •  multilevel models  •  random slopes and intercepts  •  factor scores, empirical Bayes, and other predictions  •  groups and tests of invariance  •  goodness of fit  •  handles MAR data by FIML  •  correlated data  •  survey data  •  more

Latent class analysis

binary, ordinal, continuous, count, categorical, fractional, and survival items  •  add covariates to model class membership  •  combine with SEM path models  •  expected class proportions  •  goodness of fit  •  predictions of class membership  •  more

Multiple imputation

nine univariate imputation methods  •  multivariate normal imputation  •  chained equations  •  explore pattern of missingness  •  manage imputed datasets  •  fit model and pool results  •  transform parameters  •  joint tests of parameter estimates  •  predictions  •  more

Survey methods

multistage designs  •  bootstrap, BRR, jackknife, linearized, and SDR variance estimation  •  poststratification  •  raking  •  calibration  •  DEFF  •  predictive margins  •  means, proportions, ratios, totals  •  summary tables  •  almost all estimators supported  •  more

IRT (item response theory)

binary (1PL, 2PL, 3PL), ordinal, and categorical response models  •  item characteristic curves  •  test characteristic curves  •  item information functions  •  test information functions  •  multiple-group models  •  differential item functioning (DIF)  •  more

Data manipulation

data transformations  •  data frames  •  match-merge  •  import/export data  •  JDBC  •  ODBC  •  SQL  •  Unicode  •  by-group processing  •  append files  •  sort  •  row–column transposition  •  labeling  •  save results  •  more

Reporting

reproducible reports  •  customizable tables  •  graphical tables builder  •  Word  •  Excel  •  PDF  •  HTML  •  dynamic documents  •  Markdown  •  Stata results and graphs  •  SVG  •  EPS  •  PNG  •  TIF  •  more

Graphics

lines  •  bars  •  areas  •  ranges  •  contours  •  confidence intervals  •  interaction plots  •  survival plots  •  publication quality  •  customize anything  •  Graph Editor  •  more

Programming features

adding new commands  •  scripting  •  object-oriented programming  •  menu and dialog-box programming  •  dynamic documents  •  Markdown  •  Project Manager  •  Python integration  •  PyStata  •  Jupyter notebook  •  Java integration  •  Java plugins  •  H2O access  •  C/C++ plugins  •  more

Mata—Stata's serious programming language

interactive sessions  •  large-scale development projects  •  optimization  •  matrix inversions  •  decompositions  •  eigenvalues and eigenvectors  •  LAPACK engine  •  Intel® MKL  •  real and complex numbers  •  string matrices  •  interface to Stata datasets and matrices  •  numerical derivatives  •  object-oriented programming  •  more

Graphical user interface

menus and dialogs for all features  •  Data Editor  •  Variables Manager  •  Graph Editor  •  Project Manager  •  Do-file Editor  •  multiple preference sets  •  more

Documentation

35 manuals  •  18,000+ pages  •  seamless navigation  •  thousands of worked examples  •  quick starts  •  methods and formulas  •  references  •  more

Basic statistics

summaries  •  cross-tabulations  •  correlations  •  z and t tests  •  equality-of-variance tests  •  tests of proportions  •  confidence intervals  •  factor variables  •  more

Nonparametric methods

nonparametric regression  •  Wilcoxon–Mann–Whitney, Wilcoxon signed ranks, and Kruskal–Wallis tests  •  Cochran–Armitage and other trend tests  •  Spearman and Kendall correlations  •  Kolmogorov–Smirnov tests  •  exact binomial CIs  •  survival data  •  ROC analysis  •  smoothing  •  bootstrapping  •  more

Nonlinear regression, GMM and other systems of equations

generalized method of moments (GMM)  •  nonlinear regression  •  demand systems  •  more

Simple maximum likelihood

specify likelihood using simple expressions  •  no programming required  •  survey data  •  standard, robust, bootstrap, and jackknife SEs  •  matrix estimators  •  more

Programmable maximum likelihood

user-specified functions  •  NR, DFP, BFGS, BHHH  •  OIM, OPG, robust, bootstrap, and jackknife SEs  •  Wald tests  •  survey data  •  numeric or analytic derivatives  •  more

Other statistical methods

kappa measure of interrater agreement  •  Cronbach's alpha  •  stepwise regression  •  tests of normality  •  more

Functions

statistical  •  random-number  •  mathematical  •  string  •  date and time  •  regular expressions  •  Unicode  •  more

Internet capabilities

search and download thousands of community-contributed features (see below)  •  web updating  •  web file sharing  •  latest Stata news  •  more

Community-contributed features

search and download thousands of free additions  •  discover new features in the Stata Journal  •  share commands by posting to the SSC  •  discuss community-contributed features on Statalist  •  more

Embedded statistical computations

Numerics by Stata

Installation Qualification

IQ report for regulatory agencies such as the FDA  •  installation verification

FDA Compliance

Adherence to FDA regulatory requirements for statistical software

Accessibility

Section 508 compliance, accessibility for persons with disabilities

Sample session

A sample session of Stata for Mac, Unix, or Windows.

New in Stata 18 — Bayesian model averaging  •  Causal mediation analysis  •  Tables of descriptive statistics  •  Heterogeneous DID  •  Group sequential designs  •  Multilevel meta-analysis  •  Meta-analysis for prevalence  •  Robust inference for linear models  •  Wild cluster bootstrap  •  Local projections for IRFs  •  Flexible demand systems  •  TVCs with interval-censored Cox model  •  Lasso for Cox model  •  RERI  •  IV quantile regression  •  Alias variables across frames  •  All-new graph style  •  and more