Machine learning

With Stata, you have access to a variety of machine learning tools—supervised and unsupervised learning, regression and classification, Bayesian approaches, causal inference, and more.

Machine learning via H2O
Stata's integration of H2O machine learning provides a powerful, scalable, and user-friendly framework for applying modern machine learning techniques. Interact with an H2O cluster from within Stata to train and evaluate predictive models efficiently. Obtain predictions and use Shapley additive explanations (SHAP) values, partial dependence plots, and more to explain those predictions. You can use the h2oml commands with familiar Stata syntax or let the point-and-click Control Panel interface guide you through your end-to-end data-analysis process.

Lasso
With Stata's lasso and elastic-net features, you can perform model selection and prediction for continuous, binary, and count outcomes. Want to estimate effects and test coefficients? With lasso inferential methods, you can make inferences about variables of interest while lassos select the control variables for you. You can even account for endogenous covariates.
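
As a rough illustration of the two uses described above (prediction and inference), here is a minimal sketch based on the auto dataset that ships with Stata. The dataset, the variable lists, and the random-number seed are assumptions made for this example and are not taken from this page.

```stata
* Load an example dataset shipped with Stata (assumed here for illustration)
sysuse auto, clear

* Prediction: lasso for a continuous outcome; lambda is chosen by cross-validation
lasso linear price mpg headroom trunk weight length turn displacement gear_ratio, rseed(12345)

* List the variables the lasso selected
lassocoef

* Predicted values from the fitted lasso
predict price_hat

* Inference: double-selection lasso for the effect of mpg,
* letting the lasso choose among the candidate controls
dsregress price mpg, controls(headroom trunk weight length turn displacement gear_ratio)
```

With only 74 observations, the selected variables will be sensitive to the seed, so treat the output as a demonstration of syntax rather than a substantive result.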

Bayesian variable selection
The bayesselect command provides a flexible Bayesian approach to variable selection by using specially designed priors for coefficients, such as global–local shrinkage and spike-and-slab priors. It accounts for model uncertainty when estimating model parameters and allows you to perform Bayesian inference for regression coefficients. bayesselect is fully integrated into Stata's Bayesian suite and works seamlessly with all Bayesian postestimation routines, such as those for Bayesian predictions and diagnostics.

Bayesian model averaging
Perform Bayesian model averaging (BMA) with the bma suite to account for model uncertainty in your analysis. Perform model choice, inference, and prediction. With BMA, you can identify influential models and important predictors. You can explore model complexity, model fit, and predictive performance. And you can analyze how sensitive your results are to the assumptions about the importance of models and predictors.

Unsupervised learning
Discover unobserved groups in your data. You can use kmeans, kmedians, and hierarchical cluster analysis, and you can perform principal component analysis.
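
The methods named above could be tried out along the following lines. This is a minimal sketch that again assumes the auto dataset shipped with Stata; the variables, the number of groups, the seed, and the names z_*, grp3, ward, and ward3 are all illustrative choices.

```stata
* Load an example dataset shipped with Stata (assumed here for illustration)
sysuse auto, clear

* Standardize the variables so that no single one dominates the distance measure
foreach v of varlist mpg weight length {
    egen z_`v' = std(`v')
}

* k-means with 3 groups; the grouping variable takes the name given in name()
cluster kmeans z_mpg z_weight z_length, k(3) name(grp3) start(krandom(12345))
tabulate grp3

* Hierarchical alternative (Ward's linkage), then cut the tree into 3 groups
cluster wardslinkage z_mpg z_weight z_length, name(ward)
cluster generate ward3 = groups(3), name(ward)

* Principal component analysis of the same variables
pca mpg weight length
```

Replacing cluster kmeans with cluster kmedians gives the kmedians variant with the same options.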

Community-contributed commands
The Stata community has developed several machine learning commands that are easily downloadable. You can type search with keywords of interest to locate commands for support vector machines, neural networks, text mining, network analysis, and more.
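
For instance, keyword searches along these lines would list matching official and community-contributed material. The keywords are only examples, and the results depend on what is available at the time of the search.

```stata
* Look for community-contributed commands by keyword (example keywords only)
search support vector machine
search neural network
```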

See New in Stata 19 for the features added in that release.