Start a new H2O cluster, or connect to an existing H2O cluster from Stata
Manipulate data (H2O frames) on the H2O cluster from Stata
Create new H2O frames
Import or upload data files to new H2O frames
Put Stata's current dataset into a new H2O frame, or load H2O frames into Stata
Split, combine, and query H2O frames
Access the capabilities of H2O using various utility commands directly from Stata
With the integration of H2O, you can start a new H2O cluster from Stata on your local machine through the command
. h2o init
or connect to a local or remote H2O cluster through
. h2o connect [, ip(#.#.#.#) port(#)]
Stata provides other utility commands to interact with the cluster; see Start, connect, and query an H2O cluster for details.
Once the cluster is started or connected, you can manipulate data (H2O frames) on the cluster using a suite of _h2oframe commands. For example, you can create new H2O frames; import or upload data files to new H2O frames; put Stata's current dataset into a new H2O frame; load H2O frames into Stata and save them locally; or split, combine, and query H2O frames from within Stata. You can also combine those _h2oframe commands with Stata's extensive data management commands for additional data wrangling. See Work with H2O frames for a complete list of commands.
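As a rough illustration of the workflow described above, the following do-file sketch starts a local cluster, copies Stata's current dataset into an H2O frame, and splits that frame. Only h2o init is taken from the text above. The _h2oframe subcommands and options (put, change, describe, split, into(), split(), rseed()) and the frame names auto, train, and test are assumptions made for illustration, so check them against Work with H2O frames before relying on them.

```stata
* Sketch only: requires an H2O installation (and Java) that Stata can reach.

* Load a dataset that ships with Stata.
sysuse auto, clear

* Start a local H2O cluster.
h2o init

* ASSUMED syntax from here on; verify against the _h2oframe documentation.
* Copy Stata's current dataset into a new H2O frame named auto.
_h2oframe put, into(auto)

* Make auto the current H2O frame and list its columns.
_h2oframe change auto
_h2oframe describe

* Split auto into two new frames, 80% and 20% of the rows,
* with a seed so that the split is reproducible.
_h2oframe split auto, into(train test) split(0.8 0.2) rseed(19)
```

The data in the frames auto, train, and test live on the H2O cluster, not in Stata's memory, so ordinary Stata commands do not see them until a frame is loaded back into Stata.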
Although this feature is still experimental, we want to make it available for our users to try out. Because it is experimental, its syntax and features are subject to change.
In addition to the commands for connecting to an H2O cluster and working with H2O frames, StataNow also offers a suite of h2oml commands for interacting with some of H2O's machine learning methods. See [H2OML] h2oml for information on these commands.
When using Stata commands that provide access to a given feature of H2O, keep in mind that it is an H2O feature. A Stata command may provide access to it, but what the feature does is determined by H2O and is outside of Stata's control.
H2O.ai. (2021) H2O: Scalable Machine Learning Platform. Version 3.46.0.6. https://github.com/h2oai/h2o-3