Stata and H2O integration
In this documentation, we discuss how to integrate H2O into Stata. H2O is a scalable and distributed open-source machine learning and predictive analytics platform. You can read more about H2O at http://docs.h2o.ai/.
We have been experimenting with connecting to H2O from official Stata. Typically, we keep such experiments in-house until we either develop them fully into something we release to users or shelve them, whether because they did not work out the way we wanted or because our priorities changed.
We think H2O is an interesting platform, and we want both our users and ourselves to be able to explore connecting to it from Stata. So we are giving our users early access to our work. We welcome feedback.
The main command used to interact with H2O is _h2oframe. Notice the underscore; this signifies that this command is intended more for programmatic use. For the most part, it doesn’t return output or helpful error messages, and its syntax is intended more for programmers than end users. It can be used as an engine for wrappers that provide user-friendly output, error messages, and the like. What _h2oframe does provide is access to H2O along with returned results based on the actions that it performs.
Syntax and features are subject to change.
StataNow also offers a suite of h2oml commands for interacting with some of H2O’s machine learning methods. See [H2OML] h2oml for information on these commands.
Keep in mind that _h2oframe and h2oml provide access to H2O features. Though you are accessing these features via Stata commands, what they do is up to H2O and is outside of Stata.