Create a new H2O frame

Syntax

    _h2oframe create newframename [, options]

 options                        Description
 -----------------------------------------------------------------------------------
 Main
   rows(#)                      specify number of rows
   cols(#)                      specify number of columns

 Options
   realfraction(#)              specify the fraction of real columns
   intfraction(#)               specify the fraction of int columns
   catfraction(#)               specify the fraction of categorical columns
   binfraction(#)               specify the fraction of binary-valued categorical
                                  columns
   binonefraction(#)            specify the fraction of ones for
                                  binary-valued categorical columns
   timefraction(#)              specify the fraction of time columns
   strfraction(#)               specify the fraction of string columns

   missfraction(#)              specify the fraction of total entries in the frame
                                  to be missing
   realrange(#)                 specify the range of values for real columns
   intrange(#)                  specify the range of values for int columns
   factors(#)                   specify the number of factor levels in each
                                  categorical column

 Advanced
   response                     prepend an additional response column to the frame
   resfactors(#)                specify the number of factor levels in the response
                                  column
   norandomize                  specify not to generate the data values randomly
   value(#)                     specify the value for all numeric columns when
                                  norandomize is specified
   rseed(#)                     specify the random-number seed used to generate the
                                  random values
   rseedcoltype(#)              specify the random-number seed used to generate the
                                  random column types
   current                      make the new H2O frame the current (working) H2O
                                  frame
 -----------------------------------------------------------------------------------

Description

_h2oframe create creates a new H2O frame and fills it with generated data, which are random unless norandomize is specified. The new H2O frame may contain real, int, enum (categorical), time, and string columns. If you are not familiar with H2O frames, read "What is an H2O frame?" first.

Options

Main

rows(#) specifies the number of rows to generate in the destination H2O frame. The default is 10,000.

cols(#) specifies the number of columns to generate in the destination H2O frame. The default is 10.

If norandomize is specified, every numeric data value in the destination H2O frame equals the value specified in value(), except for entries that are set to missing when the fraction specified in missfraction() is not 0.
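
As a minimal sketch of rows() and cols(), the following commands request a small frame for quick experiments instead of the default 10,000 rows and 10 columns. The frame name smallframe is hypothetical, and the sketch assumes that an H2O cluster is already running and connected to Stata.

     . _h2oframe create smallframe, rows(500) cols(4)
     . _h2oframe change smallframe
     . _h2oframe describe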

Options

realfraction(#) specifies the fraction of real columns in the destination H2O frame. The default is 0.5.

intfraction(#) specifies the fraction of int columns in the destination H2O frame. The default is 0.2.

catfraction(#) specifies the fraction of enum (categorical) columns in the destination H2O frame. The default is 0.2.

binfraction(#) specifies the fraction of binary-valued enum columns in the destination H2O frame. The default is 0.1.

binonefraction(#) specifies the fraction of ones in a binary-valued enum column. The default is 0.02.

timefraction(#) specifies the fraction of time columns in the destination H2O frame. The default is 0.

strfraction(#) specifies the fraction of string columns in the destination H2O frame. The default is 0.

missfraction(#) specifies the fraction of total entries in the destination H2O frame to be missing. The default is 0 if norandomize is specified and is 0.01 otherwise.

realrange(#) specifies the range of data values for all real columns. The default is 100.0, which means that all data values in real columns are between -100.0 and 100.0, inclusive.

intrange(#) specifies the range of data values for all int columns. The default is 100, which means that all data values in int columns are between -100 and 100, inclusive.

factors(#) specifies the number of factor levels in each enum column. The default is 100.
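
The options in this group can be combined in a single call. The two sketches below are illustrative only, and the frame names mixedframe and narrowframe are hypothetical. In the first, the column-type fractions are chosen to sum to 1 (0.4 + 0.2 + 0.2 + 0.1 + 0.1), which is a cautious assumption rather than a documented requirement.

     . _h2oframe create mixedframe, realfraction(0.4) intfraction(0.2) catfraction(0.2) binfraction(0.1) strfraction(0.1)
     . _h2oframe change mixedframe
     . _h2oframe describe

In the second, about 5% of all entries are requested to be missing, real values are limited to between -10 and 10, int values are limited to between -5 and 5, and each enum column has 4 factor levels.

     . _h2oframe create narrowframe, missfraction(0.05) realrange(10) intrange(5) factors(4)
     . _h2oframe change narrowframe
     . _h2oframe list in 1/10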

Advanced

response specifies that an additional response column be prepended to the destination H2O frame, which makes the total number of columns cols() + 1.

resfactors(#) specifies the number of factor levels in the response column added with the option response.
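
As a sketch of how response and resfactors() work together, the following call asks for 5 generated columns plus a response column with 3 factor levels, so the resulting frame should have 6 columns in total. The frame name respframe is hypothetical.

     . _h2oframe create respframe, cols(5) response resfactors(3)
     . _h2oframe change respframe
     . _h2oframe describe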

norandomize specifies not to randomly generate the data values in the numeric columns of the destination H2O frame.

value(#) specifies the value for the numeric columns of the destination H2O frame when norandomize is specified. The default is 0.

rseed(#) sets the random-number seed used to generate data values in the destination H2O frame. This option can be used to reproduce the data in the H2O frame.

rseedcoltype(#) sets the random-number seed used to generate column types in the destination H2O frame. This option can be used to reproduce the assignment of column types in the H2O frame.

current sets the new H2O frame as the current (working) H2O frame. This is the same as typing _h2oframe change newframename after the frame is created.
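
The seed options and current can be sketched together as follows. Both frames are created with the same values of rseed() and rseedcoltype(), so, based on the option descriptions above, they would be expected to contain the same column types and data values. The second call also specifies current, so no separate _h2oframe change is needed before inspecting it. The frame names seeded1 and seeded2 and the seed value 123 are arbitrary choices for illustration.

     . _h2oframe create seeded1, rseed(123) rseedcoltype(123)
     . _h2oframe create seeded2, rseed(123) rseedcoltype(123) current
     . _h2oframe describe
     . _h2oframe list in 1/10
     . _h2oframe change seeded1
     . _h2oframe list in 1/10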

Examples

 Create a new H2O frame with 10,000 rows and 10 columns
     . _h2oframe create frame1, rseed(17) rseedcoltype(17)
     . _h2oframe change frame1
     . _h2oframe describe
     . _h2oframe list in 1/10

 Same as above, but include a string column
     . _h2oframe create frame2, strfraction(0.1) rseed(17) rseedcoltype(17)
     . _h2oframe change frame2
     . _h2oframe describe
     . _h2oframe list in 1/10

 Create a new H2O frame with all real values set to 5
     . _h2oframe create frame3, norandomize value(5)
     . _h2oframe change frame3
     . _h2oframe describe
     . _h2oframe list in 1/10