Frame (sfi.Frame)
class sfi.Frame
This class provides access to Stata frames. Functionality is provided by wrapping a Stata frame in a Python object of type Frame, which provides many methods for accessing the underlying Stata frame. If the underlying frame is renamed from Stata, Mata, etc., then access to the frame from its object will be lost. For more information about Stata frames, see help frames in Stata.
All variable and observation numbering begins at 0. The allowed values for the variable index var and the observation index obs are
-nvar <= var < nvar
and
-nobs <= obs < nobs
Here nvar is the number of variables defined in the underlying Stata frame, which is returned by getVarCount(), and nobs is the number of observations defined in the underlying Stata frame, which is returned by getObsTotal().
Negative values for var and obs are allowed and are interpreted in the usual way for Python indexing. In all functions that take var as an argument, var can be specified as either the variable index or the variable name. Passing the variable index is more efficient because it avoids looking up the index for the variable name on each function call.
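The index rule above can be sketched in plain Python. The helper name normalize_index is hypothetical and not part of sfi; it only mirrors the documented range check and the usual Python treatment of negative indices.

```python
def normalize_index(i, n):
    """Map an index in the documented range -n <= i < n to a position in 0..n-1.

    Hypothetical helper for illustration; it is not part of the sfi module.
    """
    if not -n <= i < n:
        raise ValueError(f"index {i} is out of range for size {n}")
    # Negative indices count back from the end, as in Python lists.
    return i + n if i < 0 else i


# With 12 variables, -1 names the last variable and 0 the first.
last = normalize_index(-1, 12)
first = normalize_index(0, 12)
```

When a loop touches the same variable many times, resolving its name once with getVarIndex() and passing the integer afterwards avoids the repeated name lookup mentioned above.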
Method Summary
connect(name) – Connect to an existing frame in Stata and return a new Frame instance that can be used to access it.
create(name) – Create a new frame in Stata and return a new Frame instance that can be used to access it.
addObs(n[, nofill]) – Add n observations to the frame.
addVarByte(name) – Add a variable of type byte to the frame.
addVarDouble(name) – Add a variable of type double to the frame.
addVarFloat(name) – Add a variable of type float to the frame.
addVarInt(name) – Add a variable of type int to the frame.
addVarLong(name) – Add a variable of type long to the frame.
addVarStr(name, length) – Add a variable of type str to the frame.
addVarStrL(name) – Add a variable of type strL to the frame.
allocateStrL(sc, size[, binary]) – Allocate a strL so that a buffer can be stored using writeBytes(); the contents of the strL will not be initialized.
changeToCWF() – Set the Frame as the current working frame in Stata.
clone(newName) – Create a new Frame instance by cloning the current Frame and its contents.
drop() – Drop the frame in Stata.
dropVar(var) – Drop the specified variables from the frame.
get([var, obs, selectvar, valuelabel, …]) – Read values from the frame.
getAsDict([var, obs, selectvar, valuelabel, …]) – Read values from the frame and store them in a dictionary.
getAt(var, obs) – Read a value from the frame.
getFormattedValue(var, obs, bValueLabel) – Read a value from the frame, applying its display format.
getFrameAt(index) – Utility method for getting the name of a Stata frame at a given index.
getFrameCount() – Utility method for getting the number of frames in Stata.
getObsTotal() – Get the number of observations in the frame.
getStrVarWidth(var) – Get the width of the variable of type str.
getVarCount() – Get the number of variables in the frame.
getVarFormat(var) – Get the format for the variable in the frame.
getVarIndex(name) – Look up the variable index for the specified name in the frame.
getVarLabel(var) – Get the label for the Stata variable.
getVarName(index) – Get the name for the variable in the frame.
getVarType(var) – Get the storage type for the variable in the frame, such as unknown, byte, int, long, float, double, strL, str18, etc.
isAlias(var) – Test if a variable is an alias to a variable in another frame.
isVarTypeStr(var) – Test if a variable is of type str.
isVarTypeString(var) – Test if a variable is of type string.
isVarTypeStrL(var) – Test if a variable is of type strL.
keepVar(var) – Keep the specified variables in the frame.
list([var, obs]) – List values from the frame.
readBytes(sc, length) – Read a sequence of bytes from a strL in the frame.
rename(newName) – Rename the frame in Stata.
renameVar(var, name) – Rename a variable.
setObsTotal(nobs) – Set the number of observations in the frame.
setVarFormat(var, format) – Set the format for a Stata variable.
setVarLabel(var, label) – Set the label for a Stata variable.
store(var, obs, val[, selectvar]) – Store values in the frame.
storeAt(var, obs, val) – Store a value in the frame.
storeBytes(sc, b, binary) – Store a byte buffer to a strL in the frame.
writeBytes(sc, b[, off, length]) – Write length bytes from the specified byte buffer starting at offset off to a strL in the frame; the strL must be allocated using allocateStrL() before calling this method.
Method Detail
classmethod connect(name)
Connect to an existing frame in Stata and return a new Frame instance that can be used to access it.
Parameters: name (str) – Name of an existing Stata frame.
Returns: A Frame that corresponds to the existing frame in Stata.
Return type: Frame
Raises: FrameError – This error can be raised if
- the frame name does not already exist in Stata.
- Python fails to connect to the frame.
classmethod create(name)
Create a new frame in Stata and return a new Frame instance that can be used to access it.
Parameters: name (str) – Name of the Stata frame to create.
Returns: A new Frame that corresponds to the new frame in Stata.
Return type: Frame
Raises: FrameError – If the creation of the new frame in Stata fails.
addObs(n, nofill=False)
Add n observations to the frame. By default, the added observations are filled with the appropriate missing-value code. If nofill is specified and equal to True, the added observations are not filled, which speeds up the process. Setting nofill to True is not recommended. If you choose this setting, it is your responsibility to ensure that the added observations are ultimately filled in or removed before control is returned to Stata.
There need not be any variables defined to add observations. If you are attempting to create a frame from nothing, you can add the observations first and then add the variables.
Parameters:
- n (int) – Number of observations to add.
- nofill (bool, optional) – Do not fill the added observations. Default is False.
Raises: ValueError – If the number of observations to add, n, exceeds the limit of observations.
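The "observations first, then variables" order described above might look like the following sketch. The helper name build_numeric_frame is hypothetical, and frame is assumed to be an object with the Frame methods documented here, for example the result of Frame.create().

```python
def build_numeric_frame(frame, nobs, names):
    """Add nobs observations first, then one double variable per name.

    `frame` is expected to behave like an sfi.Frame; the helper itself is
    hypothetical and only calls methods documented in this section.
    """
    # nofill keeps its default (False), so new cells hold missing values.
    frame.addObs(nobs)
    for name in names:
        frame.addVarDouble(name)
    return frame.getObsTotal(), frame.getVarCount()
```

Inside Stata this could be called as build_numeric_frame(Frame.create('scratch'), 10, ['x', 'y']), where the frame name scratch is only an example.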
addVarByte(name)
Add a variable of type byte to the frame.
Parameters: name (str) – Name of the variable to be created.
Raises: ValueError – If name is not a valid Stata variable name.

addVarDouble(name)
Add a variable of type double to the frame.
Parameters: name (str) – Name of the variable to be created.
Raises: ValueError – If name is not a valid Stata variable name.

addVarFloat(name)
Add a variable of type float to the frame.
Parameters: name (str) – Name of the variable to be created.
Raises: ValueError – If name is not a valid Stata variable name.

addVarInt(name)
Add a variable of type int to the frame.
Parameters: name (str) – Name of the variable to be created.
Raises: ValueError – If name is not a valid Stata variable name.

addVarLong(name)
Add a variable of type long to the frame.
Parameters: name (str) – Name of the variable to be created.
Raises: ValueError – If name is not a valid Stata variable name.

addVarStr(name, length)
Add a variable of type str to the frame.
Parameters:
- name (str) – Name of the variable to be created.
- length (int) – Initial size of the variable. If the length is greater than Data.getMaxStrLength(), then a variable of type strL will be created.
Raises: ValueError – This error can be raised if
- name is not a valid Stata variable name.
- length is not a positive integer.
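Choosing the length argument for addVarStr() can be sketched as follows. Both helper names are hypothetical, and the sketch assumes that a str width counts bytes of UTF-8 text rather than characters.

```python
def utf8_width(values):
    """Return the largest UTF-8 byte length among values, with a minimum of 1."""
    return max([len(v.encode("utf-8")) for v in values] + [1])


def add_text_var(frame, name, values):
    """Create a str variable wide enough for every value (hypothetical helper)."""
    width = utf8_width(values)
    # Per the text above, a width above Data.getMaxStrLength() yields a strL.
    frame.addVarStr(name, width)
    return width
```

The minimum of 1 keeps the call valid for an empty list, since length must be a positive integer.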
addVarStrL(name)
Add a variable of type strL to the frame.
Parameters: name (str) – Name of the variable to be created.
Raises: ValueError – If name is not a valid Stata variable name.

allocateStrL(sc, size, binary=True)
Allocate a strL so that a buffer can be stored using writeBytes(); the contents of the strL will not be initialized.
Parameters:
- sc (StrLConnector) – The StrLConnector representing a strL.
- size (int) – The size in bytes.
- binary (bool, optional) – Mark the data as binary. Note that if the data are not marked as binary, Stata expects that the data be UTF-8 encoded. An alternate approach is to call storeAt(), where the encoding is automatically handled. Default is True.
changeToCWF()
Set the Frame as the current working frame in Stata. The current working frame in Stata can be accessed using Data if desired.

clone(newName)
Create a new Frame instance by cloning the current Frame and its contents. This results in a new frame in Stata.
Parameters: newName (str) – The name of the new frame to be created.
Returns: A Frame that corresponds to the newly cloned frame in Stata.
Return type: Frame
Raises: FrameError – If the cloning of the frame fails.

drop()
Drop the frame in Stata. You may not drop a frame if it is the current working frame in Stata.

dropVar(var)
Drop the specified variables from the frame.
Parameters: var (int, str, or list-like) – Variables to drop. It can be specified as a single variable index or name, or an iterable of variable indices or names.
Raises: ValueError – If any of the variable indices or names specified in var is out of range or not found.
get(var=None, obs=None, selectvar=None, valuelabel=False, missingval=_DefaultMissing())
Read values from the frame.
Parameters:
- var (int, str, or list-like, optional) – Variables to access. It can be specified as a single variable index or name, or an iterable of variable indices or names. If var is not specified, all the variables are specified.
- obs (int or list-like, optional) – Observations to access. It can be specified as a single observation index or an iterable of observation indices. If obs is not specified, all the observations are specified.
- selectvar (int or str, optional) – Observations for which selectvar!=0 will be selected. If selectvar is an integer, it is interpreted as a variable index. If selectvar is a string, it should contain the name of a Stata variable. Specifying selectvar as "" has the same result as not specifying selectvar, which means no observations are excluded. Specifying selectvar as -1 means that observations with missing values for the variables specified in var are to be excluded.
- valuelabel (bool, optional) – Use the value label when available. Default is False.
- missingval (_DefaultMissing, optional) – If missingval is specified, all the missing values in the returned list are replaced by this value. If it is not specified, the numeric value of the corresponding missing value in Stata is returned.
Returns: A list of lists containing the values from the frame. Each sublist contains values for one observation.
Return type: List
Raises: ValueError
– This error can be raised if
getAsDict(var=None, obs=None, selectvar=None, valuelabel=False, missingval=_DefaultMissing())
Read values from the frame and store them in a dictionary. The keys are the variable names. The values are the data values for the corresponding variables.
Parameters:
- var (int, str, or list-like, optional) – Variables to access. It can be specified as a single variable index or name, or an iterable of variable indices or names. If var is not specified, all the variables are specified.
- obs (int or list-like, optional) – Observations to access. It can be specified as a single observation index or an iterable of observation indices. If obs is not specified, all the observations are specified.
- selectvar (int or str, optional) – Observations for which selectvar!=0 will be selected. If selectvar is an integer, it is interpreted as a variable index. If selectvar is a string, it should contain the name of a Stata variable. Specifying selectvar as "" has the same result as not specifying selectvar, which means no observations are excluded. Specifying selectvar as -1 means that observations with missing values for the variables specified in var are to be excluded.
- valuelabel (bool, optional) – Use the value label when available. Default is False.
- missingval (_DefaultMissing, optional) – If missingval is specified, all the missing values in the returned dictionary are replaced by this value. If it is not specified, the numeric value of the corresponding missing value in Stata is returned.
Returns: A dictionary containing the data values from the frame.
Return type: dictionary
Raises: ValueError
– This error can be raised if
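The relationship between the observation-wise result of get() and the variable-keyed result of getAsDict() can be sketched in plain Python. The function rows_to_columns is hypothetical, and the sketch assumes that each dictionary value is a list with one entry per observation; the sample rows reuse make and mpg values from the auto data.

```python
def rows_to_columns(rows, names):
    """Turn observation-wise rows into a {variable name: [values]} dictionary."""
    columns = {name: [] for name in names}
    for row in rows:
        if len(row) != len(names):
            raise ValueError("each row needs one value per variable name")
        for name, value in zip(names, row):
            columns[name].append(value)
    return columns


rows = [["AMC Concord", 22], ["AMC Spirit", 22], ["Buick Electra", 15]]
by_var = rows_to_columns(rows, ["make", "mpg"])
```

The row layout suits per-observation loops, while the dictionary layout is convenient when handing whole columns to another library.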
getAt(var, obs)
Read a value from the frame.
Parameters:
- var (int or str) – Variable to access. It can be specified as the variable index or name.
- obs (int) – Observation to access.
Returns: The value.
Return type: float or str
Raises: ValueError
– This error can be raised if
getFormattedValue(var, obs, bValueLabel)
Read a value from the frame, applying its display format.
Parameters:
- var (int or str) – Variable to access. It can be specified as the variable index or name.
- obs (int) – Observation to access.
- bValueLabel (bool) – Use the value label when available.
Returns: The formatted value as a string.
Return type: str
Raises: ValueError
– This error can be raised if
static getFrameAt(index)
Utility method for getting the name of a Stata frame at a given index.
Parameters: index (int) – The index for a frame.
Returns: The name of the frame for the specified index.
Return type: str

static getFrameCount()
Utility method for getting the number of frames in Stata.
Returns: The number of frames.
Return type: int
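The two static utilities combine naturally to enumerate every frame name. This is a minimal sketch; the function frame_names is hypothetical and takes the class as a parameter so that it does not import sfi itself.

```python
def frame_names(frame_cls):
    """List the names of all frames by walking indices 0..getFrameCount()-1.

    Pass the sfi.Frame class itself as frame_cls; both methods used here
    are the static utilities described above.
    """
    return [frame_cls.getFrameAt(i) for i in range(frame_cls.getFrameCount())]
```

Inside Stata, frame_names(Frame) would be the intended call after from sfi import Frame.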
getObsTotal()
Get the number of observations in the frame.
Returns: The number of observations.
Return type: int

getStrVarWidth(var)
Get the width of the variable of type str.
Parameters: var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns: The width of the variable.
Return type: int
Raises: ValueError – If var is out of range or not found.

getVarCount()
Get the number of variables in the frame.
Returns: The number of variables.
Return type: int

getVarFormat(var)
Get the format for the variable in the frame.
Parameters: var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns: The variable format.
Return type: str
Raises: ValueError – If var is out of range or not found.

getVarIndex(name)
Look up the variable index for the specified name in the frame.
Parameters: name (str) – Variable to access.
Returns: The variable index.
Return type: int
Raises: ValueError – If name is not found.

getVarLabel(var)
Get the label for the Stata variable.
Parameters: var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns: The variable label.
Return type: str
Raises: ValueError – If var is out of range or not found.

getVarName(index)
Get the name for the variable in the frame.
Parameters: index (int) – Variable to access.
Returns: The variable name at the given index.
Return type: str
Raises: ValueError – If index is out of range.

getVarType(var)
Get the storage type for the variable in the frame, such as unknown, byte, int, long, float, double, strL, str18, etc.
Parameters: var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns: The storage type of the variable.
Return type: str
Raises: ValueError – If var is out of range or not found.

isAlias(var)
Test if a variable is an alias to a variable in another frame.
Parameters: var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns: True if the variable is an alias.
Return type: bool
Raises: ValueError – If var is out of range or not found.

isVarTypeStr(var)
Test if a variable is of type str.
Parameters: var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns: True if the variable is of type str.
Return type: bool
Raises: ValueError – If var is out of range or not found.

isVarTypeString(var)
Test if a variable is of type string.
Parameters: var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns: True if the variable is of type str or strL.
Return type: bool
Raises: ValueError – If var is out of range or not found.

isVarTypeStrL(var)
Test if a variable is of type strL.
Parameters: var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns: True if the variable is of type strL.
Return type: bool
Raises: ValueError – If var is out of range or not found.

keepVar(var)
Keep the specified variables in the frame.
Parameters: var (int, str, or list-like) – Variables to keep. It can be specified as a single variable index or name, or an iterable of variable indices or names.
Raises: ValueError – If any of the variable indices or names specified in var is out of range or not found.

list(var=None, obs=None)
List values from the frame. The values are displayed using their corresponding variable formats.
Parameters:
- var (int, str, or list-like, optional) – Variables to display. It can be specified as a single variable index or name, or an iterable of variable indices or names. If var is not specified, all the variables are specified.
- obs (int or list-like, optional) – Observations to display. It can be specified as a single observation index or an iterable of observation indices. If obs is not specified, all the observations are specified.
Raises: ValueError
– This error can be raised if
readBytes(sc, length)
Read a sequence of bytes from a strL in the frame.
Parameters:
- sc (StrLConnector) – The StrLConnector representing a strL.
- length (int) – The number of bytes to read.
Returns: The array of bytes. An empty array of bytes is returned if there are no more data because the end has been reached.
Return type: bytes
Raises:
ValueError – If length is not a positive integer.
IOError – If a failure occurs while reading the sequence of bytes.
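Because readBytes() returns an empty result once the end of the strL is reached, a full read can be sketched as a loop. The helper read_all is hypothetical; it assumes that successive calls on the same StrLConnector continue where the previous call stopped, and it does not show how the connector is created.

```python
def read_all(frame, sc, chunk_size=4096):
    """Read a strL through its connector until readBytes() returns no data."""
    parts = []
    while True:
        chunk = frame.readBytes(sc, chunk_size)
        if not chunk:  # empty result signals the end of the strL
            break
        parts.append(bytes(chunk))
    return b"".join(parts)
```

Reading in bounded chunks keeps memory use predictable for large strL values; a larger chunk_size trades memory for fewer calls.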
rename(newName)
Rename the frame in Stata.
Parameters: newName (str) – The new name for the frame.

renameVar(var, name)
Rename a variable.
Parameters:
- var (str or int) – Name or index of the variable to rename.
- name (str) – New variable name.
Raises: ValueError – This error can be raised if
- var is not found or out of range.
- name is not a valid Stata variable name.

setObsTotal(nobs)
Set the number of observations in the frame.
Parameters: nobs (int) – The number of observations to set.
Raises: ValueError – If the number of observations to set, nobs, exceeds the limit of observations.

setVarFormat(var, format)
Set the format for a Stata variable.
Parameters:
- var (int or str) – Index or name of the variable to format.
- format (str) – New format.
Raises: ValueError – This error can be raised if
- var is out of range or not found.
- format is not a valid Stata format.

setVarLabel(var, label)
Set the label for a Stata variable.
Parameters:
- var (int or str) – Index or name of the variable to label.
- label (str) – New label.
Raises: ValueError – If var is out of range or not found.

store(var, obs, val, selectvar=None)
Store values in the frame.
Parameters:
- var (int, str, list-like, or None) – Variables to access. It can be specified as a single variable index or name, an iterable of variable indices or names, or None. If None is specified, all the variables are specified.
- obs (int, list-like, or None) – Observations to access. It can be specified as a single observation index, an iterable of observation indices, or None. If None is specified, all the observations are specified.
- val (array-like) – Values to store. The dimensions of val should match the dimensions implied by var and obs. Each of the values can be numeric or string based on the corresponding variable data types.
- selectvar (int or str, optional) – Only store values for observations with selectvar!=0. If selectvar is an integer, it is interpreted as a variable index. If selectvar is a string, it should contain the name of a Stata variable. Specifying selectvar as "" has the same result as not specifying selectvar, which means values are stored for all observations specified. Specifying selectvar as -1 means that observations with missing values for the variables specified in var are to be skipped.
Raises:
ValueError – This error can be raised if
TypeError – If any of the values specified in val does not match the corresponding variable data type.
storeAt(var, obs, val)
Store a value in the frame.
Parameters:
- var (int or str) – Variable to access. It can be specified as the variable index or name.
- obs (int) – Observation to access.
- val (float or str) – Value to store. The value data type depends on the corresponding variable data type.
Raises: ValueError
– This error can be raised if
storeBytes(sc, b, binary)
Store a byte buffer to a strL in the frame. You do not need to call allocateStrL() before using this method.
Parameters:
- sc (StrLConnector) – The StrLConnector representing a strL.
- b (bytes or bytearray) – Bytes to store.
- binary (bool) – Mark the data as binary.
writeBytes(sc, b, off=None, length=None)
Write length bytes from the specified byte buffer starting at offset off to a strL in the frame; the strL must be allocated using allocateStrL() before calling this method.
Parameters:
- sc (StrLConnector) – The StrLConnector representing a strL.
- b (bytes or bytearray) – The buffer holding the data to store.
- off (int, optional) – The offset into the buffer. If not specified, 0 is used.
- length (int, optional) – The number of bytes to write. If not specified, the size of b is used.
Raises: ValueError – This error can be raised if
- off is negative.
- length is not a positive integer.
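The allocate-then-write sequence described for allocateStrL() and writeBytes() might be sketched as follows. The helper write_in_chunks is hypothetical, and the sketch assumes that successive writeBytes() calls on the same connector land one after another in the strL; that behavior is not stated in this section.

```python
def write_in_chunks(frame, sc, payload, chunk_size=4096, binary=True):
    """Allocate a strL for payload, then fill it with successive writeBytes() calls."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    if not payload:
        raise ValueError("payload must not be empty")
    frame.allocateStrL(sc, len(payload), binary)
    written = 0
    while written < len(payload):
        length = min(chunk_size, len(payload) - written)
        # off indexes into the Python buffer, not into the strL.
        frame.writeBytes(sc, payload, written, length)
        written += length
    return written
```

When the whole buffer is already in memory, storeBytes() covers the same ground in a single call and needs no prior allocation.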
Examples
The following provides a few quick examples illustrating how to use this class:
>>> from sfi import Frame
>>> stata: sysuse auto, clear
(1978 Automobile Data)
>>> d = Frame.connect('default')
>>> f = d.clone('myauto')
>>> Frame.getFrameCount()
2
>>> Frame.getFrameAt(0)
'default'
>>> Frame.getFrameAt(1)
'myauto'
>>> f.get(0, 0)
[['AMC Concord']]
>>> f.getAt(0, 0)
'AMC Concord'
>>> f.get(var='price')
[4099, 4749, 3799, 4816, 7827, 5788, 4453, 5189, 10372, 4082, 11385, 14500, 15906, 3299, 5705,
4504, 5104, 3667, 3955, 3984, 4010, 5886, 6342, 4389, 4187, 11497, 13594, 13466, 3829, 5379,
6165, 4516, 6303, 3291, 8814, 5172, 4733, 4890, 4181, 4195, 10371, 4647, 4425, 4482, 6486,
4060, 5798, 4934, 5222, 4723, 4424, 4172, 9690, 6295, 9735, 6229, 4589, 5079, 8129, 4296, 5799,
4499, 3995, 12990, 3895, 3798, 5899, 3748, 5719, 7140, 5397, 4697, 6850, 11995]
>>>
>>> f.get(obs=0)
['AMC Concord', 4099, 22, 3, 2.5, 11, 2930, 186, 40, 121, 3.5799999237060547, 0]
>>>
>>> f.get([0,2,3], [0,2,4,6])
[['AMC Concord', 22, 3], ['AMC Spirit', 22, 8.98846567431158e+307], ['Buick Electra', 15, 4], ['Buick Opel', 26, 8.98846567431158e+307]]
>>>
>>> f.getVarLabel(0)
'Make and Model'
>>> f.getVarLabel('price')
'Price'
>>> f.setVarLabel(1, 'Retail Price')
>>> f.setVarLabel('mpg', 'Mileage per Gallon')
>>> f.renameVar(0, 'make2')
>>> f.renameVar('price', 'price2')
>>> f.dropVar("make2")
>>> f.dropVar("price2 mpg rep78")
>>> f.dropVar(0)
>>> f.dropVar([0,2,3])
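The value 8.98846567431158e+307 in the output above is how a missing rep78 appears when missingval is not specified. The following sketch post-processes such rows in plain Python; blank_missing is a hypothetical helper, and it assumes that every numeric value at or above this threshold is a Stata missing value.

```python
STATA_MISSING = 8.98846567431158e+307  # the value shown in the output above


def blank_missing(rows, replacement=None):
    """Replace numeric values at or above the missing-value threshold."""
    return [
        [replacement if isinstance(v, float) and v >= STATA_MISSING else v for v in row]
        for row in rows
    ]


rows = [["AMC Concord", 22, 3], ["AMC Spirit", 22, 8.98846567431158e+307]]
cleaned = blank_missing(rows)
```

The missingval argument of get() and getAsDict() performs this replacement at read time, which avoids the extra pass.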
Next, we show a more advanced example that illustrates how to communicate between Stata and Python using this class. Suppose we have a dataset in memory, and we want to create a new frame in Stata that clones the variables and data values from the dataset. Instead of using the clone() method shown above, we will create the frame from scratch using various functions in this class.
First, we load the data containing information on various automobiles into Stata.
. webuse auto, clear
(1978 Automobile Data)
. describe
Contains data from https://www.stata-press.com/data/r16/auto.dta
  obs:            74                          1978 Automobile Data
 vars:            12                          13 Apr 2018 17:45
                                              (_dta has notes)
--------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------
make            str18   %-18s                 Make and Model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair Record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn Circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear Ratio
foreign         byte    %8.0g      origin     Car type
--------------------------------------------------------------------------------
Sorted by: foreign
Then, we write a Python script file, say, frameex.py, that creates a new empty frame named myauto in Stata, and then it clones all the variables and data values from the current dataset in memory and stores them in this frame.
import sys
from sfi import Data, Frame

# clone variables
def clone_var(f):
    nvar = Data.getVarCount()
    for i in range(nvar):
        varname = Data.getVarName(i)
        vartype = Data.getVarType(i)
        if vartype=="byte":
            f.addVarByte(varname)
        elif vartype=="double":
            f.addVarDouble(varname)
        elif vartype=="float":
            f.addVarFloat(varname)
        elif vartype=="int":
            f.addVarInt(varname)
        elif vartype=="long":
            f.addVarLong(varname)
        elif vartype=="strL":
            f.addVarStrL(varname)
        else:
            f.addVarStr(varname, 10)
        f.setVarFormat(i, Data.getVarFormat(i))
        f.setVarLabel(i, Data.getVarLabel(i))

# clone data values
def clone_data(f):
    f.setObsTotal(Data.getObsTotal())
    nvar = Data.getVarCount()
    for i in range(nvar):
        f.store(i, None, Data.get(var=i))

# create the new frame; the frame name is passed through
# the args() option of -python script-
newFrame = sys.argv[1]
fr = Frame.create(newFrame)
clone_var(fr)
clone_data(fr)
Next, we run this script file in Stata, clear the dataset in memory, and load the frame myauto into Stata as the current working dataset.
. python script frameex.py, args("myauto")
. clear
. frames change myauto
. describe
Contains data
  obs:            74
 vars:            12
--------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------
make            str10   %-18s                 Make and Model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair Record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn Circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear Ratio
foreign         byte    %8.0g                 Car type
--------------------------------------------------------------------------------
Sorted by:
. frames change default
. frames reset
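In the describe output for myauto, make is stored as str10 rather than str18 because the script creates every str variable with a fixed length of 10. A variant that carries the source width over might look like the sketch below. The helper add_matching_var is hypothetical, and src is assumed to be a Frame (for example Frame.connect('default')), since getVarName(), getVarType(), and getStrVarWidth() are all documented for this class.

```python
def add_matching_var(dst, src, i):
    """Add to dst a variable with the same name and storage type as src's variable i."""
    name = src.getVarName(i)
    vartype = src.getVarType(i)
    numeric_adders = {
        "byte": dst.addVarByte,
        "int": dst.addVarInt,
        "long": dst.addVarLong,
        "float": dst.addVarFloat,
        "double": dst.addVarDouble,
    }
    if vartype in numeric_adders:
        numeric_adders[vartype](name)
    elif vartype == "strL":
        dst.addVarStrL(name)
    else:
        # Fixed-width string: reuse the source width instead of a constant.
        dst.addVarStr(name, src.getStrVarWidth(i))
    return name
```

The dictionary replaces the if/elif chain for the numeric types only; the string branches stay explicit because addVarStr() needs the extra length argument.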