Call Stata using API functions
You can also interact with Stata by using the config and stata modules from the pystata Python package. The config module defines functions for initializing and configuring Stata. The stata module defines functions for interacting with Stata. For more information about these two modules, see API functions.
Previously, we initialized Stata’s environment within Python. Once we have done that, we can use the stata module to call Stata. Below, we import the module:
[18]:
from pystata import stata
The run() function executes Stata commands. You can pass a single command or several commands in one multiline string.
[19]:
stata.run('sysuse auto, clear')
(1978 automobile data)
[20]:
stata.run('''
summarize
reg mpg price i.foreign
ereturn list
''')
.
. summarize
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
make | 0
price | 74 6165.257 2949.496 3291 15906
mpg | 74 21.2973 5.785503 12 41
rep78 | 69 3.405797 .9899323 1 5
headroom | 74 2.993243 .8459948 1.5 5
-------------+---------------------------------------------------------
trunk | 74 13.75676 4.277404 5 23
weight | 74 3019.459 777.1936 1760 4840
length | 74 187.9324 22.26634 142 233
turn | 74 39.64865 4.399354 31 51
displacement | 74 197.2973 91.83722 79 425
-------------+---------------------------------------------------------
gear_ratio | 74 3.014865 .4562871 2.19 3.89
foreign | 74 .2972973 .4601885 0 1
. reg mpg price i.foreign
Source | SS df MS Number of obs = 74
-------------+---------------------------------- F(2, 71) = 23.01
Model | 960.866305 2 480.433152 Prob > F = 0.0000
Residual | 1482.59315 71 20.8815937 R-squared = 0.3932
-------------+---------------------------------- Adj R-squared = 0.3761
Total | 2443.45946 73 33.4720474 Root MSE = 4.5696
------------------------------------------------------------------------------
mpg | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
price | -.000959 .0001815 -5.28 0.000 -.001321 -.000597
|
foreign |
Foreign | 5.245271 1.163592 4.51 0.000 2.925135 7.565407
_cons | 25.65058 1.271581 20.17 0.000 23.11512 28.18605
------------------------------------------------------------------------------
. ereturn list
scalars:
e(N) = 74
e(df_m) = 2
e(df_r) = 71
e(F) = 23.00749448574634
e(r2) = .3932401256962295
e(rmse) = 4.569638248831391
e(mss) = 960.8663049714787
e(rss) = 1482.593154487981
e(r2_a) = .3761482982510528
e(ll) = -215.9083177127538
e(ll_0) = -234.3943376482347
e(rank) = 3
macros:
e(cmdline) : "regress mpg price i.foreign"
e(title) : "Linear regression"
e(marginsok) : "XB default"
e(vce) : "ols"
e(depvar) : "mpg"
e(cmd) : "regress"
e(properties) : "b V"
e(predict) : "regres_p"
e(model) : "ols"
e(estat_cmd) : "regress_estat"
matrices:
e(b) : 1 x 4
e(V) : 4 x 4
e(beta) : 1 x 3
functions:
e(sample)
.
You can use the get_return(), get_ereturn(), and get_sreturn() functions to store Stata’s r(), e(), and s() results in Python as dictionaries.
[21]:
stata.get_ereturn()
[21]:
{'e(N)': 74.0,
'e(df_m)': 2.0,
'e(df_r)': 71.0,
'e(F)': 23.007494485746342,
'e(r2)': 0.39324012569622946,
'e(rmse)': 4.569638248831391,
'e(mss)': 960.8663049714787,
'e(rss)': 1482.5931544879809,
'e(r2_a)': 0.3761482982510528,
'e(ll)': -215.90831771275379,
'e(ll_0)': -234.39433764823468,
'e(rank)': 3.0,
'e(cmdline)': 'regress mpg price i.foreign',
'e(title)': 'Linear regression',
'e(marginsprop)': 'minus',
'e(marginsok)': 'XB default',
'e(vce)': 'ols',
'e(_r_z_abs__CL)': '|t|',
'e(_r_z__CL)': 't',
'e(depvar)': 'mpg',
'e(cmd)': 'regress',
'e(properties)': 'b V',
'e(predict)': 'regres_p',
'e(model)': 'ols',
'e(estat_cmd)': 'regress_estat',
'e(b)': array([[-9.59034169e-04, 0.00000000e+00, 5.24527100e+00,
2.56505843e+01]]),
'e(V)': array([[ 3.29592449e-08, 0.00000000e+00, -1.02918123e-05,
-2.00142479e-04],
[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
-0.00000000e+00],
[-1.02918123e-05, 0.00000000e+00, 1.35394617e+00,
-3.39072871e-01],
[-2.00142479e-04, -0.00000000e+00, -3.39072871e-01,
1.61691892e+00]]),
'e(beta)': array([[-0.4889233, 0. , 0.4172175]])}
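The dictionary returned by get_ereturn() can feed further computation in Python. As a minimal sketch, the snippet below recomputes the standard errors and t statistics of the regression table from e(b) and the diagonal of e(V). So that it runs without a Stata session, the numbers are copied by hand from the output above into plain Python lists (in a live session they would be NumPy arrays read from the dictionary). The labels and the helper name are illustrative assumptions, not part of pystata.

```python
import math

# Values copied from the get_ereturn() output above. In a live session they
# would come from the dictionary, e.g. results['e(b)'] and results['e(V)'].
labels = ['price', 'foreign (base level)', 'foreign (Foreign)', '_cons']  # illustrative labels
b = [-9.59034169e-04, 0.0, 5.24527100e+00, 2.56505843e+01]
v_diag = [3.29592449e-08, 0.0, 1.35394617e+00, 1.61691892e+00]  # diagonal of e(V)


def std_errors_and_t(coefs, variances):
    """Return a list of (std_err, t) pairs; t is None when the variance is 0."""
    out = []
    for coef, var in zip(coefs, variances):
        se = math.sqrt(var)
        t = coef / se if se > 0 else None  # omitted base level has zero variance
        out.append((se, t))
    return out


results = std_errors_and_t(b, v_diag)
for label, (se, t) in zip(labels, results):
    t_text = 'n/a' if t is None else f'{t:.2f}'
    print(f'{label:22s} se={se:.7f} t={t_text}')
```

The t values printed for price, Foreign, and _cons should agree with the t column of the regression table (-5.28, 4.51, and 20.17).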
You can also push Stata datasets to Python as NumPy arrays or pandas DataFrames. Below, we store Stata’s current dataset into a pandas DataFrame, myauto.
[22]:
myauto = stata.pdataframe_from_data()
myauto.head()
[22]:
 | make | price | mpg | rep78 | headroom | trunk | weight | length | turn | displacement | gear_ratio | foreign |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | AMC Concord | 4099 | 22 | 3.000000e+00 | 2.5 | 11 | 2930 | 186 | 40 | 121 | 3.58 | 0 |
1 | AMC Pacer | 4749 | 17 | 3.000000e+00 | 3.0 | 11 | 3350 | 173 | 40 | 258 | 2.53 | 0 |
2 | AMC Spirit | 3799 | 22 | 8.988466e+307 | 3.0 | 12 | 2640 | 168 | 35 | 121 | 3.08 | 0 |
3 | Buick Century | 4816 | 20 | 3.000000e+00 | 4.5 | 16 | 3250 | 196 | 40 | 196 | 2.93 | 0 |
4 | Buick Electra | 7827 | 15 | 4.000000e+00 | 4.0 | 20 | 4080 | 222 | 43 | 350 | 2.41 | 0 |
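In the output above, rep78 for the AMC Spirit appears as 8.988466e+307 rather than as a missing value. That number is Stata's system missing value for doubles (2^1023), passed through to the DataFrame unchanged. The sketch below, written in plain Python so that it runs without Stata or pandas, shows one way to map such values to NaN. The helper name and the constant are assumptions made for illustration and are not part of pystata.

```python
import math

# Assumption: Stata's system missing value "." for doubles is 2**1023, and
# extended missing values (.a to .z) are larger still.
STATA_DOUBLE_MISSING = 2.0 ** 1023


def stata_missing_to_nan(value):
    """Return NaN for values at or above Stata's missing threshold."""
    if isinstance(value, float) and value >= STATA_DOUBLE_MISSING:
        return math.nan
    return value


# First five rep78 values from the table above; the third one is the value
# that displays as 8.988466e+307.
rep78 = [3.0, 3.0, STATA_DOUBLE_MISSING, 3.0, 4.0]
cleaned = [stata_missing_to_nan(v) for v in rep78]
print(cleaned)
```

With pandas, the same idea could be applied column-wise after the DataFrame is created. pdataframe_from_data() is also documented as accepting a missingval argument that replaces missing values as the data are exported; check the API functions reference before relying on it.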
You can instead choose to store just a subset of the data in Python. Below, we store the first 10 observations of the variables mpg and price into a pandas DataFrame.
[23]:
stata.pdataframe_from_data('mpg price', range(10))
[23]:
 | mpg | price |
---|---|---|
0 | 22 | 4099 |
1 | 17 | 4749 |
2 | 22 | 3799 |
3 | 20 | 4816 |
4 | 15 | 7827 |
5 | 18 | 5788 |
6 | 26 | 4453 |
7 | 20 | 5189 |
8 | 16 | 10372 |
9 | 19 | 4082 |
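Note that the observation argument uses Python's zero-based indexing: range(10) returns rows 0 through 9, which Stata itself would call observations 1 through 10. As a small sketch of that correspondence, the hypothetical helper below (not part of pystata) converts a Stata-style `in first/last` range into the equivalent Python range.

```python
def stata_in_to_range(first, last):
    """Convert Stata's 1-based, inclusive 'in first/last' to a Python range."""
    if first < 1 or last < first:
        raise ValueError('expected 1 <= first <= last')
    return range(first - 1, last)


# Stata's "in 1/10" corresponds to Python indices 0 through 9.
obs = stata_in_to_range(1, 10)
print(list(obs))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The resulting range could then be passed as the second argument of pdataframe_from_data(), as in the cell above.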
Conversely, you can load data from Python into Stata, either as the current dataset or into a specific Stata frame. Below, we load the pandas DataFrame myauto into Stata as the current dataset and then list the first three observations. We specify force=True so that the dataset currently in Stata's memory is cleared before the DataFrame is loaded.
[24]:
stata.pdataframe_to_data(myauto, force=True)
stata.run('list in 1/3')
+------------------------------------------------------------------------+
1. | make | price | mpg | rep78 | headroom | trunk | weight | length |
| AMC Concord | 4099 | 22 | 3 | 2.5 | 11 | 2930 | 186 |
|------------------------------------------------------------------------|
| turn | displa~t | gear_ra~o | foreign |
| 40 | 121 | 3.5799999 | 0 |
+------------------------------------------------------------------------+
+------------------------------------------------------------------------+
2. | make | price | mpg | rep78 | headroom | trunk | weight | length |
| AMC Pacer | 4749 | 17 | 3 | 3 | 11 | 3350 | 173 |
|------------------------------------------------------------------------|
| turn | displa~t | gear_ra~o | foreign |
| 40 | 258 | 2.53 | 0 |
+------------------------------------------------------------------------+
+------------------------------------------------------------------------+
3. | make | price | mpg | rep78 | headroom | trunk | weight | length |
| AMC Spirit | 3799 | 22 | . | 3 | 12 | 2640 | 168 |
|------------------------------------------------------------------------|
| turn | displa~t | gear_ra~o | foreign |
| 35 | 121 | 3.0799999 | 0 |
+------------------------------------------------------------------------+
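In the listing above, gear_ratio appears as 3.5799999 and 3.0799999 where the DataFrame showed 3.58 and 3.08. A likely explanation, offered here as an assumption rather than something the output itself confirms, is that the original dataset stores gear_ratio in single precision, and the round trip exposes the nearest single-precision value once the variable is no longer displayed with a two-decimal format. The sketch below uses only the standard library to show what these values become when rounded to single precision; the helper name is illustrative.

```python
import struct


def to_float32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack('f', struct.pack('f', x))[0]


# gear_ratio values of the first three observations.
for value in (3.58, 2.53, 3.08):
    print(value, '->', format(to_float32(value), '.8g'))
# 3.58 -> 3.5799999
# 2.53 -> 2.53
# 3.08 -> 3.0799999
```

The pattern matches the listing: 3.58 and 3.08 pick up trailing digits, while 2.53 still prints as 2.53 at this number of significant digits.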