Example 1: Basic usage
In this example, we use the %%stata magic command to call Stata from a notebook. Before getting started, we need to configure and import the pystata package to initialize Stata. See Configuration for more information on configuring the pystata package; below, we use the first method listed there. That method relies on the configuration module stata_setup, which is available from the Python Package Index (PyPI); it locates the pystata package and initializes Stata. Its config() function takes the Stata installation directory and the edition to use (here, "mp").
[1]:
import stata_setup
stata_setup.config("C:/Program Files/Stata17/", "mp")
___ ____ ____ ____ ____ ®
/__ / ____/ / ____/ 17.0
___/ / /___/ / /___/ MP—Parallel Edition
Statistics and Data Science Copyright 1985-2021 StataCorp LLC
StataCorp
4905 Lakeway Drive
College Station, Texas 77845 USA
800-STATA-PC https://www.stata.com
979-696-4600 stata@stata.com
Stata license: 10-user 4-core network perpetual
Serial number: 1
Licensed to: Stata Developer
StataCorp LLC
Notes:
1. Unicode is supported; see help unicode_advice.
2. More than 2 billion observations are allowed; see help obs_advice.
3. Maximum number of variables is set to 5,000; see help set_maxvar.
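The configuration step above fails if the installation directory is wrong or stata_setup is not installed. The following is a minimal sketch of a guard around that call. The helper name init_stata and its return convention are hypothetical; only stata_setup.config(path, edition) comes from the example above, and the edition codes "mp", "se", and "be" are assumed to be the accepted values.

```python
import os

# Edition codes assumed to be accepted by stata_setup.config.
VALID_EDITIONS = ("mp", "se", "be")


def init_stata(stata_dir, edition="mp"):
    """Try to initialize Stata; return True on success, False if unavailable.

    Hypothetical convenience wrapper, not part of pystata or stata_setup.
    """
    if edition not in VALID_EDITIONS:
        raise ValueError(f"edition must be one of {VALID_EDITIONS}, got {edition!r}")
    if not os.path.isdir(stata_dir):
        # The installation directory does not exist on this machine.
        return False
    try:
        import stata_setup  # installed from PyPI
    except ImportError:
        return False
    # Same call as in the cell above; prints the Stata banner on success.
    stata_setup.config(stata_dir, edition)
    return True
```

Calling init_stata("C:/Program Files/Stata17/", "mp") would then mirror the cell above, returning False instead of raising when the directory is missing.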
To illustrate calling Stata from Python, we use the German macroeconomic data discussed in Lütkepohl (2005). We are mainly interested in three variables: the first difference of the natural log of investment, dln_inv; the first difference of the natural log of income, dln_inc; and the first difference of the natural log of consumption, dln_consump. The values are recorded from the first quarter of 1960 through the fourth quarter of 1982.
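To make the variable definitions concrete, here is a small sketch of how a "first difference of the natural log" is computed. The input values are hypothetical and are not taken from the dataset; the function name dln is chosen only to echo the variable prefix.

```python
import math


def dln(levels):
    """Return the first difference of the natural log of a series.

    The first element is None because no previous value exists to difference against.
    """
    logs = [math.log(x) for x in levels]
    return [None] + [curr - prev for prev, curr in zip(logs, logs[1:])]


# Hypothetical levels growing by 10% each period (not from the dataset).
example_levels = [100.0, 110.0, 121.0]
growth = dln(example_levels)
# Each non-missing entry equals log(1.1), roughly the period-over-period growth rate.
```

Because the first entry has no predecessor, a differenced series always has one fewer usable observation than the series in levels.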
First, we load the dataset, describe its contents, and display its time-series settings in Stata.
[2]:
%%stata
use https://www.stata-press.com/data/r17/lutkepohl2
describe
tsset
. use https://www.stata-press.com/data/r17/lutkepohl2
(Quarterly SA West German macro data, Bil DM, from Lutkepohl 1993 Table E.1)
. describe
Contains data from https://www.stata-press.com/data/r17/lutkepohl2.dta
Observations: 92 Quarterly SA West German macro
data, Bil DM, from Lutkepohl
1993 Table E.1
Variables: 10 4 Dec 2020 14:31
-------------------------------------------------------------------------------
Variable Storage Display Value
name type format label Variable label
-------------------------------------------------------------------------------
inv int %8.0g Investment
inc int %8.0g Income
consump int %8.0g Consumption
qtr byte %tq Quarter
ln_inv float %9.0g Log investment
dln_inv float %9.0g First-difference of ln_inv
ln_inc float %9.0g Log income
dln_inc float %9.0g First-difference of ln_inc
ln_consump float %9.0g Log consumption
dln_consump float %9.0g First-difference of ln_consump
-------------------------------------------------------------------------------
Sorted by: qtr
. tsset
Time variable: qtr, 1960q1 to 1982q4
Delta: 1 quarter
.
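The %tq format shown by describe stores quarterly dates as the number of quarters elapsed since 1960q1. The sketch below reproduces that encoding in plain Python and checks that the range reported by tsset spans the 92 observations reported by describe. The function names tq and n_quarters are illustrative helpers, not part of pystata.

```python
def tq(year, quarter):
    """Quarters elapsed since 1960q1, mirroring Stata's quarterly date encoding."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    return (year - 1960) * 4 + (quarter - 1)


def n_quarters(start, end):
    """Number of quarters in the inclusive range from start to end."""
    return tq(*end) - tq(*start) + 1


# Range reported by tsset: 1960q1 to 1982q4.
total_obs = n_quarters((1960, 1), (1982, 4))

# Integer value behind a quarterly date such as 1978q4.
cutoff = tq(1978, 4)
```

Under this encoding 1960q1 maps to 0, so a condition on qtr is an ordinary integer comparison.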
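We then fit a vector autoregressive (VAR) model with the var command, restricting the sample to observations through the fourth quarter of 1978, including the first and second lags of each variable, and requesting the small-sample degrees-of-freedom adjustment (dfk).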
[3]:
%%stata
var dln_inv dln_inc dln_consump if qtr<=tq(1978q4), lags(1/2) dfk
Vector autoregression
Sample: 1960q4 thru 1978q4 Number of obs = 73
Log likelihood = 606.307 AIC = -16.03581
FPE = 2.18e-11 HQIC = -15.77323
Det(Sigma_ml) = 1.23e-11 SBIC = -15.37691
Equation Parms RMSE R-sq chi2 P>chi2
----------------------------------------------------------------
dln_inv 7 .046148 0.1286 9.736909 0.1362
dln_inc 7 .011719 0.1142 8.508289 0.2032
dln_consump 7 .009445 0.2513 22.15096 0.0011
----------------------------------------------------------------
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
dln_inv |
dln_inv |
L1. | -.3196318 .1254564 -2.55 0.011 -.5655218 -.0737419
L2. | -.1605508 .1249066 -1.29 0.199 -.4053633 .0842616
|
dln_inc |
L1. | .1459851 .5456664 0.27 0.789 -.9235013 1.215472
L2. | .1146009 .5345709 0.21 0.830 -.9331388 1.162341
|
dln_consump |
L1. | .9612288 .6643086 1.45 0.148 -.3407922 2.26325
L2. | .9344001 .6650949 1.40 0.160 -.369162 2.237962
|
_cons | -.0167221 .0172264 -0.97 0.332 -.0504852 .0170409
-------------+----------------------------------------------------------------
dln_inc |
dln_inv |
L1. | .0439309 .0318592 1.38 0.168 -.018512 .1063739
L2. | .0500302 .0317196 1.58 0.115 -.0121391 .1121995
|
dln_inc |
L1. | -.1527311 .1385702 -1.10 0.270 -.4243237 .1188615
L2. | .0191634 .1357525 0.14 0.888 -.2469067 .2852334
|
dln_consump |
L1. | .2884992 .168699 1.71 0.087 -.0421448 .6191431
L2. | -.0102 .1688987 -0.06 0.952 -.3412354 .3208353
|
_cons | .0157672 .0043746 3.60 0.000 .0071932 .0243412
-------------+----------------------------------------------------------------
dln_consump |
dln_inv |
L1. | -.002423 .0256763 -0.09 0.925 -.0527476 .0479016
L2. | .0338806 .0255638 1.33 0.185 -.0162235 .0839847
|
dln_inc |
L1. | .2248134 .1116778 2.01 0.044 .005929 .4436978
L2. | .3549135 .1094069 3.24 0.001 .1404798 .5693471
|
dln_consump |
L1. | -.2639695 .1359595 -1.94 0.052 -.5304451 .0025062
L2. | -.0222264 .1361204 -0.16 0.870 -.2890175 .2445646
|
_cons | .0129258 .0035256 3.67 0.000 .0060157 .0198358
------------------------------------------------------------------------------
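Several numbers in the output above can be checked by hand. The sketch below recomputes the z statistic, the two-sided p-value, and the 95% confidence interval for the L1.dln_inv coefficient in the dln_inv equation from its displayed coefficient and standard error, along with the parameter and observation counts. It assumes the reported interval is a normal-based (Wald) interval, which is consistent with the z column; the helper name wald_summary is hypothetical.

```python
from statistics import NormalDist


def wald_summary(coef, se, level=0.95):
    """Return (z, two-sided p-value, (lower, upper)) using the standard normal."""
    std_normal = NormalDist()
    z = coef / se
    p = 2 * (1 - std_normal.cdf(abs(z)))
    crit = std_normal.inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    return z, p, (coef - crit * se, coef + crit * se)


# Coefficient and standard error of L1.dln_inv in the dln_inv equation.
z, p, (lower, upper) = wald_summary(-0.3196318, 0.1254564)

# Each equation has 3 variables x 2 lags plus a constant.
n_vars, n_lags = 3, 2
params_per_equation = n_vars * n_lags + 1

# Quarters from 1960q4 through 1978q4, inclusive.
n_obs = (1978 - 1960) * 4 + 1
```

The results agree with the table up to rounding of the displayed inputs. The sample starting at 1960q4 is consistent with one observation being lost to differencing and two more to the lags.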
Next, we use irf create to estimate impulse–response functions and forecast-error variance decompositions over 10 steps and save them under the name order1 in the IRF file myirf1.irf. Then, we graph the orthogonalized impulse–response function, using dln_inc as the impulse variable and dln_consump as the response variable.
[4]:
%%stata
irf create order1, step(10) set(myirf1, replace)
irf graph oirf, impulse(dln_inc) response(dln_consump)
. irf create order1, step(10) set(myirf1, replace)
(file myirf1.irf created)
(file myirf1.irf now active)
(file myirf1.irf updated)
. irf graph oirf, impulse(dln_inc) response(dln_consump)
.
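Outside a notebook cell, the same commands can be sent to Stata from ordinary Python code. The sketch below assumes Stata has already been initialized with stata_setup.config and uses the run function from pystata's stata module. The helper name run_irf_steps and its runner parameter are hypothetical; the parameter exists only so that the function can be exercised without a Stata installation.

```python
# Commands from the cell above, kept as one multi-line string.
IRF_COMMANDS = """
irf create order1, step(10) set(myirf1, replace)
irf graph oirf, impulse(dln_inc) response(dln_consump)
""".strip()


def run_irf_steps(runner=None):
    """Send the IRF commands to Stata and return the command text that was sent.

    runner is any callable accepting a command string; when omitted,
    pystata's stata.run is used, which requires an initialized Stata.
    """
    if runner is None:
        from pystata import stata  # available after stata_setup.config(...)
        runner = stata.run
    runner(IRF_COMMANDS)
    return IRF_COMMANDS
```

With Stata initialized, run_irf_steps() should execute both commands in the current Stata session; passing a stand-in callable such as a list's append method records the commands instead of running them.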