Configuration
The pystata Python package allows you to call Stata from within Python. Below, we list the programs and packages you will need to use the pystata package, and then we discuss different methods you can use to configure it.
Requirements
To call Stata from within Python by using the pystata package, the following combination is needed:
Dependencies
To use the pystata package with full functionality, the following Python packages will need to be installed: numpy, pandas, and ipython.
Configuration
The pystata Python package is shipped with Stata and located in the pystata subdirectory of the utilities folder in Stata’s installation directory. For example, if you install Stata in C:\Program Files\Stata17, then the pystata package will be located in the C:\Program Files\Stata17\utilities\pystata\ directory. The package is placed there for convenience, to avoid conflicts between official updates to Stata and updates to the pystata Python package. Stata’s installation directory is stored in the c(sysdir_stata) macro. You can type the following in Stata to view the name of this directory:
. display c(sysdir_stata)
When you first try to import the pystata package in your Python environment, an exception will be raised stating that no module is named pystata. Python cannot locate it because the pystata package is stored in Stata’s installation directory, which is not on Python’s module search path. (You can inspect sys.path, from Python’s sys module, to see the list of directories in this search path.)
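The exception described above can be reproduced and inspected with the standard library alone. The following is a minimal sketch; the helper name try_import is hypothetical and is not part of pystata.

```python
import importlib
import sys


def try_import(name):
    """Try to import `name`; return None on success or the error message on failure."""
    try:
        importlib.import_module(name)
    except ModuleNotFoundError as exc:
        # Raised when no directory on sys.path contains the module.
        return str(exc)
    return None


if __name__ == "__main__":
    # Directories Python searches, in order, when resolving an import.
    for entry in sys.path:
        print(entry)
    # Before any configuration, this is expected to report a missing module.
    print(try_import("pystata"))
```

If none of the printed directories contains a pystata subdirectory, the import will fail until one of the methods below is applied.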
There are, however, several ways to import the pystata package in the Python environment. Below, we show you four methods to configure the package. For simplicity, we will refer to the Stata installation directory as STATA_SYSDIR, meaning that the pystata subdirectory is located in the STATA_SYSDIR\utilities\ directory. When implementing one of the methods below, be sure to replace STATA_SYSDIR with the directory in which your copy of Stata is installed. If you get output similar to that shown below for your edition of Stata, it means that everything is configured properly.
  ___  ____  ____  ____  ____ ®
 /__    /   ____/   /   ____/      17.0
___/   /   /___/   /   /___/       MP—Parallel Edition
Statistics and Data Science Copyright 1985-2021 StataCorp LLC
StataCorp
4905 Lakeway Drive
College Station, Texas 77845 USA
800-STATA-PC https://www.stata.com
979-696-4600 stata@stata.com
Stata license: 10-user 4-core network perpetual
Serial number: 1
Licensed to: Stata Developer
StataCorp LLC
Notes:
1. Unicode is supported; see help unicode_advice.
2. More than 2 billion observations are allowed; see help obs_advice.
3. Maximum number of variables is set to 5,000; see help set_maxvar.
Otherwise, send your output to our technical support team at tech-support@stata.com.
Method 1: Installing via pip
To enable Python to find Stata’s installation path and the pystata package, we provide the Python module stata_setup. The config() function defined in this module locates the pystata package. This function takes two required arguments: the first one is Stata’s installation path and the second one is the edition to use. The edition argument can be one of mp, se, or be, which represent Stata/MP, Stata/SE, and Stata/BE editions, respectively.
The simplest way to install this setup module is from the Python Package Index (PyPI) with the Python package manager pip. Open a Windows Command Prompt and type
> pip install --upgrade --user stata_setup
Or open a macOS or Unix terminal and type
$ pip install --upgrade --user stata_setup
This will install the stata_setup module and the dependencies for the pystata package.
The other way to install the stata_setup module is to download the source code, which is stata_setup-0.1.3.zip for Windows and stata_setup-0.1.3.tar.gz for Linux and macOS.
After you download it to your local drive, change into that directory. In the Windows Command Prompt, type
> pip install stata_setup-0.1.3.zip
Or in a macOS or Unix terminal, type
$ pip install stata_setup-0.1.3.tar.gz
Suppose your Stata is installed in STATA_SYSDIR and you have the Stata/MP edition. You can configure Stata within the Python environment as follows:
>>> import stata_setup
>>> stata_setup.config('STATA_SYSDIR', 'mp')
If Stata is configured correctly, stata_setup.config() will display the splash screen shown above, with Stata’s logo and initialization message. To suppress these messages, set the splash argument to False, as follows:
>>> stata_setup.config('STATA_SYSDIR', 'mp', splash=False)
By default, splash is True. This argument was added in version 0.1.3.
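Before calling stata_setup.config(), it can help to check the two arguments. The following is a minimal sketch, assuming the directory layout described earlier (pystata inside the utilities folder); the names config_problems and configure are hypothetical helpers, not part of stata_setup.

```python
import os

# Edition codes listed above for Stata/MP, Stata/SE, and Stata/BE.
VALID_EDITIONS = ("mp", "se", "be")


def config_problems(stata_dir, edition):
    """Return a list of problems found; an empty list means the arguments look usable."""
    problems = []
    if edition not in VALID_EDITIONS:
        problems.append(f"edition {edition!r} is not one of {VALID_EDITIONS}")
    pystata_dir = os.path.join(stata_dir, "utilities", "pystata")
    if not os.path.isdir(pystata_dir):
        problems.append(f"no pystata directory at {pystata_dir}")
    return problems


def configure(stata_dir, edition, splash=True):
    """Check the arguments, then hand them to stata_setup.config()."""
    problems = config_problems(stata_dir, edition)
    if problems:
        raise ValueError("; ".join(problems))
    # Imported here so the checks above also work where stata_setup is not installed.
    import stata_setup
    stata_setup.config(stata_dir, edition, splash=splash)
```

Calling configure('STATA_SYSDIR', 'mp', splash=False), with STATA_SYSDIR replaced by your installation directory, would then behave like the direct call shown above while reporting an obviously wrong path or edition first.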
Method 2: Adding pystata to sys.path
The most direct way to locate the pystata package is to add the directory that contains the pystata subdirectory to Python’s module search path. In your Python environment, you can type
>>> import sys
>>> sys.path.append('STATA_SYSDIR/utilities')
>>> from pystata import config
>>> config.init('mp')
If it is configured correctly, config.init() should return with no error and display the splash screen shown above, with Stata’s logo and initialization message. If you want to suppress those messages, you can set the splash argument to False. See The config module for more information.
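When the same session may run this step more than once, the path can be added only if it is not already present. The following is a minimal sketch; add_utilities_to_path is a hypothetical helper name.

```python
import os
import sys


def add_utilities_to_path(stata_dir):
    """Append STATA_SYSDIR/utilities to sys.path once and return the path used."""
    utilities = os.path.join(stata_dir, "utilities")
    if utilities not in sys.path:
        sys.path.append(utilities)
    return utilities


# Intended use (replace STATA_SYSDIR with your installation directory):
#   add_utilities_to_path("STATA_SYSDIR")
#   from pystata import config
#   config.init("mp")
```

Because the change is made to sys.path at run time, it lasts only for the current Python session.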
Method 3: Changing your current working directory
In the Python environment, the current working directory is automatically on the module search path, so you can also locate the pystata package by changing your current working directory to STATA_SYSDIR\utilities\.
>>> import os
>>> os.chdir('STATA_SYSDIR/utilities')
>>> from pystata import config
>>> config.init('mp')
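This method relies on the current working directory being part of the search path, which holds in an interactive session, where the first entry of sys.path is an empty string. When Python runs a script file, the first entry is the script's own directory instead, so changing the working directory may not be enough. The following is a minimal sketch for checking this; cwd_on_search_path is a hypothetical helper name.

```python
import os
import sys


def cwd_on_search_path():
    """Return True if some sys.path entry resolves to the current working directory."""
    cwd = os.path.realpath(os.getcwd())
    for entry in sys.path:
        # An empty-string entry stands for the current working directory.
        candidate = os.getcwd() if entry == "" else entry
        if os.path.realpath(candidate) == cwd:
            return True
    return False
```

If this returns False in your environment, Method 2 or Method 4 is likely the more reliable choice.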
Method 4: Editing PYTHONPATH
PYTHONPATH is a Python environment variable storing a list of paths that are added to the default module search path when the Python environment is initialized. So, you can add STATA_SYSDIR\utilities to PYTHONPATH to locate the pystata package directly, without having to manipulate sys.path or change your current working directory. Note that you need to configure PYTHONPATH only once; after that, STATA_SYSDIR\utilities will be added to the module search path by default. This is more convenient than the two methods shown above, which would require you to manipulate sys.path or change your current working directory every time you want to import the pystata package in the Python environment.
Windows users can use the following steps to set this environment variable:
For Windows 10 users, open the Control Panel, click on the System and Security link, and then click on the System link. Then click on the Advanced system settings link and select the Environment Variables… button. The process may be different with other Windows systems.
Under the User variables section for your login ID, click on New…, enter PYTHONPATH for the variable name, and specify STATA_SYSDIR\utilities for the variable value.
Click on OK to close the New User Variable window, OK to close the Environment Variables window, and OK again to close the System Properties window.
For Linux and macOS users, you can set it permanently in your ~/.bashrc or ~/.bash_profile file,
$ export PYTHONPATH=STATA_SYSDIR/utilities:$PYTHONPATH
or in your ~/.cshrc file,
$ setenv PYTHONPATH STATA_SYSDIR/utilities:${PYTHONPATH}
After you are done, you can check whether it was set successfully by typing in the Windows Command Prompt
> echo %PYTHONPATH%
or in a macOS or Unix terminal
$ echo $PYTHONPATH
Next, in your Python environment, you can type
>>> from pystata import config
>>> config.init('mp')
to check whether the pystata package was located and Stata was initialized successfully.
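The same check can be made from inside Python by reading the environment variable directly. The following is a minimal sketch; pythonpath_entries and utilities_on_pythonpath are hypothetical helper names.

```python
import os


def pythonpath_entries(environ=None):
    """Split PYTHONPATH into its non-empty entries using the platform separator."""
    env = os.environ if environ is None else environ
    raw = env.get("PYTHONPATH", "")
    return [p for p in raw.split(os.pathsep) if p]


def _normalize(path):
    # Normalize separators (and case on Windows) so equivalent paths compare equal.
    return os.path.normcase(os.path.normpath(path))


def utilities_on_pythonpath(stata_dir, environ=None):
    """Return True if STATA_SYSDIR/utilities is listed in PYTHONPATH."""
    target = _normalize(os.path.join(stata_dir, "utilities"))
    return any(_normalize(p) == target for p in pythonpath_entries(environ))
```

Note that a Python session started before the variable was set will not see the new value; open a new terminal or Command Prompt first.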
Note
In contrast to the first configuration method, if you use any of the last three configuration methods, you will have to install the numpy, pandas, and ipython dependencies yourself. If you installed Python using a prepackaged distribution, you may already have them installed. If not, you can install them via the Windows Command Prompt by typing
> pip install --upgrade --user numpy pandas ipython
or via a macOS or Unix terminal by typing
$ pip install --upgrade --user numpy pandas ipython
See pip - The Python Package Installer for more information about installing Python packages.
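To see which of these dependencies are already available before running pip, the following minimal sketch can be used. The helper name missing_modules is hypothetical; note that the ipython package is imported under the name IPython.

```python
import importlib.util


def missing_modules(names=("numpy", "pandas", "IPython")):
    """Return the top-level module names from `names` that Python cannot locate."""
    # find_spec looks a module up on the search path without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All dependencies found.")
```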