Econometricians have begun to devote more attention to spatial interactions
when carrying out applied econometric studies. In part, this is motivated by
an explicit focus on spatial interactions in policy formulation or market
behavior, but it may also reflect concern about the role of omitted
variables that are or may be spatially correlated.
The classic models of
spatial autocorrelation or spatial error rely upon a predefined matrix of
spatial weights W. This matrix may be derived from an explicit model of
spatial interactions; alternatively, it can be viewed as a flexible
approximation to an unknown set of spatial links, much as a translog cost
function approximates an unknown cost function. With spatial panel data, it is possible, in
principle, to regard W as potentially estimable, though the number of
time periods would have to be large relative to the number of spatial panel
units unless severe restrictions are placed upon the structure of the spatial
interactions. While the estimation of W may be infeasible for most
real data, there is a strong, formal similarity between spatial panel models
and nonspatial panel models in which the variance–covariance matrix of
panel errors is not diagonal. One important variant of this type of model is
the random-coefficient model, in which slope coefficients differ across panel
units so that interest focuses on the mean slope coefficient across panel
units. In certain applications—for example, cross-country
(macro-)economic data—the assumption that reaction coefficients are
identical across panel units is not intuitively plausible. Instead of just
sweeping differences in coefficients into a general error term, the
random-coefficient model allows the analyst to focus on the common component
of responses to changes in the independent variables. At the same time, the
model also allows the analyst to retain the information about the error
structure associated with coefficients that are random across panel units but
constant over time for each panel unit.
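
The link between random coefficients and a nondiagonal error covariance can be written out in a few lines. This is a sketch in assumed notation (the symbols \beta_i, v_i, \Sigma, and u_{it} are not taken from the presentation), and it assumes that \varepsilon_{it} is uncorrelated over time and uncorrelated with v_i:

```latex
\begin{aligned}
y_{it} &= x_{it}'\beta_i + \varepsilon_{it}, \qquad i = 1,\dots,N,\; t = 1,\dots,T,\\
\beta_i &= \beta + v_i, \qquad \mathrm{E}[v_i] = 0, \quad \mathrm{Var}(v_i) = \Sigma,\\
y_{it} &= x_{it}'\beta + u_{it}, \qquad u_{it} = x_{it}'v_i + \varepsilon_{it},\\
\mathrm{Cov}(u_{it}, u_{is}) &= x_{it}'\,\Sigma\,x_{is} \quad (t \neq s).
\end{aligned}
```

Under these assumptions, the composite error u_{it} is correlated across periods within a panel unit because v_i is constant over time. That is the error-structure information the random-coefficient model retains rather than sweeping it into a general error term.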
At present, Stata’s spatial procedures include a range of user-written
routines designed to deal with cross-sectional spatial data. The
recent release of a set of programs (including spmat, spivreg,
and spreg) written by Drukker, Prucha, and Raciborski provides
Stata’s users with the opportunity to fit a wide range of standard
spatial econometric models for cross-sectional data. Extending such
procedures to deal with panel data is nontrivial, in part because there are
important issues about how panels with incomplete data should be treated. The
casewise exclusion of missing data is automatic for cross-sectional data, but
omitting a whole panel unit because some of the data in the panel are missing
will typically lead to a very large reduction in the size of the working
dataset. For example, it is very rare for international datasets on
macroeconomic or other data to be complete, so casewise exclusion of missing
data will generate datasets that contain many fewer countries or time periods
than might otherwise be usable.
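
The cost of excluding incomplete panel units can be illustrated with a toy example. The sketch below uses an entirely hypothetical three-country panel (the names, years, and values are invented for illustration) and compares dropping individual incomplete observations with dropping every unit that has any missing value:

```python
# Hypothetical toy panel: (country, year) -> (gdp, investment).
# None marks a missing value. All names and numbers are invented.
panel = {
    ("A", 2001): (1.0, 0.2), ("A", 2002): (1.1, 0.3), ("A", 2003): (1.2, 0.3),
    ("B", 2001): (2.0, None), ("B", 2002): (2.1, 0.5), ("B", 2003): (2.2, 0.6),
    ("C", 2001): (0.9, 0.1), ("C", 2002): (None, 0.1), ("C", 2003): (1.0, 0.2),
}


def complete(row):
    """True when no variable in the row is missing."""
    return all(value is not None for value in row)


def drop_observations(data):
    """Observation-level exclusion: keep every complete unit-year row."""
    return {key: row for key, row in data.items() if complete(row)}


def drop_units(data):
    """Unit-level exclusion: keep only units that are complete in every period."""
    incomplete = {unit for (unit, _), row in data.items() if not complete(row)}
    return {key: row for key, row in data.items() if key[0] not in incomplete}


rows_kept = drop_observations(panel)
balanced = drop_units(panel)
print(len(panel), len(rows_kept), len(balanced))
```

Two missing values out of eighteen remove two of the three countries when a balanced panel is required, whereas observation-level exclusion loses only two of the nine rows. The resulting unbalanced panel is what the spatial procedures then have to accommodate.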
The theoretical literature on econometric models for the analysis of spatial
panels has flourished in the last decade with notable contributions from
LeSage and Pace, Elhorst, and Pfaffermayr, among others. In some cases,
authors have made available specific code for the implementation of the
techniques that they have developed. However, the programming language of
choice for such methods has been MATLAB, which is expensive and has a fairly
steep learning curve for nonusers. Many of the procedures assume that there
are no missing data. In addition, the procedures may not be able to handle
large datasets, because the model specifications can easily become
unmanageable if either N (the number of spatial units) or T
(the number of time periods) becomes large.
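
One way to see why size matters is to look at the stacked NT x NT system. The sketch below rests on assumptions not stated in the presentation: W is time-invariant, so the stacked weights matrix is block diagonal (I_T Kronecker W), and the likelihood involves the log-determinant of I - rho*W, as in standard spatial lag likelihoods. The helper names and the example W are hypothetical.

```python
import math


def dense_bytes(n_units, n_periods, bytes_per_entry=8):
    """Memory needed to store a dense NT x NT matrix of doubles."""
    nt = n_units * n_periods
    return nt * nt * bytes_per_entry


def log_abs_det(matrix):
    """log|det(matrix)| by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    total = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return total


def i_minus_rho_w(w, rho):
    """Form I - rho * W."""
    n = len(w)
    return [[(1.0 if i == j else 0.0) - rho * w[i][j] for j in range(n)]
            for i in range(n)]


def stack_over_time(w, n_periods):
    """Block-diagonal matrix with the same W repeated for each period."""
    n = len(w)
    big = [[0.0] * (n * n_periods) for _ in range(n * n_periods)]
    for t in range(n_periods):
        for i in range(n):
            for j in range(n):
                big[t * n + i][t * n + j] = w[i][j]
    return big


# Hypothetical row-normalized W: four units on a line, neighbors are adjacent.
w = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
rho, periods = 0.4, 3

small = log_abs_det(i_minus_rho_w(w, rho))
big = log_abs_det(i_minus_rho_w(stack_over_time(w, periods), rho))
print(abs(big - periods * small) < 1e-9)
print(dense_bytes(200, 50) / 1e6, "MB for a dense stacked matrix, N=200, T=50")
```

Under the time-invariance assumption, the stacked log-determinant equals T times the N x N log-determinant, so the full NT x NT matrix never needs to be formed. Without such structure, dense storage grows with the square of NT.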
In this presentation, I will cover a set of user-written maximum likelihood
procedures for fitting models with a variety of spatial structures, including
the spatial error model, the spatial Durbin model, the spatial
autocorrelation model, and certain combinations of these models (the
terminology is attributable to LeSage and Pace [2009]). A suite of MATLAB
programs to fit these models for both random and fixed effects has been
compiled by Elhorst (2010) and provides the basis for the implementation in
Stata/Mata. Methods of dealing with missing data, including the
implementation of an approach proposed by Pfaffermayr (2009), will be
discussed.
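
For orientation, the three basic specifications named above are commonly written as follows for period t. The notation is an assumption for illustration and is not taken from the presentation: y_t is the N x 1 vector of outcomes, X_t the regressor matrix, \mu a vector of fixed or random unit effects, and \rho, \lambda, \theta the spatial parameters.

```latex
\begin{aligned}
\text{Spatial autocorrelation (lag) model:}\quad
  & y_t = \rho W y_t + X_t\beta + \mu + \varepsilon_t,\\
\text{Spatial error model:}\quad
  & y_t = X_t\beta + \mu + u_t, \qquad u_t = \lambda W u_t + \varepsilon_t,\\
\text{Spatial Durbin model:}\quad
  & y_t = \rho W y_t + X_t\beta + W X_t\theta + \mu + \varepsilon_t.
\end{aligned}
```

In this notation, the spatial Durbin model reduces to the lag model when \theta = 0.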
A second aspect of spatial panel models that will be covered in the
presentation concerns the links between such models and random-coefficient
models that can be fit using procedures such as xtrc or the
user-written procedure xtmg. The classic formulation of
random-coefficient models assumes that the variance–covariance matrix of
panel errors is diagonal but heteroskedastic. This is an implausible
assumption for most cross-country datasets, so it is important to consider
how it may be relaxed, either by allowing for explicit spatial interactions
or by using a consistent estimator of the cross-country
variance–covariance matrix.
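
The mean-group idea can be sketched in a few lines: fit a separate regression for each panel unit, then average the slopes. The code below is an illustration under strong simplifying assumptions (one regressor, invented noise-free data, hypothetical function names) and is not the algorithm implemented by xtrc or xtmg. The standard error shown treats the unit-specific slopes as independent draws, which is the kind of cross-unit independence that the paragraph above questions.

```python
import math
from statistics import mean, stdev


def ols_slope(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def mean_group(data):
    """Unweighted average of unit-by-unit OLS slopes, with a simple standard error.

    The standard error is the sample standard deviation of the unit slopes
    divided by sqrt(N); it assumes the slopes are independent across units.
    """
    slopes = {unit: ols_slope(x, y) for unit, (x, y) in data.items()}
    values = list(slopes.values())
    estimate = mean(values)
    std_error = stdev(values) / math.sqrt(len(values))
    return estimate, std_error, slopes


# Hypothetical data: three units with true slopes 0.5, 1.0, and 1.5.
x = [1.0, 2.0, 3.0, 4.0]
toy_panel = {
    "A": (x, [1.0 + 0.5 * v for v in x]),
    "B": (x, [2.0 + 1.0 * v for v in x]),
    "C": (x, [0.0 + 1.5 * v for v in x]),
}

mg_estimate, mg_se, unit_slopes = mean_group(toy_panel)
print(mg_estimate, mg_se)
```

If the unit-specific slope estimates are correlated across countries, this standard error will generally be misleading. That is why the text considers either explicit spatial interactions or a consistent estimator of the cross-country variance–covariance matrix.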
The user-written procedures introduced in the presentation will be
illustrated by analyses of (a) state-level data on electricity consumption in the
U.S. and (b) country-level data on demand for infrastructure in developing and
developed countries.
References:
Elhorst, J. P. 2010. Spatial panel data models. In Handbook of Applied
Spatial Analysis, ed. M. M. Fischer and A. Getis, 377–407. Berlin:
Springer.
LeSage, J., and R. Pace. 2009. A sampling approach to estimate the log
determinant used in spatial likelihood problems. Journal of Geographical
Systems 11: 209–225.
Pfaffermayr, M. 2009. Maximum likelihood estimation of a general unbalanced
spatial random effects model: A Monte Carlo study. Spatial Economic
Analysis 4: 467–483.
Additional materials:
Hughes.pdf