Last updated: 9 June 2009
2009 Mexican Stata Users Group meeting
23 April 2009
Universidad Iberoamericana, Mexico City campus
Prolongación Paseo de la Reforma 880
Lomas de Santa Fe, México C.P. 01219
Distrito Federal, México
Proceedings
Decomposition of the Gini coefficient using Stata
Alejandro López Feldman
Economics Department, Universidad de Guanajuato
The Gini coefficient is widely used to measure inequality in the
distribution of income, consumption, and other welfare proxies. Decomposing
this measure can help you understand the determinants of inequality. In this
presentation, I will use income data from Mexico to illustrate a
user-written command,
descogini, that implements the Gini
decomposition proposed by Lerman and Yitzhaki (1985,
Review of Economics
and Statistics 67: 151–156). Using this command, the Gini
coefficient for total income can be decomposed into three terms: how important
the income source is with respect to total income; how equally or unequally
distributed the income source is; and how the income source and the
distribution of total income are correlated. In the presentation, I will
also illustrate how to obtain the impact that a marginal change in a
particular income source will have on total income inequality, as well as
how to obtain bootstrap standard errors.
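To give a flavor of the workflow, here is a minimal sketch, assuming descogini is already installed (locate it with findit descogini) and using hypothetical names for the dataset and variables; total income is listed first, followed by its sources. The saved result r(gini) used in the bootstrap call is an assumption, so check return list after running the command.

    * Minimal sketch; income_data, total_income, wages, and remittances
    * are hypothetical names.
    use income_data, clear
    descogini total_income wages remittances

    * Bootstrap standard errors; gini=r(gini) is an assumed saved result,
    * so run -return list- after descogini to confirm the actual names.
    bootstrap gini=r(gini), reps(200) seed(1234): ///
        descogini total_income wages remittances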
Additional information
mex09sug_alf.pdf
Stata in the measurement and analysis of poverty in Mexico
Héctor H. Sandoval
Rodrigo Aranda Balcazar
Martín Lima
Consejo Nacional de Evaluación de la Política de Desarrollo Social
Following the General Law of Social Development, the National Council for the
Evaluation of Social Development Policy (CONEVAL, by its Spanish acronym) is
responsible for establishing the criteria to define, identify, and measure
poverty in Mexico. To carry out this mandate, CONEVAL primarily uses
information from the censuses and surveys carried out by the National
Institute of Statistics (INEGI). Such data usually require the intensive use
of statistical software, and Stata has become the primary tool for our work
on poverty. Among the principal products that CONEVAL has presented using
Stata as a platform are 1) income poverty measures for 1992–2006, 2) an
estimation of the 2005 Social Gap Index, and 3) all the figures in the
Executive Report on Poverty, Mexico 2007. Stata's versatility also allowed us
to process all the census data to develop the 2000–2005 Income Poverty Maps.
The primary objective of this presentation is to exemplify how we have used
Stata to estimate and analyze poverty in CONEVAL’s publications.
Additional information
mex09sug_hs.pptx
A review of Stata SVAR modeling capabilities
Armando Sánchez Vargas
Institute for Economic Research, UNAM
In this presentation, I will discuss Stata’s capability to implement
the entire SVAR methodology with nonstationary series. In the presence of
cointegration, the structuralization of a VAR model takes place in two
distinct stages: the first is the identification of the long-run equilibrium
relationships, and the second stage is the identification of the short-run
interactions. I will briefly discuss this methodology and the facilities
available in Stata to carry it out, emphasizing what is still needed and
what might be refined.
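For reference, a minimal sketch of the two stages using official Stata commands; the variables (y1, y2, y3), lag lengths, rank, and identification matrices below are all hypothetical choices.

    * Stage 1: select the cointegration rank and identify the long-run
    * relations, normalizing the cointegrating vector on y1.
    vecrank y1 y2 y3, lags(4) trend(constant)
    constraint 1 [_ce1]y1 = 1
    vec y1 y2 y3, rank(1) lags(4) bconstraints(1)

    * Stage 2: identify the short-run interactions with structural
    * restrictions; "." marks a freely estimated parameter.
    matrix A = (1, 0, 0 \ ., 1, 0 \ ., ., 1)
    matrix B = (., 0, 0 \ 0, ., 0 \ 0, 0, .)
    svar D.y1 D.y2 D.y3, lags(1/3) aeq(A) beq(B)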
Additional information
mex09sug_asv.ppt
Multilevel modeling of ordinal responses
Sophia Rabe-Hesketh
University of California–Berkeley
Ordered categorical responses can be analyzed with different kinds of
logistic regression models, the most popular being the cumulative logit or
proportional odds model. Alternatively, ordinal probit models can be
specified. When the data have a nested structure, with repeated observations
nested within individuals (as in longitudinal or panel data) or students
nested within schools, these models can be extended by including random
effects. I will describe the models and show how they can be estimated using
gllamm. I will mention some elaborations of the models such as
nonproportional odds and heteroskedastic errors. Finally, I will discuss how
to obtain different types of predicted probabilities for these models to
assess model fit, to visualize the model graphically, and to make inferences
for individual units.
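To give a flavor of the estimation, a minimal sketch, assuming gllamm is installed from SSC; the variables attitude, x, and school are hypothetical.

    * Two-level cumulative logit (proportional odds) model with a random
    * intercept for school; all variable names are hypothetical.
    ssc install gllamm
    gllamm attitude x, i(school) link(ologit) family(binomial) adapt

    * Ordinal probit variant of the same model:
    gllamm attitude x, i(school) link(oprobit) family(binomial) adapt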
Additional information
mex09sug_srh.zip
Dealing with the cryptic survey: Processing labels
and value labels with Mata
Alfonso Miranda
Institute of Education, University of London
Survey data often come as a plain table containing cryptic variable names,
numbers, and letters. To make sense of the data, the researcher is given a
questionnaire or a codebook that contains a list of variable names, their
description, and an interpretation of the values (either a number or a
string) that each variable can take. Codebooks are commonly provided as
plain text or in PDF format. Hence, the researcher is left
“free” to type labels and value labels one by one. This often
leads to bad research habits, such as “cutting” and
“processing” the piece of the survey the researcher needs in the
short run and leaving the rest for future processing. Obviously, this is
boring and time-consuming, and it eventually leads to the creation of various
versions of the same survey, an inability to track important changes, and an
incapacity to reproduce research results—because the researcher cannot
recreate the analyzed dataset step by step from the original source. In this
talk, I will discuss how to recover the information that is contained in
questionnaires or codebooks and how to process this information in a clean,
fast, and efficient way with Mata.
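A minimal sketch of the idea, assuming a plain-text codebook named codebook.txt whose lines look like varname;value;label; the file name and layout are assumptions, so adapt the parsing to the codebook at hand.

    mata:
        lines = cat("codebook.txt")        // read the codebook, one line each
        for (i = 1; i <= rows(lines); i++) {
            p1   = strpos(lines[i], ";")
            var  = substr(lines[i], 1, p1 - 1)
            rest = substr(lines[i], p1 + 1, .)
            p2   = strpos(rest, ";")
            val  = strtoreal(substr(rest, 1, p2 - 1))
            lab  = substr(rest, p2 + 1, .)
            st_vlmodify(var, val, lab)                 // add the value label
            st_varvaluelabel(st_varindex(var), var)    // attach it to the variable
        }
    end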
Additional information
mex09sug_am.pdf
Some improved Stata ado-files for nonparametric smoothing procedures
Isaías H. Salgado Ugarte
FES Zaragoza, UNAM
In this talk, I introduce some improved programs for nonparametric smoothing
that originally were written in a very simple manner. These updated
ado-files are simple too, but they are more versatile and more
“Stata-like” than the original versions. The ado-files include,
for density traces,
boxdent (boxcar weight function) and
dentrace (boxcar and cosine weight functions); for choosing the
smoothing parameter in density-frequency estimation,
bandw (which
permits kernel specification with automatic bandwidth adjustment); for
direct and discretized variable bandwidth density estimation,
varwiker and
varwike2, respectively; for finding critical
bandwidth for a specified number of modes,
critiband; and for
nonparametric assessment of multimodality,
bootsamb (to use in
conjunction with the
boot command). In spite of its simplicity, this
collection of commands has proved to be very useful in the analysis of
biological (and other kinds of) data, saving the analyst considerable amounts
of time and effort.
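The updated commands themselves are not reproduced here; as a baseline illustration of why the choice of smoothing parameter matters (the problem that bandw and critiband address), official kdensity can be run with two bandwidths on a hypothetical variable.

    * Oversmoothing can hide modes; undersmoothing can create spurious
    * ones. The variable length and both bandwidths are hypothetical.
    kdensity length, bwidth(2)   name(smooth, replace)
    kdensity length, bwidth(0.5) name(rough, replace)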
Additional information
mex09sug_isu.pptx
Cointegrating VAR models and probability forecasting in Stata
Gustavo Sánchez Bizot
Senior Statistician, StataCorp
I discuss two applications of the
vec commands in this presentation.
First, I use the cointegrating VAR approach discussed in Garratt et al.
(2006,
Global and National Macroeconometric Modelling: A Long-Run
Structural Approach) to fit a vector error-correction model. In contrast
with the application of the traditional Johansen statistical restrictions
for the identification of the coefficients of the cointegrating vectors, I
use Stata to show an alternative specification of those restrictions based
on the approach by Garratt et al. Second, I apply probability forecasting to
simulate probability distributions for the forecasted periods. This approach
produces probabilities for future single and joint events, instead of only
producing point forecasts and confidence intervals. For example, we could
estimate the joint probability of double-digit inflation combined with a
decrease in GDP.
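For the forecasting side, a minimal sketch using official commands; the variables, rank, lags, and horizon are hypothetical, and the probability step is left as comments because it requires custom simulation.

    * Fit a small VECM and compute dynamic forecasts 8 periods ahead.
    vec inflation gdp_growth, rank(1) lags(2)
    fcast compute f_, step(8)

    * Probability forecasting: simulate many forecast paths (for example,
    * by redrawing residuals), then report the share of paths in which
    * inflation exceeds 10% while gdp_growth falls below 0 as the joint
    * probability of that event.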
Additional information
mex09sug_gs.ppt
Determinants and consequences of property tax collection in Mexico
Daniel Broid
Secretaría de Hacienda y Crédito Público (SHCP)
In this presentation, I will investigate the determinants of property tax
collection in Mexico. The tax is paid by all owners of land and dwellings in
Mexico for the right to hold their properties and is collected and
managed by municipal authorities at the local level. This type of tax has
attractive economic features such as efficiency, progressiveness, and good
capacity to finance local public goods. However, the amount of public funds
raised through this tax is extremely low. This
presentation will describe the main results of the study and show how Stata was
used to perform the analysis.
Additional information
mex09sug_db.pdf
Predicting counterfactual densities with the DFL
ado-file: A pertinent constructive critique
Luis Huesca
Economics Department, CIAD
It seems that the user-written
dfl command has a problem when using
micro-unit data without weighting, because its estimates of densities
integrate to more than one. This situation produces densities that need to
be corrected before a proper empirical analysis can be carried out. In this
presentation, I will suggest a way of rescaling the outcome variables by
applying weights to densities before the kernels are estimated using the
Jenkins and Van Kerm (2005,
Journal of Economic Inequality 3:
43–61) technique. I present an example of earnings in the Mexican
labor market by subgroup population shares and show that the probability
density function decomposition approach is more accurate once the estimated
densities no longer integrate to more than one.
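The weighting fix can be illustrated with official commands (earnings and wgt are hypothetical variable names): estimate the weighted density and confirm that it integrates to one.

    * Weighted kernel density; the numeric integral of the estimated
    * density over its support should be close to 1.
    kdensity earnings [aweight=wgt], generate(x fx) nograph
    integ fx x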
Additional information
mex09sug_lh.pptx
Analysis of micro data from ENIGH using Stata
Juan Francisco Islas Aguirre
Economics Division, CIDE
Surveys such as the Encuesta Nacional de Ingresos y Gastos de los Hogares
(ENIGH, or the National Survey of Household Income and Expenditure) offer
many opportunities for the design, estimation, and testing of applied models
in social science. The ENIGH is also a valuable source of case studies that
can be used as real-life examples for teaching and learning. In this
presentation, I discuss a series of exercises from the ENIGH that are used
for teaching statistics and econometrics with Stata. I emphasize how to use
the facilities of Stata as a learning tool.
Additional information
mex09sug_jfi.pdf
Reproducible research: Weaving with Stata and StatWeave
Bill Rising
Director of Educational Services, StataCorp
Reproducible research is one of many names for the same concept: writing
a single document that contains both the report text and the commands, in a
statistical or programming language, needed to produce the results and
graphics in the report. It is called reproducible research because
any interested researcher can then reproduce another’s entire report
verbatim. (Programmers call this same concept literate programming.) The
utility of reproducible research documents extends far beyond research or
programming. They allow rapid updates should there be additional data. They
can also be used in teaching for generating differing examples or test
questions, because different parameters will generate different examples. In
this presentation, I will show you how to use a third-party application to
embed Stata code, as well as its output, in either LaTeX or OpenOffice
documents. I will also use example documents (including the talk itself) to
show how you can update a report, its results, and its graphics by using new
data or changing parameters.
Additional information
mex09sug_br.zip
Scientific organizers
Alfonso Miranda, University of London
Isaías H. Salgado Ugarte, UNAM
Logistics organizers
MultiON Consulting, the official distributor
of Stata in Mexico.