Last updated: 24 May 2010
2010 Mexican Stata Users Group meeting
29 April 2010
Universidad Iberoamericana, Mexico City campus
Prolongación Paseo de la Reforma 880
Lomas de Santa Fe, México C.P. 01219
Distrito Federal, México
Estimation of treatment effects for social program evaluation
Omar Stabridis
Janet Zamudio
Mario Paulín
Consejo Nacional de Evaluación de la Política de Desarrollo Social
Because impact evaluation is an important tool that guides public policy
decisions and because applying impact evaluation is a rigorous process, we
must generate examples of how impact evaluation methodologies apply to the
Mexican context. To this end, we have used nonexperimental methodologies to
estimate treatment effects for Mexican social programs.
In order to quantify the effects that a social program has on its
beneficiaries’ welfare and productive activities, we have used Stata to
estimate treatment effects and to generate an adequate database with
information from the Mexican Family Life Survey for the years 2002 and 2005.
Two central Stata commands were used: pscore and psmatch2. The pscore
command estimates the propensity score and stratifies individuals according
to the propensity-score distribution, using a set of covariates
that are assumed to be related to both treatment status and the outcome
variable. This command also checks that the balancing property is satisfied.
psmatch2 implements a variety of matching methods to
obtain estimates of the average treatment effect on the treated.
Additionally, we used data-management commands such as foreach,
merge, collapse, gen, egen, recode, and replace.
Our panel will discuss the advantages and disadvantages of these
commands when applied to the evaluation of social programs.
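As a rough illustration of the matching step described above, the following Python sketch computes the average treatment effect on the treated by one-to-one nearest-neighbor matching with replacement. It is not the psmatch2 implementation: it assumes the propensity scores have already been estimated, and the function name att_nearest_neighbor and the data are hypothetical.

```python
def att_nearest_neighbor(units):
    """ATT via 1-nearest-neighbor matching (with replacement) on a
    precomputed propensity score.

    units: list of (treated, pscore, outcome) tuples, treated in {0, 1}.
    """
    treated = [(p, y) for t, p, y in units if t == 1]
    controls = [(p, y) for t, p, y in units if t == 0]
    if not treated or not controls:
        raise ValueError("need at least one treated and one control unit")

    gaps = []
    for p_t, y_t in treated:
        # Closest control in propensity-score distance; ties go to the first.
        _, y_c = min(controls, key=lambda c: abs(c[0] - p_t))
        gaps.append(y_t - y_c)
    return sum(gaps) / len(gaps)


# Hypothetical data: (treated, propensity score, outcome).
sample = [
    (1, 0.80, 12.0),
    (1, 0.60, 10.0),
    (0, 0.75, 9.0),
    (0, 0.55, 8.0),
    (0, 0.20, 5.0),
]
att = att_nearest_neighbor(sample)
print(att)  # 2.5
```

The control with score 0.20 is never used as a match, which is the kind of observation a common-support restriction would typically exclude.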
Stata as a tool for transparency and statistics dissemination:
Measuring multidimensional poverty in Mexico
Víctor H. Pérez
Dulce Cano
Rocío Espinosa
Consejo Nacional de Evaluación de la Política de Desarrollo Social
In 2009, CONEVAL (Consejo Nacional de Evaluación de la
Política de Desarrollo Social) presented the official
methodology to measure multidimensional poverty in Mexico, which is a set of
intuitive indicators that measure income and social rights deprivation,
taking into account the territorial context. To follow the
principles of technical rigor, transparency, and impartiality, CONEVAL
decided to publish all the necessary elements to reproduce its
multidimensional poverty measures, including (a) the adopted methodology, (b)
the databases, and (c) the Stata and SPSS programs used to generate the indexes. In
this presentation, we will show how Stata and SPSS were used to produce the
Mexican multidimensional poverty measures as well as the process used to
“equalize” both programs.
Additional information
mex10sug_perez.pdf
Measuring poverty at state level using Stata
Carlos Guerrero de Lizardi
Manuel Lara Caballero
Instituto Tecnológico de Estudios Superiores de Monterrey
Using the approach proposed in 2002 by the Technical Committee for the
Measurement of Poverty (TCMP), CONEVAL produced a set of state-level poverty
measures. The methodology consists of comparing the cost of a food basket that
contains the minimum consumption requirements with a household’s average
income. Data from the National Survey of Household Income and Expenditure
(ENIGH) are used in the calculations. Currently, state poverty measures are
calculated using the National Consumer Price Index (NCPI) published by
BANXICO as the single deflator. Hence, a major pending issue is correcting for
regional differences in the cost of living. In this talk, we will describe a set of
do-files that implement such a correction and will underline the main
methodological and policy implications behind the correction.
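To make the role of the deflator concrete, here is a minimal Python sketch of a poverty headcount in which household income is deflated by a region-specific cost-of-living index before being compared with the poverty line. It is not the authors' do-files: the use of per capita income, the index values, the poverty line, and all names are assumptions chosen for illustration.

```python
def headcount_ratio(households, poverty_line, price_index):
    """Share of people living in households whose deflated per capita
    income falls below the poverty line.

    households: list of dicts with keys 'region', 'income', 'size'.
    price_index: dict mapping region -> cost-of-living index (base = 1.0).
    """
    poor = 0
    total = 0
    for hh in households:
        real_income = hh["income"] / price_index[hh["region"]]
        per_capita = real_income / hh["size"]
        total += hh["size"]
        if per_capita < poverty_line:
            poor += hh["size"]
    return poor / total


# Hypothetical households and price indexes.
households = [
    {"region": "north", "income": 5000.0, "size": 5},
    {"region": "south", "income": 2700.0, "size": 3},
    {"region": "north", "income": 6000.0, "size": 2},
]
single_deflator = {"north": 1.00, "south": 1.00}
regional_deflator = {"north": 1.10, "south": 0.90}

national = headcount_ratio(households, 950.0, single_deflator)
regional = headcount_ratio(households, 950.0, regional_deflator)
print(national, regional)  # 0.3 0.5
```

In this made-up example the regional correction changes both the headcount and which households are classified as poor.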
Additional information
mex10sug_lizardi.pdf
Hierarchical linear models using Stata
Delfino Vargas Chanes
Colegio de México
Maria Merino
ITAM
Some surveys collect data on individuals who are nested within hierarchical
organizations or countries. These data are useful, for instance, for ranking
countries according to a major outcome adjusted for covariates. Reporting
only raw means produces biased rankings, so it is necessary to
incorporate covariates and acknowledge the hierarchical structure of the
data. From the perspective of ordinary regression, such structuring
constitutes a statistical problem because it violates the assumption that
observations are independent and identically distributed. In such a context, a hierarchical, or multilevel,
linear model can be fit so that the hierarchical nature of the data is
explicitly modeled. In this presentation, we will briefly discuss the strengths and
limitations of hierarchical models for ranking countries.
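One way to see why rankings based on raw means can mislead is the shrinkage implied by a random-intercept model. The Python sketch below assumes the simplest such model, y_ij = mu + u_j + e_ij, with no covariates, with the variance components tau2 and sigma2 treated as known, and with the pooled mean standing in for mu. These are simplifying assumptions, and the function name and data are hypothetical.

```python
def shrunken_means(groups, tau2, sigma2):
    """Empirical-Bayes style group means for a random-intercept model.

    groups: dict mapping group name -> list of observed outcomes.
    tau2:   assumed between-group variance, Var(u_j).
    sigma2: assumed within-group variance, Var(e_ij).
    """
    n_total = sum(len(ys) for ys in groups.values())
    grand_mean = sum(sum(ys) for ys in groups.values()) / n_total
    result = {}
    for name, ys in groups.items():
        n_j = len(ys)
        raw_mean = sum(ys) / n_j
        # Reliability weight: small groups are pulled harder toward the mean.
        weight = tau2 / (tau2 + sigma2 / n_j)
        result[name] = grand_mean + weight * (raw_mean - grand_mean)
    return result


# Hypothetical outcomes for three countries of very different sample sizes.
data = {
    "A": [12.0],
    "B": [11.0, 11.0, 11.0, 11.0],
    "C": [7.0, 7.0, 7.0, 7.0, 7.0],
}
estimates = shrunken_means(data, tau2=1.0, sigma2=4.0)
print(estimates)
```

Country A has the highest raw mean but only one observation, so its estimate is shrunk below that of country B and the ranking of the two reverses.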
Additional information
mex10sug_merino.ppt
Generating descriptive statistics from the MXFLS
Alicia Santana Cartas
Universidad Iberoamericana
In this presentation, I aim to show how to produce informative
descriptive statistics from a longitudinal survey using the Mexican Family
Life Survey (MXFLS) as an example. I will introduce the audience to the
MXFLS and discuss its main innovative features, such as the sample design,
the attitudes-toward-risk module, and the migration module (including the
tracking of respondents and the recontact rate). Then I will show how to tabulate the
data in an informative way and how to produce descriptive statistics using
the provided survey weights.
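The effect of survey weights on descriptive statistics can be sketched in a few lines of Python. This is a generic illustration, not MXFLS code: the variables, weights, and function names are hypothetical, and only point estimates are computed, with no design-based standard errors.

```python
from collections import defaultdict


def weighted_mean(values, weights):
    """Mean of values, each observation counted weights[i] times."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_tabulate(categories, weights):
    """Weighted relative frequency of each category."""
    totals = defaultdict(float)
    for c, w in zip(categories, weights):
        totals[c] += w
    grand_total = sum(totals.values())
    return {c: t / grand_total for c, t in totals.items()}


# Hypothetical respondents with expansion weights.
ages = [20.0, 40.0, 60.0, 40.0]
areas = ["urban", "rural", "urban", "rural"]
weights = [100.0, 300.0, 200.0, 400.0]

mean_age = weighted_mean(ages, weights)
area_shares = weighted_tabulate(areas, weights)
print(mean_age, area_shares)
```

Here the unweighted urban share is one half, while the weighted share is 0.3, because rural respondents represent more people in the population.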
Additional information
mex10sug_santana.pdf
Keynote lecture: Estimation of count-data panel models
Pravin K. Trivedi
Indiana University
In this talk, I will cover a number of topics related to the estimation of panel
models for count data, with empirical illustrations estimated using
Stata. For the theoretical background, I will rely on my book with Colin
Cameron, Microeconometrics: Methods and Applications (2005, Cambridge
University Press). Some of my illustrations will be based on material in my
recent book with Colin Cameron, Microeconometrics Using Stata (2009, Stata
Press), but several others will be based on as-yet-unpublished material. The
talk will be operational in orientation. I plan to cover the following topics:
- nonlinear panel-data modeling for exponential mean models
- fixed- and random-effects panel models for Poisson and negative binomial regression
- nonlinear GMM estimation of Poisson panel regression with sample selection or endogenous regressors
- dynamic panel Poisson regression with correlated random effects
- dynamic panel Poisson regression with linear feedback
- finite mixture models for panel Poisson regression
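For reference, the first two topics can be summarized by the standard textbook form of the exponential mean model with a multiplicative individual effect and by the conditional log likelihood of the fixed-effects Poisson estimator. The notation below (y_it, x_it, alpha_i, beta, N, T) is assumed here for illustration and is not taken from the talk.

```latex
\begin{align*}
\mathrm{E}[y_{it} \mid x_{it}, \alpha_i] &= \alpha_i \exp(x_{it}'\beta), \\
p_{it}(\beta) &= \frac{\exp(x_{it}'\beta)}{\sum_{s=1}^{T} \exp(x_{is}'\beta)}, \\
\ln L_c(\beta) &= \sum_{i=1}^{N} \left\{ \ln\Bigl[\Bigl(\sum_{t=1}^{T} y_{it}\Bigr)!\Bigr]
  - \sum_{t=1}^{T} \ln(y_{it}!) + \sum_{t=1}^{T} y_{it} \ln p_{it}(\beta) \right\}.
\end{align*}
```

Conditioning on the individual total of y_it removes alpha_i from the likelihood, which is why beta can be estimated without estimating the individual effects.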
Additional information
mex10sug_trivedi.pdf
Bivariate dynamic probit models for panel data
Alfonso Miranda
Institute of Education, University of London
In this talk, I will discuss the main methodological features of the
bivariate dynamic probit model for panel data. I will present an example using simulated
data, giving special emphasis to the initial conditions
problem in dynamic models and the difference between true and spurious
state dependence. The model is fit by maximum simulated likelihood.
Additional information
mex10sug_miranda.pdf
Selection-bias correction based on the multinomial logit: An
application to the Mexican labor market
Luis Huesca
Mario Camberos
Economics Department,
Centro de Investigación en Alimentación y Desarrollo
In this presentation, we illustrate an application of a relatively new
selection-bias correction methodology based on the multinomial logit model using
the selmlog Stata command (Bourguignon, Fournier, and Gurgand, 2007,
Journal of Economic Surveys 21: 174–205). selmlog yields both
consistent and efficient estimates of the selection process and a fairly
good correction for the outcome equation, even when the independence of
irrelevant alternatives (IIA) assumption does not hold. The
exercise depicts the current pattern of occupational choices of
individuals in the Mexican labor market using a longitudinal panel of
microdata from the Encuesta Nacional de Ocupación y Empleo (ENOE)
covering February 2008 to March 2009. We estimate an equation over an endogenously
selected population. The command simplifies the handling of both distributional
and IIA assumptions for parametric models.
Additional information
mex10sug_huesca.pdf
mex10sug_huesca.ppt
Generalized method of moments estimators in Stata
David Drukker
StataCorp LP
Stata 11 has the new command gmm for estimating parameters by
generalized method of moments (GMM). gmm can estimate the parameters
of linear and nonlinear models for cross-sectional, panel, and time-series
data. In this presentation, I provide an introduction to GMM and to the
gmm command.
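The idea behind GMM can be illustrated with a small Python sketch for a linear model y = x*beta + u with one regressor and several instruments. It minimizes the sum of squared sample moments, that is, it uses an identity weight matrix. This is a one-step simplification chosen for brevity and is not a description of the gmm command; an efficient estimator would replace the identity matrix with an estimated optimal weight matrix. All names and data are hypothetical.

```python
def sample_moments(beta, y, x, instruments):
    """Sample analogs of the moment conditions E[z_j * (y - x*beta)] = 0."""
    n = len(y)
    return [
        sum(z_i * (y_i - x_i * beta) for z_i, y_i, x_i in zip(z, y, x)) / n
        for z in instruments
    ]


def gmm_objective(beta, y, x, instruments):
    """GMM criterion g(beta)' W g(beta) with W set to the identity matrix."""
    return sum(g * g for g in sample_moments(beta, y, x, instruments))


def gmm_linear_identity(y, x, instruments):
    """Closed-form minimizer of the criterion for a scalar beta."""
    n = len(y)
    a = [sum(z_i * x_i for z_i, x_i in zip(z, x)) / n for z in instruments]
    b = [sum(z_i * y_i for z_i, y_i in zip(z, y)) / n for z in instruments]
    denom = sum(a_j * a_j for a_j in a)
    if denom == 0:
        raise ValueError("instruments are uncorrelated with the regressor")
    return sum(a_j * b_j for a_j, b_j in zip(a, b)) / denom


# Hypothetical data: one regressor, two instruments (overidentified).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
instruments = [[1.0, 1.0, 2.0, 2.0], [0.0, 1.0, 0.0, 1.0]]

beta_hat = gmm_linear_identity(y, x, instruments)
print(beta_hat, gmm_objective(beta_hat, y, x, instruments))
```

With more moments than parameters, the criterion is generally not zero at the minimum, which is the basis of overidentification tests.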
Additional information
mex10sug_drukker.pdf
Using Stata to analyze size frequency of the life cycle of a Mexican desert spider
Irma Gisela Nieto-Castañeda
María Luisa Jiménez-Jiménez
Isaías H. Salgado-Ugarte
Centro de Investigaciones Biológicas del Noroeste, S.C. and FES Zaragoza, UNAM
In biology, the study of the life cycle of plants and animals helps one to
understand the phenology of a particular species, which is useful in pest
management or in biological conservation. Spiders are among the most
widespread animals on earth. They eat a huge variety of other animals
and are good indicators of environmental changes. We studied, for the first
time, the life cycle of an endemic desert spider (Syspira tigrina). Many
spider researchers have used the direct estimation of the number of instars
to describe the arachnid life cycle. Other methods are based on the analysis
of length-frequency distributions over time (indirect methods). Length-frequency
distributions are commonly analyzed with histograms. However, histograms
depend on the grid origin, are discontinuous, and use a
fixed interval width. These problems have motivated the interest of
statisticians in alternative, more computationally intensive
methods. Kernel density estimators (KDEs) do not depend on the origin
position and are continuous distribution estimators. In addition, there are several
methods for choosing the interval width. In this study, we present the use
of KDEs in Stata to examine length-frequency distributions of
spider size in combination with the traditional approach using histograms.
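A kernel density estimate of a length-frequency distribution can be sketched in Python as follows. The sketch assumes a Gaussian kernel and Silverman's rule of thumb as one of the several possible interval-width (bandwidth) choices. It is not the authors' Stata code, and the body lengths are made up.

```python
import math
import statistics


def silverman_bandwidth(data):
    """Silverman's rule of thumb: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    return 0.9 * min(sd, (q3 - q1) / 1.34) * n ** (-0.2)


def gaussian_kde(data, bandwidth):
    """Return a function that evaluates the Gaussian KDE at a point x."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


# Hypothetical body lengths (mm) from two size classes.
lengths = [2.1, 2.3, 2.2, 2.4, 2.0, 4.1, 4.3, 3.9, 4.0, 4.2]
bandwidth = silverman_bandwidth(lengths)
density = gaussian_kde(lengths, bandwidth)

for point in (2.2, 3.15, 4.1):
    print(point, round(density(point), 3))
```

The estimate is a smooth curve with no grid origin to choose, and modes in it can be read as candidate size classes. The rule of thumb tends to oversmooth multimodal data, so smaller bandwidths are usually worth examining as well.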
Additional information
mex10sug_castaneda.pdf
ML modeling capabilities: Stata vs Gauss
Armando Sánchez Vargas
Institute for Economic Research, UNAM
The main purpose of this work is to discuss Stata’s capability
to implement customized likelihood functions compared with Gauss’s. I
compare these two high-level programming languages with built-in function
libraries and graphic routines. Overall, Stata’s features seem best
suited for analyzing specific models of decision-making processes and other
microeconometric applications, while Gauss is ideal for analyzing a
broader range of statistical issues based on maximum likelihood estimation.
I briefly discuss such modeling capabilities, emphasizing what is
still needed and what might be refined.
Additional information
mex10sug_vargas.pdf
Analyzing data from complex survey designs
Isabel Cañette
StataCorp LP
This presentation is a tutorial on how to analyze complex survey data in
Stata. I will start by reviewing the sampling methods most frequently used
for survey data and examining why special treatment is needed to
perform estimation using these data. I will discuss the concepts of
stratification, clustering, sampling weights, and finite population
correction, and illustrate how to account for them by using the
svyset command. Once the survey design has been declared with svyset,
estimations can be performed by simply adding the svy prefix to a
Stata command; I will show some examples.
I will also discuss the variance estimators implemented in Stata for survey
data: linearized, jackknife, and balanced repeated replication.
Finally, I will explain how Stata deals with subpopulation estimation and the
use of poststratification.
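To show how stratification and the finite population correction enter an estimate, the Python sketch below computes a stratified mean and its standard error. It covers only the simplest design, simple random sampling without replacement within each stratum and no clustering, so it is far narrower than what svyset and the svy prefix handle. The function name and data are hypothetical.

```python
import math
import statistics


def stratified_mean(strata):
    """Stratified estimate of a population mean and its standard error.

    strata: list of dicts with 'N' (stratum population size) and
            'sample' (observations drawn by SRS without replacement).
    """
    n_population = sum(s["N"] for s in strata)
    estimate = 0.0
    variance = 0.0
    for s in strata:
        n_h = len(s["sample"])
        share = s["N"] / n_population   # stratum weight W_h
        fpc = 1.0 - n_h / s["N"]        # finite population correction
        estimate += share * statistics.mean(s["sample"])
        variance += share ** 2 * fpc * statistics.variance(s["sample"]) / n_h
    return estimate, math.sqrt(variance)


# Hypothetical design: two strata with equal sample sizes.
design = [
    {"N": 100, "sample": [10.0, 12.0, 14.0, 16.0]},
    {"N": 300, "sample": [20.0, 22.0, 24.0, 26.0]},
]
mean_hat, se_hat = stratified_mean(design)
print(mean_hat, se_hat)
```

The unweighted mean of the eight observations is 18, while the stratified estimate is 20.5, because the second stratum represents three quarters of the population.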
Additional information
mex10sug_canette.pdf
Scientific organizers
Alfonso Miranda (chair), University of London
Landy Sanchez Peña, Colegio de México
Logistics organizers
MultiON Consulting, the official distributor
of Stata in Mexico.