The first French Stata Users Group meeting was held on Thursday, 6 July 2017. You can view the program and presentation slides below.
Proceedings
9:30–10:00 |
Abstract:
SDMX, which stands for Statistical Data and Metadata eXchange,
is an ISO standard developed by seven international organizations
(BIS, ECB, Eurostat, IMF, OECD, the United Nations, and the World
Bank) to facilitate the exchange of statistical data. The package
sdmxuse allows Stata users to
download and import SDMX data directly within their favorite software.
The program builds and sends a query to the statistical agency
(using RESTful web services), then imports and formats the downloaded
dataset in XML format. The complex structure of the datasets
(so-called cube) is reviewed to show how users can send specific queries
and import only the required time series. sdmxuse might prove useful
for researchers who need frequently updated time series and wish to
automate the downloading and formatting process.
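As a rough illustration of the query-building step described in this abstract, the following Python sketch assembles a data URL following the general SDMX REST pattern (data/flow/key, plus optional startPeriod and endPeriod parameters). It is not the sdmxuse implementation: the base URL, the dataflow identifier, and the dimension values are hypothetical placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_sdmx_url(base_url, dataflow, key_parts, start=None, end=None):
    """Assemble an SDMX-style REST data query (sketch; no request is sent).

    key_parts holds one entry per cube dimension; an empty string leaves
    that dimension unrestricted, so only the required series are requested.
    """
    key = ".".join(key_parts)
    url = f"{base_url.rstrip('/')}/data/{dataflow}/{key}"
    params = {}
    if start is not None:
        params["startPeriod"] = start
    if end is not None:
        params["endPeriod"] = end
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical provider, dataflow, and dimension values, for illustration only.
query = build_sdmx_url(
    "https://example.org/sdmx", "EXR", ["M", "USD", "EUR", "", "A"], start="2015"
)
print(query)
```

Restricting individual dimensions in the key is what lets a user pull a handful of series out of a large cube instead of downloading the whole dataset.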
Additional information: France17_Fontenay.pdf Download sdmxuse from SSC
Sébastien Fontenay
Université Catholique de Louvain
|
10:00–10:30 |
Abstract:
The difference-in-differences estimator measures the effect of a
treatment or policy intervention by comparing change over time of
the outcome variable across treatment groups. To interpret the
estimate as a causal effect, this strategy requires that, in the
absence of the treatment, the outcome variable followed the same
trend in treated and untreated groups. This assumption may be
implausible if selection for treatment is correlated with
characteristics that affect the dynamic of the outcome variable.
In this presentation, I describe the command absdid, which implements
the semiparametric difference-in-differences (SDID) estimator of
Abadie (2005, Review of Economic Studies 72: 1-19). The SDID
is a reweighting technique that addresses the imbalance of
characteristics between treated and untreated groups. Hence, it
makes the parallel trend assumption more credible. In addition,
the SDID estimator allows the use of covariates to describe how the
average effect of the treatment varies for different groups of the
treated population.
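The reweighting idea can be sketched in a few lines of Python. This is a minimal illustration of the weighting formula in Abadie (2005), not the absdid code: it assumes the propensity scores have already been estimated, and the function name and the toy data are hypothetical.

```python
def sdid_att(delta_y, treated, pscore):
    """Reweighted difference-in-differences estimate of the effect on the treated.

    delta_y : change in the outcome between the two periods, per unit
    treated : 1 for treated units, 0 for untreated units
    pscore  : estimated probability of treatment given covariates
              (assumed to be strictly below 1 for every unit)
    """
    n = len(delta_y)
    share_treated = sum(treated) / n
    total = 0.0
    for dy, d, ps in zip(delta_y, treated, pscore):
        # Untreated units that resemble treated units receive larger weights.
        total += dy / share_treated * (d - ps) / (1.0 - ps)
    return total / n


# Toy data (hypothetical): with a constant propensity score the estimate
# coincides with the simple difference in mean changes between groups.
changes = [3.0, 5.0, 1.0, 2.0]
groups = [1, 1, 0, 0]
att = sdid_att(changes, groups, [0.5, 0.5, 0.5, 0.5])
print(att)
```

When the propensity score varies with covariates, the weights shift toward untreated units that look like treated ones, which is how the imbalance between groups is addressed.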
Additional information: France17_Houngbedji.pdf Download difference-in-differences estimator from SSC
Kenneth Houngbedji
Paris School of Economics
|
11:00–11:30 |
Abstract:
Subjective measurement scales consist of questionnaires aiming at
measuring non-observable respondent characteristics, such as quality
of life, pain, or intelligence. The questionnaires can be unidimensional
(they measure one concept) or multidimensional (they measure several
concepts), so they can lead to one or several scores supposedly measuring
the concepts of interest. In classical test theory (CTT), the scores
are a combination (sum or mean) of responses to one or several
items. To be useful, a questionnaire must provide psychometric
properties showing that the instrument correctly measures what
it intends to measure. The two main properties we want to assess are
validity and reliability. Validity and reliability are assessed by checking
their respective facets: content validity, construct validity, and criterion
validity for validity; internal consistency, test-retest reliability, and
scalability for reliability. Most of these properties can be assessed
using statistical analyses (factor analysis, intraclass correlation
coefficients, etc.). However, there is currently no statistical software
package to easily perform all of these tests. We developed validscale,
a Stata module that performs the recommended analyses to validate a
subjective measurement scale using CTT. A dialog box was also developed to
use the module in a user-friendly manner.
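As one concrete example of the reliability checks mentioned in this abstract, internal consistency is often summarized with Cronbach's alpha. The Python sketch below computes it from a respondents-by-items table. That validscale reports this particular statistic is an assumption here, and the data are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Requires at least two items and a nonzero variance of the total score.
    """
    k = len(responses[0])
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical answers of four respondents to a three-item scale.
answers = [
    [2.0, 3.0, 3.0],
    [4.0, 4.0, 5.0],
    [1.0, 2.0, 2.0],
    [5.0, 5.0, 4.0],
]
print(cronbach_alpha(answers))
```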
Additional information: France17_Perrot.pdf
Bastien Perrot
Université de Nantes, Université de Tours, INSERM, SPHERE U1246
|
11:30–12:15 |
Abstract:
Bayesian analysis has become a popular tool for many statistical applications.
Yet many statisticians have little training in the theory of Bayesian analysis
and software used to fit Bayesian models. This talk will provide an intuitive
introduction to the concepts of Bayesian analysis and demonstrate how to fit
Bayesian models using Stata. No prior knowledge of Bayesian analysis is
necessary, and specific topics will include the relationship between likelihood
functions, prior and posterior distributions, Markov chain Monte Carlo (MCMC)
using the Metropolis–Hastings algorithm, and how to use Stata's graphical user
interface and command syntax to fit Bayesian models.
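To make the Metropolis–Hastings idea concrete, here is a minimal random-walk sampler in Python for the mean of normally distributed data. It is a teaching sketch, not what Stata runs internally: the known error standard deviation, the normal prior, the step size, and the data are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0, prior_sd=10.0):
    """Log posterior (up to a constant): normal likelihood plus normal prior."""
    log_likelihood = sum(-0.5 * ((y - mu) / sigma) ** 2 for y in data)
    log_prior = -0.5 * (mu / prior_sd) ** 2
    return log_likelihood + log_prior


def metropolis_hastings(data, n_draws=20000, step_sd=0.5, seed=12345):
    """Random-walk Metropolis-Hastings draws for the mean mu."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    draws = []
    for _ in range(n_draws):
        proposal = mu + rng.gauss(0.0, step_sd)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio); the proposal is
        # symmetric, so the proposal densities cancel.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        draws.append(mu)
    return draws


observations = [1.2, 0.8, 1.5, 0.9, 1.1]  # hypothetical data
chain = metropolis_hastings(observations)
kept = chain[2000:]  # discard burn-in draws
posterior_mean = sum(kept) / len(kept)
print(posterior_mean)
```

In this conjugate setting the posterior mean is also available in closed form, which makes the example a convenient check that the sampler behaves as expected.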
Additional information: France17_Huber.pptx
Chuck Huber
StataCorp
|
2:00–2:30 |
Abstract:
These surveys have complex sampling designs and use multiply imputed
variables. These two characteristics need to be taken into account
to obtain correct standard errors, but they are often forgotten
by users because of the complexity of this design. repest was
designed to incorporate them easily into any e-class Stata command.
repest also includes a set of tools to facilitate the exploitation of
international surveys. If you want to have a proper look at this work,
the ado-file is available on the IDEAS website, along with a detailed
help file. Please note that repest is also compatible with other surveys
such as TIMSS or IALS.
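The way multiply imputed variables feed into standard errors can be sketched with Rubin's combining rules. The Python example below is an illustration under stated assumptions, not the repest code: it takes one estimate and one sampling variance per imputation as given (for instance, variances already obtained from replicate weights), and the numbers are invented.

```python
from statistics import mean, variance


def combine_imputations(estimates, sampling_variances):
    """Combine per-imputation results with Rubin's rules.

    Returns the pooled point estimate and its standard error.
    """
    m = len(estimates)
    point = mean(estimates)
    within = mean(sampling_variances)  # average sampling variance
    between = variance(estimates)      # variance across imputations
    total_variance = within + (1.0 + 1.0 / m) * between
    return point, total_variance ** 0.5


# Hypothetical results from three imputations of the same statistic.
pooled, pooled_se = combine_imputations([1.0, 2.0, 3.0], [0.1, 0.1, 0.1])
print(pooled, pooled_se)
```

Ignoring the between-imputation term would understate the standard error, which is the kind of omission the abstract warns about.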
Additional information: France17_Keslair.pptx Download the repest ado-file from SSC
François Keslair
OECD
|
2:30–3:00 |
Abstract:
Markov chain and mixture models have been widely applied in various
strands of the academic literature. Several studies have combined
both modeling approaches to account for unobserved heterogeneity
within a population when analyzing dynamic processes. For instance,
a restricted form of this combined approach, the so-called
mover-stayer model (MSM), has been used to investigate agents' mobility
in sociology, economics, or medical sciences. This paper describes mixmcm,
a user-written Stata command that allows estimating the general class of
mixed Markov chain models (MMCM). To account for the possibility of
incomplete information within the data, the model is estimated with
maximum likelihood (ML) using the expectation-maximization (EM)
algorithm. The proposed command enables users to estimate the MMCM
parametrically, semiparametrically, or nonparametrically, depending
on the chosen specifications for the transition probabilities and the
mixing distribution. The MSM is obtained from this general setting by
imposing relevant restrictions on the transition probability matrices.
Dealing with the general model, mixmcm also enables one to endogenously
identify the optimal number of homogeneous chains. A postestimation
command is also provided for further inspection and analysis of results.
The usefulness of the proposed command is illustrated with an application
in the field of agricultural economics to analyze farm-size dynamics.
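The mover-stayer restriction can be illustrated with a short Python sketch of one sequence's likelihood contribution, conditional on its initial state. This is not the mixmcm implementation: the function name, the two-state setup, and the parameter values are hypothetical, and the sketch simply evaluates the mixture rather than estimating it with the EM algorithm.

```python
def mover_stayer_likelihood(states, stayer_share, mover_matrix):
    """Likelihood of one observed state sequence under a mover-stayer model.

    Stayers never leave their initial state (identity transition matrix);
    movers follow the transition probabilities in mover_matrix.
    """
    never_moved = all(s == states[0] for s in states)
    stayer_prob = 1.0 if never_moved else 0.0
    mover_prob = 1.0
    for origin, destination in zip(states, states[1:]):
        mover_prob *= mover_matrix[origin][destination]
    return stayer_share * stayer_prob + (1.0 - stayer_share) * mover_prob


# Hypothetical two-state example (for instance, small and large farms).
movers = [[0.5, 0.5], [0.5, 0.5]]
stays_put = mover_stayer_likelihood([0, 0, 0], 0.4, movers)
switches = mover_stayer_likelihood([0, 1, 0], 0.4, movers)
print(stays_put, switches)
```

A sequence that never changes state could come from either type, so both mixture components contribute; a sequence with any transition can only come from a mover.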
Additional information: France17_Saint-Cyr.pdf
Legrand Saint-Cyr
SMART, Agrocampus Ouest, INRA
|
3:00–3:30 |
Abstract:
Stata has a set of built-in commands for cluster analysis. While
they are solid and effective, they have some limitations. I present
several utilities that extend Stata's cluster analysis capability,
particularly, but not exclusively, when working from matrices of pairwise
distances rather than variables (that is, when using clustermat
rather than cluster).
permtab and ari compare cluster solutions, respectively, by permuting
categories to maximize Cohen's kappa and calculating the adjusted Rand
index. calinski and dudahart implement the Calinski–Harabasz and
Duda–Hart stopping rules for clustering from pairwise distance matrices
(official Stata calculates these for clustering from variables only).
Calculating these indices from the distance matrix means they can be
applied to other measures than squared Euclidean distance, even when
clustering from variables. Studer's discrepancy measure is closely
related and links these measures to ANOVA-like procedures on distance
matrices. silhouette and dendrohmap provide graphical summaries,
respectively, the silhouette plot (which captures the relative
distinctness of clusters) and a heat map representation of the pairwise
distances (ordered by the cluster dendrogram). I also present a command,
pam, to do partitioning around medoids using distance matrices (similar
to cluster kmedians when working from variables).
Additional information: France17_Halpin.pdf
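For readers unfamiliar with the adjusted Rand index mentioned in this abstract, the following Python sketch computes it from two label vectors using the standard pair-counting formula. It is an independent illustration, not the ari command itself, and the example labels are made up.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two cluster solutions (needs n >= 2)."""
    n = len(labels_a)
    # Pairs of observations placed together in both, or in each, solution.
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / comb(n, 2)
    maximum = 0.5 * (pairs_a + pairs_b)
    if maximum == expected:
        # Degenerate case, for example both solutions put everything in one cluster.
        return 1.0
    return (pairs_joint - expected) / (maximum - expected)


# The index ignores how clusters are labeled, so a pure relabeling scores 1.
same_up_to_labels = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same_up_to_labels, crossed)
```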
Brendan Halpin
University of Limerick
|
4:00–4:45 |
Abstract:
Quantile plots show ordered values (raw data, estimates, residuals,
etc.) against rank or cumulative probability or a one-to-one
function of the same. Even in a strict sense, they are almost 200
years old. In Stata, quantile, qqplot, and qnorm
go back to 1985 and 1986. So why any fuss?
The presentation is built on a long-considered view that quantile plots
are the best single plot for univariate distributions. No other kind of
plot shows so many features so well across a range of sample sizes with
so few arbitrary decisions. Both official and user-written programs
appear in a review that includes side-by-side and superimposed
comparisons of quantiles for different groups and comparable variables.
Emphasis is on newer work, with focus on the compatibility of quantiles
with transformations; fitting and testing of brand-name distributions;
quantile-box plots as proposed by Emanuel Parzen (1929–2016);
equivalents for ordinal categorical data; and the question of which
graphics best support paired and two-sample t and other tests. Commands
mentioned include distplot, multqplot, and qplot (Stata Journal) and
mylabels, stripplot, hdquantile, and lvalues (SSC).
Additional information: France17_Cox.pptx
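The coordinates behind a basic quantile plot are simple to compute, as this Python sketch shows: ordered values are paired with a cumulative probability. The plotting position (i - 0.5)/n used here is one common convention and is an assumption of this sketch; other choices exist, and the data are invented.

```python
def quantile_points(values):
    """Return (cumulative probability, ordered value) pairs for a quantile plot."""
    ordered = sorted(values)
    n = len(ordered)
    # i runs from 1 to n, so every probability lies strictly between 0 and 1.
    return [((i - 0.5) / n, value) for i, value in enumerate(ordered, start=1)]


points = quantile_points([3.0, 1.0, 2.0])  # hypothetical data
for probability, value in points:
    print(probability, value)
```

Every observation appears as its own point, which is why the plot needs no binning or bandwidth choices.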
Nicholas J. Cox
Durham University
|
4:45–5:15 |
Abstract:
In many applications, such as biological and agricultural growth
processes and pharmacokinetics, the time course of a continuous response
for a subject over time may be characterized by a nonlinear function.
Parameters in these subject-specific nonlinear functions often have
natural physical interpretations, and observations within the same
subject are correlated. Subjects may be nested within higher-level
groups, giving rise to nonlinear multilevel models, also known as
nonlinear mixed-effects or hierarchical models. The new Stata 15
command menl allows you to fit nonlinear mixed-effects models, in
which fixed and random effects may enter the model nonlinearly at
different levels of hierarchy. In this talk, I will show you how to fit
nonlinear mixed-effects models that contain random intercepts and slopes
at different grouping levels with different covariance structures for
both the random effects and within-subject errors. I will also discuss
parameter interpretation and highlight postestimation capabilities.
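To show what it means for a random effect to enter a model nonlinearly, here is a small Python sketch of a subject-specific logistic growth curve in which a subject-level deviation shifts the asymptote. The functional form, the parameter names, and the values are assumptions chosen for illustration; the abstract does not specify a particular model, and this is not menl syntax.

```python
import math


def logistic_growth(age, asymptote, midpoint, scale, subject_effect=0.0):
    """Mean response at a given age for one subject.

    subject_effect is a random deviation of this subject's asymptote from
    the population asymptote, so it enters the mean function nonlinearly.
    """
    return (asymptote + subject_effect) / (1.0 + math.exp(-(age - midpoint) / scale))


# Hypothetical parameters: at the midpoint age the curve reaches half of
# the subject's own asymptote.
typical = logistic_growth(700.0, 190.0, 700.0, 350.0)
taller_subject = logistic_growth(700.0, 190.0, 700.0, 350.0, subject_effect=10.0)
print(typical, taller_subject)
```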
Additional information: France17_Assaad
Houssein Assaad
StataCorp
|
5:15–5:45 |
Bill Gould
StataCorp
|
Organizers
Scientific committee
Salima Bouayad
Center of Research in Economy and Statistics (CREST)
Yann le Strat
Public Health France
José Miguel Gaspar
Essec Business School
Logistics organizer
The logistics organizer for the 2017 French Stata Users Group meeting is
Ritme, scientific solutions, the distributor of Stata in Belgium, France, and Switzerland.
View the proceedings of previous Stata Users Group meetings.