The Italian Stata Users Group Meeting was held on 15 November 2018 at the I Portici Hotel. There was also an optional course on 16 November. You can view the program and presentation slides below.
Proceedings
9:30–10:30 | Session I: Invited Speaker
Abstract:
Multistate models are increasingly being used to model complex disease profiles.
By modeling transitions between disease states, accounting for competing events
at each transition, we can gain a much richer understanding of patient trajectories
and how risk factors act across the entire disease pathway. In this presentation, I will introduce
some new Stata commands for the analysis of multistate survival data.
msset is a data preparation tool that converts a dataset from wide (one observation
per subject, multiple time and status variables) to long (one observation for each
transition for which a subject is at risk). msaj calculates the nonparametric
Aalen–Johansen estimates of transition probabilities. msboxes creates a descriptive plot
of the multistate process through the transition matrix and numbers at risk. stms
fits joint transition-specific survival models, which allow each transition to have a different
parametric model, yet the models are maximized jointly to enable sharing of covariate effects across
transitions. predictms calculates a variety of predictions from a multistate survival
model, including transition probabilities, length of stay (restricted mean time in each
state), the probability of ever visiting each state, and more. Predictions are made at
user-specified covariate patterns. One can calculate differences and ratios of predictions across covariate
patterns. Standardized (population-averaged) predictions can
be obtained. Confidence intervals for all quantities are available. One can use simulation or the
Aalen–Johansen estimator to calculate all quantities. User-defined predictions can be calculated
by supplying a community-contributed Mata function, which gives complete
flexibility. predictms can be used with a general transition matrix (cyclic or acyclic)
and allows the use of transition-specific timescales. I will illustrate the software using a
dataset of patients with primary breast cancer.
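The wide-to-long conversion described above can be sketched in Python for a simple illness-death process (state 1 = healthy, state 2 = relapse, state 3 = death). This is only an illustration of the data layout, not the msset implementation; the function name, argument names, and column names are assumptions made for the example.

```python
def wide_to_long(subject_id, relapse_time, relapse_status, death_time, death_status):
    """Expand one wide record into one row per transition the subject is at risk for.

    Transitions (hypothetical numbering): 1 = healthy->relapse,
    2 = healthy->death, 3 = relapse->death.
    """
    # A relapse counts only if it was observed before death or censoring.
    relapsed = relapse_status == 1 and relapse_time < death_time
    # The subject leaves the healthy state at relapse, or at death/censoring.
    exit_time = relapse_time if relapsed else death_time

    rows = [
        {"id": subject_id, "trans": 1, "from": 1, "to": 2,
         "start": 0.0, "stop": exit_time, "status": int(relapsed)},
        {"id": subject_id, "trans": 2, "from": 1, "to": 3,
         "start": 0.0, "stop": exit_time,
         "status": int(not relapsed and death_status == 1)},
    ]
    if relapsed:
        # After relapse the subject becomes at risk for relapse->death.
        rows.append({"id": subject_id, "trans": 3, "from": 2, "to": 3,
                     "start": relapse_time, "stop": death_time,
                     "status": int(death_status == 1)})
    return rows


if __name__ == "__main__":
    for row in wide_to_long(1, 2.0, 1, 5.0, 1):
        print(row)
```

A subject who relapses and later dies contributes three rows, while a subject censored in the healthy state contributes two rows, both with status 0.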
Additional information: italy18_Crowther.pdf
Michael J. Crowther
University of Leicester
10:50–12:30 | Session II: Community-contributed, I
Abstract:
We present CUB, a Stata module for modeling ordinal data
via a class of finite mixture distributions accounting for
both uncertainty and feeling components of an ordered
decisional process. The routine also allows for modeling
overdispersion, inflated categories, and occurrences of large
heterogeneity. Model parameters are estimated by maximum
likelihood. We explore various features of the package CUB,
including simulation routines.
Additional information: italy18_Baum.pdf
Christopher F. Baum
Boston College
Giovanni Cerulli
IRCrES-CNR
Francesca di Iorio
Domenico Piccolo
Rosaria Simone
Università degli Studi di Napoli Federico II
Abstract:
There are many models for an outcome that has a mass point at
a boundary value and is continuously or discretely distributed
over a large number of off-boundary values. Two-part models
(TPMs), hurdle models (HMs), and zero-inflated models (ZIMs)
use different approaches to combine distinct models for
boundary and off-boundary values.
Except for a few "cake debate" papers whose assertions
were not accepted, the vast majority of the literature has
either assumed that the process determining when the
outcome is on or off the boundary must be exogenous or
that any endogeneity must be modeled. Drukker (2017)
showed that, contrary to conventional belief, TPMs are robust
to endogeneity of the on-off boundary process in that they
identify the mean of the outcome conditional on covariates.
In this presentation, I cover the following points:
Additional information: italy18_Drukker.pdf
David M. Drukker
StataCorp
Abstract:
An overriding goal of outcomes research is measuring and
comparing hospital performance using readily available
administrative data. Risk-adjustment techniques build
on conventional logistic regression analysis, but
precautions must be taken to account for the positive
correlation between observations from the same
hospital. These include the use of generalized estimating
equations, which are available in Stata as xtgee. Two effective
graphs that illustrate outcome measures across different
providers and incorporate sample size information are the
caterpillar plot and the funnel plot, which can be obtained
using the eclplot and funnelcompar packages, respectively.
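To make the role of sample size in a funnel plot concrete, the sketch below computes control limits around a target rate using a normal approximation to the binomial. It is a minimal Python illustration under that assumption; it is not how funnelcompar computes its limits, and the function name and default z value are chosen only for the example.

```python
import math


def funnel_limits(target_rate, n_cases, z=1.96):
    """Approximate control limits for a provider with n_cases observations.

    Uses the normal approximation to the binomial (an assumption of this
    sketch); exact binomial limits would differ for small n_cases.
    """
    se = math.sqrt(target_rate * (1.0 - target_rate) / n_cases)
    lower = max(0.0, target_rate - z * se)
    upper = min(1.0, target_rate + z * se)
    return lower, upper


if __name__ == "__main__":
    # Limits narrow as the provider's sample size grows, giving the funnel shape.
    for n in (25, 100, 400, 1600):
        lo, hi = funnel_limits(0.10, n)
        print(f"n={n:5d}  lower={lo:.4f}  upper={hi:.4f}")
```

Providers whose observed rate falls outside the limits for their own sample size are the ones a funnel plot flags for closer inspection.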
Additional information: italy18_Lenzi.pdf
Jacopo Lenzi
Università degli Studi di Bologna
12:15–1:00 | Session III: Exploiting the potential of Stata 15, I
Abstract:
Stata 15 includes three new commands for producing
dynamic documents: dyndoc, putdocx, and putpdf.
These commands have generated much interest in the user
community; this has led to a large amount of community-contributed
software. In this presentation, I'll give some tips
about how to use the commands efficiently both with official
Stata software and with some of these community-contributed
tools.
Additional information: italy18_Rising.pdf
Bill Rising
StataCorp
2:00–2:50 | Session IV: Community-contributed, II
Abstract:
In medical studies, the event of interest can often recur in the
same patient over time. Although time-to-first-event analysis and
Poisson regression are still possible, they do not
use the data to their full potential. If the outcome can occur
more than once, failure times are correlated within subject and
methods accounting for lack of independence are needed.
Different statistical models are available for analyzing such
data, including the Andersen and Gill (AG) model, the
Prentice, Williams, and Peterson Total Time (PWP-TT) model,
and frailty models. I will review
statistical techniques for multiple-failure survival data and
show how to implement them in Stata.
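Models for recurrent events such as those listed above are commonly fit to data arranged in counting-process form, with one (start, stop] interval per period at risk. The Python sketch below shows that layout for a single subject. The function name and tuple layout are assumptions for illustration, and the event-number column is included only because stratified models such as PWP typically need the order of each event.

```python
def counting_process_rows(subject_id, event_times, end_of_followup):
    """Split follow-up into (start, stop] intervals, one per recurrent event.

    Returns tuples (id, start, stop, event, event_number), where event is 1
    if the interval ends in an event and 0 if it ends in censoring.
    """
    rows = []
    start = 0.0
    number = 1
    for t in sorted(event_times):
        rows.append((subject_id, start, t, 1, number))
        start = t
        number += 1
    # Remaining follow-up after the last event is a censored interval.
    if end_of_followup > start:
        rows.append((subject_id, start, end_of_followup, 0, number))
    return rows


if __name__ == "__main__":
    # Subject with events at times 3 and 7, followed until time 10.
    for row in counting_process_rows(1, [3, 7], 10):
        print(row)
```

Because rows from the same subject are correlated, an analysis on this layout would normally pair it with a variance estimator or frailty term that accounts for within-subject dependence.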
Additional information: italy18_Ghilotti.pdf
Francesca Ghilotti
Rino Bellocco
Karolinska Institutet, Università degli Studi di Milano Bicocca
Abstract:
This presentation offers new evidence on the socioeconomic and demographic
determinants of the referendum of 4 December 2016
through the analysis of the vote in Italian municipalities.
The results indicate a strongly ideological vote, in
that the political orientation expressed in the 2014
elections significantly influences the referendum vote. Moreover,
it is not so much the youth vote that determined the
rejection of the reform as social unease, summarized
by the unemployment rate and the share of commuters
in the municipalities. This also implies a reduced explanatory
power for the purely territorial variables. Overall, the vote
was determined mainly by political affiliation, then by the
self-assessment of one's socioeconomic status and, only
residually, by personal opinion on the contents of the reform.
If democracy is built up through the exercise of the "electoral
profession", then reviving civic education, updated
to include the economic implications
of good institutions, can be an incentive to actually express
preferences on the basic rules.
Additional information: italy18_Bella.pdf
Mariano Bella
Giovanni Graziano
Ufficio Studi Confcommercio
2:50–3:50 | Session V: Tricks and tips
Abstract:
This brief talk will show some simple tools for saving time when working
with Stata. This will be a hodgepodge of items whose goal is to reduce
the amount of thought, coordination, and human memory required for common
tasks in a complex work environment while speeding up such tasks greatly.
Additional information: italy18_Rising(2).pdf
Bill Rising
StataCorp
Abstract:
One of the lesser-known features of Stata is the ability
to call external routines, written in other software, to perform
specific tasks within Stata. I offer some insights
on how to develop a Stata ado-file that embeds an external
software routine and executes it in Stata. As an example, I use the Stata command
stree, written to allow users to run regression trees (a machine
learning technique currently unavailable in Stata).
Additional information: italy18_Cerulli.pdf
Giovanni Cerulli
Antonio Zinilli
IRCrES-CNR
Abstract:
From Stata 13 on, Stata supports the long string format strL.
One can use the programming function fileread() to load
an entire text or binary file, found in a local directory or
downloaded from a webpage, into a Stata long string field. These long
string fields can then be searched to extract specific numeric
or categorical data. I illustrate the use of the fileread()
programming function, coupled with the string functions
strpos() and substr(), to solve the following issues: i) extract
spatial coordinates from a database of individual addresses
using Google Maps API calls; and ii) extract count data from
nonanonymous versions of multiple structured Word files and
automatically rebuild an anonymous PDF version of each file
through LaTeX.
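The search-and-slice pattern described above (locate a marker, then cut out the value that follows it) can be sketched in Python, with str.find() and slicing standing in for Stata's strpos() and substr(). The sample text is a made-up fragment for illustration, not an actual API response, and the helper name is hypothetical. Note that strpos() returns 0 when the marker is absent and otherwise counts positions from 1, whereas str.find() returns -1 and counts from 0.

```python
def extract_number_after(text, key):
    """Return the first number that follows `key` in `text`, or None."""
    pos = text.find(key)
    if pos == -1:
        return None  # marker not present in the text
    # Skip separators (quotes, colons, spaces) until a number begins.
    i = pos + len(key)
    while i < len(text) and not (text[i].isdigit() or text[i] == "-"):
        i += 1
    # Consume the characters that make up the number.
    j = i
    while j < len(text) and (text[j].isdigit() or text[j] in ".-"):
        j += 1
    return float(text[i:j]) if j > i else None


if __name__ == "__main__":
    # Hypothetical content of a long string read from a file or webpage.
    sample = '{"location": {"lat": 44.4949, "lng": 11.3426}}'
    print(extract_number_after(sample, '"lat"'))
    print(extract_number_after(sample, '"lng"'))
```

For well-formed JSON a proper parser is more robust; the manual search is shown because it mirrors the string-function approach the abstract describes.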
Additional information: italy18_Capelli.pdf
Giovanni Capelli
Università degli Studi di Cassino e del Lazio Meridionale
4:00–5:00 | Session VI: Exploiting the potential of Stata 15, II
Abstract:
I discuss the average causal effect (ACE) of an
endogenous binary treatment on an ordinal outcome when
the sample is subject to endogenous selection. I show how
to estimate the ACE using an extended regression model
(ERM) command in Stata. I illustrate how to do regression
adjustment in Stata and discuss standard errors for
sample-averaged treatment effects and population-averaged
treatment effects.
Additional information: italy18_Drukker(2).pdf
David Drukker
StataCorp
5:00–5:15 | Session VII
Abstract:
Stata developers present will carefully and cautiously
consider wishes and grumbles from Stata users in the audience.
Questions, and possibly answers, may concern reports of
present bugs and limitations or requests for new features in
future releases of the software.
StataCorp personnel
StataCorp
8:15 | Optional dinner at C'era Una Volta (Via Massimo D'Azeglio, 9)