Nicholas J. Cox
Describing batches of data in terms of their order statistics or quantiles
has long roots but remains underrated in graphically based exploration,
data reduction, and data reporting. In 1990, Hosking proposed L-moments
based on quantiles as a unifying framework for summarizing distribution
properties, but despite several advantages they still appear to be
little known outside their main application areas of hydrology and
climatology. Similarly, the mode can be traced to the prehistory of
statistics, but it is often neglected or disparaged despite its value as a
simple descriptor and even as a robust estimator of location. This
presentation reviews and exemplifies these approaches with detailed
reference to Stata implementations. Several graphical displays are
discussed, some novel. Specific attention is given to the use of Mata for
programming core calculations directly and rapidly.
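As a rough illustration of the quantities involved, the first four sample L-moments can be computed from the ordered data through probability-weighted moments. The sketch below is in Python rather than Stata or Mata, and it is not the presentation's implementation; the function name sample_l_moments is hypothetical.

```python
def sample_l_moments(values):
    """Return the first four sample L-moments (l1, l2, l3, l4).

    Uses the unbiased probability-weighted moment estimators b0..b3
    computed from the order statistics. Hypothetical helper for
    illustration only.
    """
    x = sorted(values)
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 observations")

    b0 = b1 = b2 = b3 = 0.0
    for i, xi in enumerate(x):  # i is the rank minus one
        b0 += xi
        b1 += xi * i / (n - 1)
        b2 += xi * i * (i - 1) / ((n - 1) * (n - 2))
        b3 += xi * i * (i - 1) * (i - 2) / ((n - 1) * (n - 2) * (n - 3))
    b0, b1, b2, b3 = b0 / n, b1 / n, b2 / n, b3 / n

    l1 = b0                                # location (the mean)
    l2 = 2 * b1 - b0                       # scale
    l3 = 6 * b2 - 6 * b1 + b0              # basis of L-skewness (l3 / l2)
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0  # basis of L-kurtosis (l4 / l2)
    return l1, l2, l3, l4


if __name__ == "__main__":
    print(sample_l_moments([1, 2, 3, 4, 5]))
```

The ratios l3 / l2 and l4 / l2 give scale-free measures of skewness and kurtosis; for a symmetric batch of data, l3 is zero.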
Additional information
njctalk_sweden2007.zip
Mixed-effects models contain both fixed and random effects. The fixed
effects are analogous to standard regression coefficients and are estimated
directly. The random effects are not directly estimated but instead are
summarized according to their estimated variances and covariances, known as
variance components. Random effects take the form of either random
intercepts or random coefficients, and the grouping structure of the data
may consist of multiple levels of nested groups. In Stata, one can fit
mixed models with continuous (Gaussian) responses by using
xtmixed and, in Stata 10, fit mixed models with binary and count
responses by using xtmelogit and xtmepoisson, respectively. All three commands
have a common multiequation syntax and output, and postestimation
tasks such as the prediction of random effects and likelihood-ratio
comparisons of nested models also take a common form. This
presentation will cover many models that one can fit using these
three commands. Among these are simple random intercept models,
random-coefficient models, growth curve models, and crossed-effects
models.
Additional information
gutierrez_sweden07.pdf (presentation slides)
Classification errors, selection bias, and uncontrolled confounding are
likely to be present in most epidemiological studies, but the uncertainty
introduced by these types of bias is seldom quantified. The authors present
a simple, easy-to-use method to adjust the relative risk of a disease for
misclassification of a binary exposure, selection bias, and an unmeasured
confounding variable. The accompanying Stata tool implements both ordinary
and probabilistic sensitivity analysis. It allows the user to specify a
variety of probability densities for the bias parameters, and use these
densities to obtain simulation limits for the bias-adjusted exposure-disease
relative risk. The authors illustrate the method by applying it to a
published positive association between occupational resin exposure and
lung-cancer deaths in a case-control study. By employing plausible
probability distributions for the bias parameters, investigators can report
results that incorporate their uncertainties regarding unmeasured or
uncontrolled confounding, and thus avoid overstating their certainty about
the effect under study. These results can usefully supplement standard data
descriptions and conventional results.
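The general idea of probabilistic sensitivity analysis can be sketched for one bias only, nondifferential misclassification of a binary exposure in a case-control table. The sketch below is in Python, not Stata, and it is not the authors' tool. The counts, the uniform priors for sensitivity and specificity, and the function names are all assumptions made for illustration, and the odds ratio stands in for the relative risk.

```python
import random


def adjusted_odds_ratio(a, b, c, d, se, sp):
    """Odds ratio corrected for nondifferential exposure misclassification.

    a, b: exposed and unexposed cases; c, d: exposed and unexposed controls.
    se, sp: assumed sensitivity and specificity of exposure classification.
    Returns None when the corrected table has a nonpositive cell.
    """
    denom = se + sp - 1.0
    if denom <= 0:
        return None
    n_cases, n_controls = a + b, c + d
    a_adj = (a - (1.0 - sp) * n_cases) / denom
    c_adj = (c - (1.0 - sp) * n_controls) / denom
    b_adj, d_adj = n_cases - a_adj, n_controls - c_adj
    if min(a_adj, b_adj, c_adj, d_adj) <= 0:
        return None
    return (a_adj * d_adj) / (b_adj * c_adj)


def simulation_limits(a, b, c, d, draws=5000, seed=12345):
    """Median and 2.5th/97.5th percentiles of the adjusted odds ratio.

    Priors are an assumption: se ~ U(0.8, 1.0), sp ~ U(0.9, 1.0).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(draws):
        se = rng.uniform(0.8, 1.0)
        sp = rng.uniform(0.9, 1.0)
        value = adjusted_odds_ratio(a, b, c, d, se, sp)
        if value is not None:  # discard impossible corrected tables
            results.append(value)
    results.sort()

    def pick(p):
        return results[int(p * (len(results) - 1))]

    return pick(0.025), pick(0.5), pick(0.975)


if __name__ == "__main__":
    # Hypothetical counts, not the data from the published study.
    print(simulation_limits(50, 100, 200, 800))
```

With perfect classification (se = sp = 1) the adjusted value equals the conventional odds ratio, which gives a quick check on the correction formula.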
Additional information
orsinietal_slide_7sep07.pdf (presentation slides)
Paul C. Lambert
Maarten L. Buis