6th German Stata Users Group Meeting
====================================
The 6th German Stata Users Group Meeting will be held on Friday, 27th
June 2008 in Berlin at the WZB (Wissenschaftszentrum Berlin für
Sozialforschung). We would like to invite everybody from everywhere who
is interested in using Stata to attend this meeting.
The academic program of the meeting is being organized by Johannes
Giesecke ([email protected]) and Ulrich Kohler ([email protected]),
both at the WZB. The conference language will be English due to the
international nature of the meeting and the participation of non-German
guest speakers.
The logistics of the conference are being organized by Dittrich und
Partner, distributor of Stata in several countries including Germany,
The Netherlands, Austria, and Poland (http://www.dpc.de).
Program Schedule
----------------
9:45 - 10:15 Registration
10:15 - 10:30 Welcome
10:30 - 11:15
Using instrumental variables techniques in economics and finance
Christopher F. Baum, [email protected]
Boston College Economics and DIW Berlin
I will discuss the usefulness of instrumental variables (IV) techniques
in addressing research questions in economics and finance. IV methods
provide workable solutions to problems of endogeneity, measurement
error and proxy variables, but are easily misused. A wide array of
diagnostic techniques that should be employed to validate the use of IV
in a particular context will be presented. I will also discuss the
advantages of employing the Generalized Method of Moments form of IV
(IV-GMM) and the Continuously Updated Estimator (GMM-CUE), and display
some newly developed code that efficiently employs Stata's Mata
programming language to implement the GMM-CUE estimator.
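As a rough illustration of the endogeneity problem mentioned in this abstract (this is not code from the talk, which uses Stata and Mata), the following Python sketch simulates one endogenous regressor and one instrument and compares the OLS slope with the simple one-instrument IV estimator. The data-generating process, the variable names, and the true coefficient of 2 are all assumptions chosen for illustration.

```python
import random


def cov(a, b):
    """Sample covariance of two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, beta=2.0, seed=1):
    """Hypothetical data: x is endogenous because it shares the shock u with y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]  # instrument, independent of u and e
    u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
    e = [rng.gauss(0, 1) for _ in range(n)]  # idiosyncratic error
    x = [zi + ui for zi, ui in zip(z, u)]
    y = [beta * xi + ui + ei for xi, ui, ei in zip(x, u, e)]
    return z, x, y


z, x, y = simulate()
ols_slope = cov(x, y) / cov(x, x)  # inconsistent here, because cov(x, u) != 0
iv_slope = cov(z, y) / cov(z, x)   # consistent if z is a valid instrument
print(f"OLS: {ols_slope:.3f}  IV: {iv_slope:.3f}")
```

Under these assumptions the OLS slope converges to 2.5 rather than 2, while the IV estimate should land close to the true value. The sketch says nothing about the diagnostics or the GMM variants discussed in the talk.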
11:15 - 12:00
Ordinal regression models: Problems, solutions, and problems with the
solutions
Richard Williams,
Notre Dame Dept of Sociology
Ordered logit/probit models are among the most popular ordinal
regression techniques. These models often have serious problems,
however. The proportional odds/parallel lines assumptions made by
these methods are often violated. Further, because of the way these
models are identified, they have many of the same limitations as are
encountered when analyzing standardized coefficients in OLS regression,
e.g. interaction terms and cross-population comparisons of effects can
be highly misleading. This paper shows how generalized ordered
logit/probit models (estimated via gologit2) and heterogeneous
choice/location scale models (estimated via oglm) can often address
these concerns in ways that are more parsimonious and easier to
interpret than is the case with other suggested alternatives. At the
same time, the paper cautions that these methods sometimes raise their
own concerns that researchers need to be aware of and know how to deal
with. First, mis-specified models can create worse problems than the
ones these methods were designed to solve. Second, estimates are
sometimes implausible, suggesting that the data are being spread too
thin and/or yet another method is needed. Third, multiple and very
different interpretations of the same results are often possible and
plausible. Guidelines for identifying and dealing with each of these
problems are presented.
12:00 - 13:00 Lunch
13:00 - 13:30
Charts for comparing results between many categories
Ulrich Kohler, [email protected]
WZB
Charts are useful tools for comparing a statistic between groups
defined by a categorical variable with many different categories. A
number of postings on Statalist suggest that Stata's standard
implementation of these graphs with -graph dot- and -graph bar- often
limits users in their ambition to design such graphs. However, in most
cases users' design wishes can be satisfied by reverting to the
low-level command -graph twoway-.
This tutorial talk demonstrates the construction of charts with -graph
twoway-. It starts by reconstructing a simple bar chart with -graph
twoway- and then moves to a number of extensions that are possible when
using -graph twoway-. I will illustrate some trickery with stored
results and local macros, as well as a number of useful user-written
programs.
13:30 - 14:15
Graph Editing
Vince Wiggins, [email protected]
StataCorp
We will take a quick tour of the Graph Editor, covering the basic
concepts: adding text, lines, and markers; changing the defaults for
added objects; changing properties; working quickly by combining the
contextual toolbars with the more complete object dialogs; and using
the object browser effectively. Leveraging these concepts, we'll
discuss how and when to use the grid editor and techniques for combined
and by-graphs. Finally, we will look at some tricks and features that
aren't apparent at first blush.
14:15 - 14:45
Relative Distribution Methods in Stata
Ben Jann, [email protected]
ETH Zurich
The concept of the relative density seems like a fruitful nonparametric
approach to studying distributional differences between groups
(Handcock and Morris 1999), yet it appears that the technique has gone
more or less unnoticed in applied social science research. A scarcity
of canned software might be one of the reasons the method is
underutilized. Therefore, I present a new Stata command called
-reldist- to plot the relative density, decompose distributional
differences into location and shape effects, and compute relative
distribution summary measures. The command is illustrated by an
application comparing earnings by sex.
Reference:
Handcock, Mark S., and Martina Morris (1999). Relative Distribution
Methods in the Social Sciences. New York: Springer.
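For readers unfamiliar with the concept, the following Python sketch shows one simple way to approximate a relative density: each value of the comparison group is converted to its rank within the reference group, and the histogram of those ranks is scaled so that a flat line at 1 means "no distributional difference". This is not the -reldist- command; the function names, the mid-rank convention, and the histogram estimator are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def relative_ranks(comparison, reference):
    """Mid-rank position (between 0 and 1) of each comparison value
    within the reference distribution."""
    ref = sorted(reference)
    return [
        (bisect_left(ref, y) + bisect_right(ref, y)) / (2 * len(ref))
        for y in comparison
    ]


def relative_density(comparison, reference, bins=10):
    """Histogram estimate of the relative density on [0, 1].

    A value of 1 in a bin means the comparison group is represented there
    exactly as often as the reference group."""
    counts = [0] * bins
    for r in relative_ranks(comparison, reference):
        counts[min(int(r * bins), bins - 1)] += 1
    return [c * bins / len(comparison) for c in counts]


# Hypothetical earnings data: identical groups give a flat relative density.
reference_group = list(range(1, 101))
same_group = list(range(1, 101))
flat = relative_density(same_group, reference_group)

# A comparison group concentrated in the upper half of the reference group.
upper_group = list(range(51, 101))
shifted = relative_density(upper_group, reference_group)
print(flat)
print(shifted)
```

Values above 1 in the upper bins of the second result indicate that the comparison group is over-represented at the top of the reference distribution.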
14:45 - 15:00 Coffee
15:00 - 15:30
Direct and Indirect effects in a logit model
Maarten Buis, [email protected]
Vrije Universiteit Amsterdam
In this presentation I discuss a method by Erikson et al. (2005) for
decomposing a total effect in a logit model into direct and indirect
effects and propose a generalization of this method. Consider an
example where social class has an indirect effect on attending college
through academic performance in high school. The indirect effect is
obtained by comparing the proportion of lower class students that
attend college with the counterfactual proportion of lower class
students if they had the distribution of performance of the higher
class students. This captures the association between class and
attending college due to differences in performance, i.e. the indirect
effect. The direct effect of class is obtained by comparing the
proportion of higher class students that attend college with the
counterfactual proportion of lower class students if they had the same
distribution of performance as the higher class students. This way the
variable performance is kept constant, which yields the direct effect.
If these comparisons are carried out in the form of log odds ratios,
then the total effect will equal the sum of the direct and indirect
effects.
In its original form this method assumes that the variable through
which the indirect effect occurs is normally distributed. In this
article the method is generalized by allowing this variable to have any
distribution, which has the added advantage of simplifying the method.
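The additivity described in this abstract can be checked with a small numerical sketch. The Python code below is not the author's implementation: the logit coefficients, the three performance levels, and the class-specific performance distributions are all hypothetical values chosen for illustration. Performance is given an arbitrary discrete distribution, in the spirit of the proposed generalization.

```python
import math


def expit(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))


def log_odds(p):
    return math.log(p / (1.0 - p))


# Hypothetical logit model: P(college) = expit(a + b*high_class + c*performance)
INTERCEPT, B_CLASS, B_PERF = -1.0, 0.8, 1.2

# Hypothetical performance levels and their distribution within each class
PERF_VALUES = [-1.0, 0.0, 1.0]
PERF_DIST = {"low": [0.5, 0.3, 0.2], "high": [0.2, 0.3, 0.5]}


def proportion(own_class, perf_class):
    """Share attending college among students of own_class, if their
    performance followed the distribution observed in perf_class."""
    high = 1.0 if own_class == "high" else 0.0
    return sum(
        weight * expit(INTERCEPT + B_CLASS * high + B_PERF * value)
        for value, weight in zip(PERF_VALUES, PERF_DIST[perf_class])
    )


p_low = proportion("low", "low")    # factual: lower class
p_high = proportion("high", "high") # factual: higher class
p_cf = proportion("low", "high")    # counterfactual: lower class, higher-class performance

indirect = log_odds(p_cf) - log_odds(p_low)   # only the performance distribution changes
direct = log_odds(p_high) - log_odds(p_cf)    # performance held constant, class changes
total = log_odds(p_high) - log_odds(p_low)
print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

Because the counterfactual term cancels when the two log odds ratios are added, the direct and indirect effects sum to the total effect regardless of the numbers chosen.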
15:30 - 16:00
Multiple Imputation using ICE: A Simulation Study on a Binary Response
Jochen Hardt, [email protected] and Kai Görgen
(Mathematical Statistics, Chalmers University, Göteborg, Sweden;
Masters Programme, Bernstein Center for Computational Neuroscience, Berlin)
Background: Various methods for multiple imputation of missing values
are available in statistical software. They have been shown to work
well when small proportions of missing values are imputed. However,
some researchers have started to impute large proportions of missing
values.
Method: A simulation using ice was performed on datasets of
50/100/200/400 cases and 4/11/25 variables. A varying proportion of
the data (3–63%) was set to missing completely at random and
subsequently filled in by multiple imputation.
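The deletion step of such a simulation might look like the Python sketch below. It is not the authors' code (the study used Stata's ice), and it covers only the step of setting values to missing completely at random, not the imputation itself. The function name, the choice to draw cells uniformly over the whole data matrix, and the intermediate share of 33% are assumptions made for illustration.

```python
import random


def set_mcar(rows, proportion, seed=0):
    """Return a copy of rows in which the given share of cells is replaced
    by None. Every cell is equally likely to be chosen, independent of the
    data values, which is what "missing completely at random" requires."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    n_missing = round(proportion * len(cells))
    out = [list(row) for row in rows]  # leave the original data untouched
    for i, j in rng.sample(cells, n_missing):
        out[i][j] = None
    return out


# Hypothetical complete dataset: 50 cases, 4 variables
data_rng = random.Random(42)
data = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(50)]

missing_counts = {}
for share in (0.03, 0.33, 0.63):
    with_holes = set_mcar(data, share)
    missing_counts[share] = sum(v is None for row in with_holes for v in row)
print(missing_counts)
```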
Results: (1) It is shown when and how the algorithm breaks down as the
number of cases decreases and the number of variables in the model
increases. (2) Some unexpected results are demonstrated, namely flawed
coefficients. (3) Compared with “mice” in R, the other programme that
performs multiple imputation by chained equations, the Stata programme
“ice” yields slightly more precise estimates, while the features of the
two programmes are generally very similar.
Conclusion: Imputation by chained equations is a useful tool for
imputing small to moderate proportions of missing values. Replacing
larger proportions, however, can be problematic.
16:00 - 16:30
Using Stata for a memory saving fixed effects estimation of the
three-way error component model
Thomas Cornelissen, [email protected]
Leibniz Universität Hannover
Researchers trying to estimate tens or hundreds of thousands of fixed
effects for two or more groups (workers and firms; pupils, teachers, and
schools; etc.) in data sets with high numbers of observations are often
limited by the size of the computer memory available.
Such a model is commonly estimated by sweeping out one of the effects
by the fixed effects transformation (time-demeaning) and by including
the remaining effects as dummy variables.
If K is the number of fixed effects to be included as dummy variables
and N is the number of observations, then the design matrix is of
dimension N x K (neglecting any remaining right-hand side regressors).
The time-demeaned dummies have to be stored as “float” variables
consuming 8 bytes per cell in Stata. For example, with 2 million
observations (N) and 10 thousand fixed effects (K), the memory
requirement would be 160 gigabytes.
This paper describes how the memory requirement can be reduced to store
only a K x K matrix, which in the given example reduces the memory
requirement to below 1 gigabyte.
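The arithmetic behind these two figures can be reproduced in a few lines of Python. The sketch takes the 8 bytes per cell, N, and K directly from the abstract; the helper name and the use of decimal gigabytes (10^9 bytes) are assumptions.

```python
BYTES_PER_CELL = 8      # per-cell storage as stated in the abstract
N_OBS = 2_000_000       # N: number of observations
K_EFFECTS = 10_000      # K: number of fixed effects entered as dummies


def gigabytes(n_rows, n_cols, bytes_per_cell=BYTES_PER_CELL):
    """Storage for a dense n_rows x n_cols matrix in decimal gigabytes."""
    return n_rows * n_cols * bytes_per_cell / 1e9


full_design = gigabytes(N_OBS, K_EFFECTS)        # N x K design matrix
cross_product = gigabytes(K_EFFECTS, K_EFFECTS)  # K x K matrix
print(f"N x K: {full_design:.0f} GB, K x K: {cross_product:.1f} GB")
```

The saving comes from K being much smaller than N: the K x K matrix needs N/K = 200 times less storage than the N x K design matrix in this example.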
The paper also describes the Stata program felsdvreg.ado which
implements the method in Mata. Besides implementing the memory-saving
estimation method, the program also takes care of checking the
identification of the effects and provides useful summary statistics.
16:30 - 16:45 Coffee
16:45 - 17:30
Report to the users
Alan Riley, [email protected]
StataCorp
17:30 - 18:00 Wishes and grumbles
18:00 End of the meeting
Participants are asked to travel at their own expense. There will be a
small conference fee to cover costs for coffee, teas, and luncheons (35
Euro; Students: 15 Euro). There will also be an optional informal meal
at a restaurant in Berlin on Friday evening at additional cost.
You can enroll by contacting Anke Mrosek by email or by writing, phoning, or
faxing to
Anke Mrosek
Dittrich & Partner Consulting GmbH
Kieler Str. 17
42697 Solingen
Tel: +49 (0) 212 260 66-24
Fax: +49 (0) 212 260 66-66
[email protected].
We look forward to seeing you in Berlin on Friday, 27th June, where you
can help us to make this an exciting and interesting event.
The conference venue is:
Wissenschaftszentrum Berlin
Reichpietschufer 50
10785 Berlin
(see http://www.wzb.eu)
Johannes Giesecke, Ulrich Kohler
--
[email protected]
030 25491-361