2004 German Stata Users Group meeting

Last updated: 27 July 2004
Monday, 5 April 2004

Wissenschaftszentrum Berlin (WZB)
Reichpietschufer 50
D-10785 Berlin
Germany

Materials documenting the meeting

Original meeting announcement

The second German Stata Users Group Meeting will be held on Monday, 5 April 2004 in Berlin at the Wissenschaftszentrum Berlin (WZB).

The content of the meeting is being organized by Johannes Giesecke, Humboldt University Berlin ([email protected]) and Ulrich Kohler, WZB ([email protected]).

Presentations will focus on three main topics: user-written Stata programs, research and teaching experience using Stata, and critiques of Stata facilities in specific fields. The conference language will be English because of the international nature of the meeting and the participation of non-German guest speakers.

Proceedings

Circular statistics in Stata, revisited

Nick Cox, Durham University, UK

Abstract

Circular data are a large class of directional data, which are of interest to scientists in many fields, including biologists (movements of migrating animals), meteorologists (winds), geologists (directions of joints and faults), and geomorphologists (landforms, oriented stones). These examples are all recordable as compass bearings relative to North. Other examples include phenomena that are periodic in time, including those dependent on time of day (in biomedical statistics: hospital visits or times of birth) or time of year (in applied economics: unemployment or sales variations).

The analysis of circular data is an odd corner of statistical science that many never visit, even though it has a long and curious history. Moreover, it seems that no major statistical language provides direct support for circular statistics.

This talk describes the development and use of some routines that have been written in Stata, primarily to allow graphical and exploratory analyses. In 2004, such routines are being rewritten, especially to allow use of the new graphics of Stata 8.
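To illustrate why compass bearings need their own summary statistics, the following is a minimal Python sketch (not one of the Stata routines discussed in the talk) of the mean direction, computed by averaging unit vectors. The function name and the sample bearings are hypothetical.

```python
import math

def mean_direction(bearings_deg):
    """Return (mean direction in degrees clockwise from North, mean resultant length)."""
    n = len(bearings_deg)
    # Treat each bearing as a unit vector: east component = sin, north component = cos.
    east = sum(math.sin(math.radians(b)) for b in bearings_deg) / n
    north = sum(math.cos(math.radians(b)) for b in bearings_deg) / n
    direction = math.degrees(math.atan2(east, north)) % 360.0
    resultant = math.hypot(east, north)  # near 1: concentrated; near 0: dispersed
    return direction, resultant

bearings = [350.0, 20.0]  # two bearings on either side of North
arithmetic_mean = sum(bearings) / len(bearings)  # 185.0, which points roughly south
direction, resultant = mean_direction(bearings)  # direction is close to 5 degrees
```

The arithmetic mean of the two bearings points almost due south, whereas the vector-based mean direction lies between them, just east of North.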

Fitting functional forms to distributions, using -ml-

Stephen P. Jenkins, ISER, University of Essex

Abstract

This talk will describe some programs to fit generalized beta of the second kind, Singh-Maddala, Dagum, and lognormal distributions to data on income or indeed any other skewed variable of interest. The programs allow the key distributional parameters to vary with covariates, and also handle svy data. (The programs use features introduced to ml in version 8.1.) To assess goodness of fit graphically, one can draw q-q and p-p plots using programs written by Nick Cox.
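As a rough illustration of what such a fit maximizes, here is a minimal Python sketch of the Singh-Maddala log-likelihood. It is not the code of the programs described in the talk; the parameter names a, b (scale), and q are an assumption of this sketch, and covariates and survey weights are left out.

```python
import math

def singh_maddala_loglik(x, a, b, q):
    """Log-likelihood of positive observations x under a Singh-Maddala distribution.

    Density: f(x) = (a*q/b) * (x/b)**(a-1) * (1 + (x/b)**a)**(-(q+1)).
    """
    total = 0.0
    for xi in x:
        z = (xi / b) ** a
        total += (math.log(a) + math.log(q) - math.log(b)
                  + (a - 1.0) * math.log(xi / b)
                  - (q + 1.0) * math.log1p(z))
    return total

incomes = [0.8, 1.0, 1.7, 2.4, 5.1]  # hypothetical data
ll = singh_maddala_loglik(incomes, a=2.0, b=1.5, q=1.2)
```

A maximum likelihood routine would search over a, b, and q (or over regression coefficients that determine them) for the values that make this quantity as large as possible.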

Additional information

Jenkins.pdf

Tabulation of multiple response sets, revisited

Ben Jann, ETH Zurich

Abstract

At the first German Stata Users Group meeting, Hildegard Schaeper raised the issue of tabulating multiple response sets with Stata. Hildegard presented two of her own programs to deal with multiple responses and identified a number of remaining problems. In my contribution, I will re-address the issue and present a revised and considerably extended module to compute one- and two-way tables of multiple responses. The new program handles dichotomously or polytomously held response sets, calculates absolute frequencies as well as frequencies relative to responses and/or cases, supports string variables, appropriately labels rows and columns, allows complex case selection and specification of a list or range of valid responses, offers significance tests for two-way tables, and optionally saves response indicator variables. Tables are neatly formatted and split into pieces if too wide to fit the screen. The program is byable and weights are allowed.
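The difference between frequencies relative to responses and frequencies relative to cases can be shown with a small Python sketch. It does not reproduce the module presented in the talk; the function name and the survey data are hypothetical.

```python
from collections import Counter

def tabulate_multiple_responses(cases):
    """cases: one set of responses per respondent (a set may be empty).

    Returns {response: (count, share of all responses, share of valid cases)}.
    """
    counts = Counter(r for responses in cases for r in set(responses))
    n_responses = sum(counts.values())
    # Only respondents who gave at least one response count as valid cases.
    n_cases = sum(1 for responses in cases if responses)
    return {r: (n, n / n_responses, n / n_cases)
            for r, n in sorted(counts.items())}

survey = [{"bus", "bike"}, {"car"}, {"bus", "car", "bike"}, set()]
table = tabulate_multiple_responses(survey)
```

Shares of responses sum to 1, whereas shares of cases can sum to more than 1 because a respondent may give several responses.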

Additional information

Jann.pdf

Playing with Stata dialogs: An enhanced recent file list

Dankwart Plattner, KfW Frankfurt

Abstract

Stata dialogs are somewhat cumbersome and difficult to implement. However, they are also powerful and helpful, especially when used together with ado-files. With the help of the latter, dialogs can even be sort of dynamic, as the example I present shows. To my knowledge, this has never been done before.

Stata lacks a recent file list. This start-up dialog replicates and enhances one: it presents the user with three lists of previously opened data files (the 5 most recent, the 5 most popular, and all) and allows one to choose a file from them or to select a file never used before, open it, and add it to the file lists. In addition, one can select a log file to use with the opened file and set the memory needed to open the data file (the dialog proposes a suitable value). The user may also enter a description for each file opened in order to have a better overview of her projects. The dialog may be most useful on start-up but can also be called during a current session. It closes open files (and saves them on request) in order to open the selected ones.

The dialog presented shows how to exchange values with ado-files (back and forth). It also shows how one can debug dialog scripts and programs, albeit in a very rough manner only. Some limitations of Stata dialogs are also discussed. Several additions and enhancements to the dialog are possible.

Biplots, revisited

Ulrich Kohler, WZB

Abstract

Biplots display correlations and differences in means and standard deviations of many variables on one graph, together with the values of the plotted variables and approximations of the Euclidean distance between the observations. Biplots are useful for identifying clusters of observations, guiding interpretation of factor analyses, detecting multivariate outliers, and getting an idea about the correlation structure of the data. The talk will demonstrate the merits of biplots and discuss the development of a new version of biplot.ado for Stata 8.2.

Additional information

Kohler.pdf

Generalized partially linear models

Roberto Gutierrez, StataCorp

Abstract

Partially linear models are linear regression models where one component is allowed to vary nonparametrically. Generalized partially linear models extend this case from linear regression to the quasi-likelihood setting of standard GLIMs, thus encompassing a larger class of models, including logistic, Poisson, and gamma regression. Although estimation for these models is possible in official Stata via fractional polynomials, the approach presented here is entirely nonparametric and uses a local-linear smooth to estimate the "nonlinear" component. The Stata command gplm for fitting generalized partially linear models is discussed and demonstrated.
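In symbols, the two model classes can be written as follows. The notation is an assumption of this sketch rather than taken from the talk: x denotes the covariates entering linearly, t the covariate entering nonparametrically, m an unknown smooth function, and G the inverse link function.

```latex
% Partially linear model: linear in x, unspecified smooth function of t
y = x'\beta + m(t) + \varepsilon, \qquad E(\varepsilon \mid x, t) = 0

% Generalized partially linear model: the same index inside an inverse link G
E(y \mid x, t) = G\{ x'\beta + m(t) \}
```

With G the identity, the second line reduces to the first; with G the logistic function, it gives a partially linear logistic regression.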

Additional information

Gutierrez.pdf

Conditional logit versus random coefficient models: An analysis using GLLAMM

Peter Haan, DIW

Abstract

Estimating labor supply functions using a discrete rather than a continuous specification has become increasingly popular in recent years. The main advantage of the discrete choice approach over continuous specifications is that it can model nonlinearities in budget functions. However, the standard discrete choice approach, the conditional logit model, is based on some restrictive assumptions. The econometric literature has suggested more general discrete choice models. However, these less restrictive specifications have been shown to incur very high computational costs, which might obstruct the estimation of confidence intervals of marginal effects or elasticities. It is therefore of particular interest for applied research which approach is more adequate when analyzing discrete choice models.

In my analysis, I estimate different model specifications of a household utility function drawing on micro data of the GSOEP. For the estimation, I employ the Stata program GLLAMM, developed by Sophia Rabe-Hesketh et al. (2001). The idea is to test whether the results derived from the different specifications differ significantly. My findings suggest that, for computational reasons, standard discrete choice models, which are more restrictive in their assumptions regarding error variances, seem to represent the adequate model choice for the analysis of labor supply functions on the basis of the GSOEP.
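For readers unfamiliar with the conditional logit model, the following minimal Python sketch shows how choice probabilities follow from the systematic utilities of the alternatives. It is not part of the analysis described above; the function name and the utility values are hypothetical.

```python
import math

def conditional_logit_probs(utilities):
    """Choice probabilities P_j = exp(V_j) / sum_k exp(V_k)."""
    # Subtracting the maximum keeps exp() numerically stable without changing the result.
    m = max(utilities)
    weights = [math.exp(v - m) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical utilities of three hours-of-work categories for one household.
hours_utilities = [1.0, 1.5, 0.5]
probs = conditional_logit_probs(hours_utilities)

# The odds between two alternatives depend only on their own utilities.
odds_first_vs_second = probs[0] / probs[1]
```

One well-known restriction of this model is visible in the last line: the ratio of two choice probabilities does not depend on the remaining alternatives. Random coefficient specifications relax this at a higher computational cost.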

Additional information

Haan.pdf

Effects of macroeconomic uncertainty on leverage for US non-financial firms

Andreas Stephan and Oleksandr Talavera, European University Viadrina, DIW

Abstract

In this paper, we investigate the link between the optimal level of leverage and macroeconomic uncertainty. Using a model of the firm's value maximization, we show that as macroeconomic uncertainty increases, captured by an increase in the variability of industrial production or inflation, firms decrease their optimal levels of borrowing. We test this prediction on a panel of non-financial US firms drawn from the COMPUSTAT quarterly database covering the period 1990-2000 and find that as macroeconomic uncertainty increases, firms decrease their levels of leverage. Our results are robust to the inclusion of macroeconomic factors, such as the interest rate, inflation, and the index of leading indicators.

Additional information

Talavera.pdf

The potential determinants of German firms' technical efficiency: An application using industry level data

Oleg Badunenko

Abstract

This paper explores the distribution of technical efficiencies across German manufacturing industries and looks at the association of technical efficiency with other economic categories. Aggregating 1995 to 2001 firm-level data yields an unbalanced panel with 241 cross-sections (industries). While the unbalanced nature of the data precludes some time-varying specifications, one can estimate the parameters of a time-invariant fixed-effects model. With only one industry being fully efficient, the rest perform poorly, having an efficiency mode of .32. To account for outliers, 7 industries were dropped from the sample (a 2.9% reduction of the sample). In the smaller sample, the estimated mode of technical efficiency is .78. The distribution of TE is only slightly positively skewed, contrary to the rationale for using a one-sided distribution for the efficiencies. This problem has been noticed by other researchers, and so far the only solution proposed has involved changing the assumed distribution for the technical efficiencies. However, since fixed-effects estimation does not assume a particular distribution for the firm-level inefficiencies, our purified-of-outliers scores of technical efficiencies can be trusted and used as an endogenous variable in further analysis.
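One common way to turn estimated fixed effects into efficiency scores is to measure each unit against the best-performing one, which by construction makes exactly one unit fully efficient. The Python sketch below assumes this normalization and a log-linear production function; the abstract does not state that the paper uses this exact formula, and the industry names and intercepts are hypothetical.

```python
import math

def technical_efficiency(fixed_effects):
    """Map estimated unit intercepts to scores in (0, 1].

    Assumes TE_i = exp(alpha_i - max_j alpha_j), so the unit with the
    largest intercept receives a score of exactly 1.
    """
    best = max(fixed_effects.values())
    return {unit: math.exp(alpha - best) for unit, alpha in fixed_effects.items()}

alphas = {"industry_a": 2.0, "industry_b": 1.5, "industry_c": 0.9}
scores = technical_efficiency(alphas)
```

Because every score is relative to the single best unit, one unusually large intercept pulls all other scores down, which is why dropping outliers can shift the whole distribution of estimated efficiencies.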

Additional information

Badunenko.pdf

Influence of fertility on women's participation in the labor market and their wages — the alternative cost of having a child

Joanna Ciecieląg and Andrzej Tomaszewski, Department of Economics, University of Warsaw

Abstract

Children require not only financial expenditures but also expenditures of time; thus, the number of children and the timing of their births conflict with parents' career aspirations and their quest for satisfying work. The cost of a child includes not only parents' expenditures on goods and services but also the alternative (opportunity) costs of the time devoted to bringing up children, which result from losing part or all of an income because of having the child. This problem applies mainly to women, who, despite social transformations and growing occupational activity, continue to be the main suppliers of time for child care. Maternity restricts a woman's possibilities in the labor market not only by reducing the hours that she can spend at work. She also earns lower wages than a childless woman because of an interrupted career and lower mobility. A prolonged gap in occupational activity also reduces the long-term ability to earn income: it diminishes total net income obtained over the lifetime (fewer years in work). This entails lower savings for a retirement fund. This paper presents empirical estimates of models of women's participation in the labor market that take into account the endogeneity of fertility; these models are subsequently employed as the selection equation in a Heckman model of the influence of having children on mothers' wages. Thus, we attempt to assess the fraction of income lost by a woman who decides to have children. We employ cross-sectional and panel-data models of household budgets in Poland and Germany. The data used in estimation are taken from a database created by the Consortium of Household Panels for European Socio-economic Research (CHER), with the exception of the cross-sectional model for Poland, which is estimated on a broader survey conducted by the Polish Central Statistical Office.

Additional information

Ciecielag_Tomaszewski.pdf

Stata and the newcomer

Svend Juul, Department of Epidemiology and Social Medicine, University of Aarhus, Denmark

Abstract

During a long history with a lot of people involved, Stata has grown and flourished. It seems, however, that the needs of the newcomer don't get the attention they deserve. I switched from SPSS to Stata three years ago, and I am happy now, but I still remember my initial troubles. Also, when teaching Stata to new users, I see them repeatedly encounter the same problems and difficulties.

During the presentation, I will demonstrate some shortcomings of Stata for new users. I will also give constructive suggestions for improvements.

Additional information

Juul.pdf