Report to users
William W. Gould (President, StataCorp)
[minutes by Stephen P. Jenkins]
- Stata is about open development
Bill Gould began with an exposition of the philosophy behind Stata.
His opening remark was: "When Stata is 100% ado-files, then StataCorp can give
up and go home". In Stata 7, 82% of all Stata commands are defined by
ado-files; in Stata 6 the figure was 79%.
Stata is about open development, with users on almost the same footing as
developers. This is reflected in
- design of the software
  - an open language
  - same for StataCorp and users
  - StataCorp must develop the language
  - backwards compatibility essential or else users won't invest in adding new features
- documentation
  - must document programming tools well
- distribution (historically the STB; increasingly the Net)
  - important that distribution does not require approval of StataCorp or
    users' innovation might be curtailed, hence STB was made mostly
    independent of StataCorp
  - Statalist developed (all by itself) and is a very popular medium for
    exchanging programs
  - STB sales have not kept up with Stata sales
  - net commands added to Stata
    - carefully designed so that user sites could exist without StataCorp
      approval
    - search facility needed; currently provided by StataCorp, but if you look
      at the innards of net search (webseek in Stata 6), you'll
      find that the net search engine can specify any provider
    - update carefully separated from net so that users don't
      confuse StataCorp contributions (supported, with lots of certification)
      with those of others
  - SSC–IDEAS (Boston software archive) invented itself and has been very
    successful
  - the overall result is that there are likely to be changes in the STB; more
    details shortly.
- The evolution of Stata, version 6 to 7
During the life of Stata 6 (January 1999 – December 2000), there were 63
ado-file updates, i.e., one every 11 days on average. Over the same period the
number of source code lines grew by 4.7% and the number of lines of ado code
by 7.7%. Certification scripts also grew (do-file lines up 37%).
Various other statistics were also shown.
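The quoted update frequency can be checked with a little date arithmetic. The sketch below assumes that the life of Stata 6 ran from 1 January 1999 to 31 December 2000 inclusive; the minutes give only the months, so the exact end points are an assumption.

```stata
* Assumed period: 1 Jan 1999 to 31 Dec 2000 inclusive (end points are a guess;
* the minutes give only "January 1999 - December 2000")
local days = mdy(12, 31, 2000) - mdy(1, 1, 1999) + 1

* 63 is the number of ado-file updates reported in the talk
display "days in period:            " `days'
display "days per ado-file update:  " %4.1f `days'/63
```

Under that assumption the period is 731 days, or about 11.6 days per update, which agrees with the "one every 11 days" figure if the average was rounded down.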
A lot changed inside Stata when moving from 6 to 7, mainly to
accommodate a range of new features such as SMCL (pronounced "smickel"!).
- Why SMCL? Why is it seen to be so important? [And why not a standard mark-up language like HTML instead?]
  - need to be able to handle real-time output, e.g., an iteration log. (HTML has the
    idea of an end of document.)
  - desire to add viewing features
- SMCL is currently just a text mark-up language but has the potential to do
very much more, and will do so in future, e.g., table formatting, smart
translations, user choice about what is produced (do you want to output
just coefficients and standard errors, say, or be able to add in means?)
The idea is that all information may be in the log, but user can choose
how to display it or parts of it.
- 'secret' translation commands: try the following (currently) rudimentary
commands for translating SMCL to HTML/TeX:
    log texman filename filename.tex [, replace]
    log html filename filename.html [, replace]
- SMCL development as a new window driver (currently Help and Results) —
it can control multiple windows. There is the potential for, e.g., multiple
Results windows or Help windows open.
- SMCL clickable
- SMCL could also be used to facilitate real-time sharing of Stata. It may be
possible in future to send your Results window, via the internet, to another
user's Results window, and similarly the Command window. The implications for
security and firewalls clearly need to be resolved. This has potential for use
in, e.g., Technical Support. What do users think?
- The new file command
Bill described the features of the new file command, which provides the
ability to read and write ASCII text and binary files. [Released via
update the preceding week.] NB: it can be used to write (save) matrices and
later reload them. Example program code was shown.
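The code shown at the meeting is not recorded in these minutes. As a rough illustration of the matrix use mentioned above, here is a minimal sketch that writes a matrix to a binary file with file and reads it back. The matrix A, the filename mymatrix.bin, and the file layout (row count, column count, then the elements row by row, all stored as 8-byte doubles) are illustrative assumptions, not StataCorp's example.

```stata
* Sketch only: A, mymatrix.bin and the file layout are hypothetical choices
matrix A = (1, 2, 3 \ 4, 5, 6)
local r = rowsof(A)
local c = colsof(A)

* Save: dimensions first, then each element as an 8-byte double
tempname fh
file open `fh' using mymatrix.bin, write binary replace
file write `fh' %8z (`r') %8z (`c')
forvalues i = 1/`r' {
    forvalues j = 1/`c' {
        file write `fh' %8z (A[`i', `j'])
    }
}
file close `fh'

* Reload: read the dimensions, then fill a new matrix B element by element
tempname x
file open `fh' using mymatrix.bin, read binary
file read `fh' %8z `x'
local r = `x'
file read `fh' %8z `x'
local c = `x'
matrix B = J(`r', `c', 0)
forvalues i = 1/`r' {
    forvalues j = 1/`c' {
        file read `fh' %8z `x'
        matrix B[`i', `j'] = `x'
    }
}
file close `fh'

matrix list B
```

Storing the dimensions ahead of the elements means the reading code does not need to know the size of the matrix in advance. Row and column names are not saved in this sketch.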
- Sabbatical scheme
Stata is keen to develop this further. The idea is for someone to bring an
idea with them to StataCorp for a 6-month period and to work on it, and to
generally interact with and swap Stata-ish ideas. (Jeroen Weesie has just
completed a very successful stint at StataCorp under the scheme.)
Wishes and grumbles session
addressed to: William W. Gould and Roberto G Gutierrez (StataCorp)
[minutes by Stephen P. Jenkins]
The usual rules applied. All comments and suggestions were noted, with no
cast-iron promises made. But indications were given as to whether something
would be treated as relatively high priority, would be considered, or would
be treated as something less than that!
Here are the notes of the proceedings, minuted in the order in which the
remarks came. Each suggestion is listed first, followed by the response.
- C interface
  A current project and already working inside the StataCorp building; will be
  for sale in about 6 months. (Bill reminded users of issues of cross-platform
  portability: code would need to compile on multiple platforms.)
- update executable is clumsy in Windows; nice if it could be made easier
  Perhaps in the next release.
- could view log file while Stata was running in version 6 but not in version 7
  Wasn't aware of a problem and asked to be sent evidence so that StataCorp
  could act on it.
- ability to combine tables, e.g., a save option so that results could be
  stacked up
  For the next release; Jeroen Weesie is taking the lead on this. Involves a
  major revamp of estimates to become more user-orientated rather than
  programmer-orientated.
- eform option in cloglog (allows hazard ratio interpretation when used for
  discrete time hazard modelling)
  Can do this.
- reshape to preserve variable and value labels
  Hard to do at present. Someone had suggested a route via use of
  characteristics?
- datasets used in manuals to be put on the WWW
  Working on this. Some problems with permissions to be resolved.
- ltable update or, alternatively, estimates of hazard rates from
  sts list and sts graph (not just integrated hazard)
  Not clear what the response was on this. (The question was asked last year
  too, when the response was that it would be looked into.)
- non-linear GLS program (so that could do, e.g., minimum distance estimation)
  No commitment; will pass on to Vince Wiggins!
- some clumsiness in handling log open and close
  Some problems were mentioned (your minute-taker forgets the details); will
  be looked into.
- clogit to have robust and cluster options
  Doesn't it? OK, will look into.
- 'Detonator' graphs
  Once it was explained what these were, many in the audience didn't think
  that these were a priority!
- tmp files being left around after do-file failures
  Asked for documentary evidence in order to look into this.
- make Stata easier to use for non-technical users
  Being looked into already (quite apart from existing StataQuest).
- more regression diagnostics after poisson (very few compared to glm)
  OK
- multiple line styles for xline and yline options in graph
  Maybe. WWG said: "I hope that this question and ones like it will soon never
  have to be asked again". [... an implicit acknowledgement of lack of
  progress on graphics. The audience were all very restrained on this issue!]
- nonconstant option in streg
  After some clarification about why one might want this, said might look
  into.
- clickable links to data sets on WWW
  (see above)
- smoothed hazard rate estimates
  Mostly turned into discussion about what this actually involved. [NB see
  K. Simons's presentation on this topic, with ado-files.]
- any plans for more on Generalized Additive Models
  No plans at present; see Patrick Royston's GAM program in STB.
- program debugger
  No comment.
- more support for editors that are public domain and good (e.g., vi, TextPad)
  No comments made; some mention also of the emacs environment for Stata.
- contour plots, etc.
  "In the short term, the goal is for new graphics to do everything that they
  currently do, but better. In the future, capabilities like this should be
  in-built."