From: David Vaughan <[email protected]>
Subject: st: On choosing a stats package...
Date: Fri, 24 Jan 2003 09:09:59 +1100
Hello
Please excuse my interruption of the statistical flow :-)
I have subscribed to this list because Stata 8 is on my short list of
possible statistical packages for use under OS X 10.2 and I wanted to
get a feel for what the users are finding about the product. I have
used Data Desk (versions 4 to 6) for eight years but must either wait
ages for a new version under OS X, continue using Data Desk in Classic
mode (unhappy) or choose something else like Stata or JMP or the like.
I used Data Desk 5 and 6 extensively for years (and tried SPSS, SYSTAT,
JMP, SAS, and R) and then switched to Stata 7 in 2001. This was not
because I disliked Data Desk. I switched to Stata because I also needed
programming facilities, which were not transparent to me in DD6, and
only promised in DD7 which is currently vaporware. You should know that
DD7 is delayed as the company is focused on consulting products like
the one with the NCI and "Health Data Desk". DD6 is an exceptional EDA and
general statistical package. The EDA tools in DD6 (brushing, slicing,
hotlinks, sliders, etc.) are so good that the company has spun off a
consulting group and is interacting with the National Cancer Institute
for gene array visualization tools (John Wallace, sit up and take
notice). But for programming, Stata 8 is better, better than DD6, and
certainly better than SPSS, which currently has no sensible syntax for
programming in the OS X version from what I can tell. I have a PhD in
Psychology and I did not choose SPSS. SAS is more expensive but a good
modular package, if you use Windows. To decide between Stata 8 and JMP
for OS X is a tough choice if you've already been at DD6. JMP is
programmable and is very EDA oriented--I think it was designed to kill
Data Desk. It is hard to see how a small company like Data Desk can
compete with SAS. Note that SAS bought and then killed StatView to give
JMP an edge. This might explain DD7's delay. In any case, Stata 8 is
not a live EDA package in the way JMP or DD6 are. I've expressed my
point of view to Bill Gould that a combination of Stata 8 and Data Desk
would be an amazing product, but I wonder if the external text-based .ado
files are compatible with the speed requirements for a set of EDA
tools on large data sets. Currently not. Perhaps this is my only concern
with Stata 8. On the other hand with what other product can you express
your opinion to the president of the company and get a response? The
community is second to none. I have not been very disappointed with Stata 7
and 8. If it's any indication I paid the upgrade price out of pocket
and I'm an academic postdoc! There are amazing things you can do at the
command line, with practice. Stata is elegant. For the times I want a
rotating plot or sliders I fire up OS 9--I can still rotate a few
thousand points without any jerkiness. Most of the time I fire up Stata.
I am not a professional scientist or statistician (although the field
was part of my tertiary education) but use a stats package and
modelling tools from time to time in both consulting-related analysis
of large-volume data communications logs, performance and transactions
data and in academic analysis of financial market notions. (Don't worry
about trying to work out the moonlighting.)
My uses may be limited in that I tend to use basic scatterplots,
histograms, box plots and rotating plots to get a feel for the data,
then (after transformations and summary statistics) linear models and a
little cluster analysis in testing hypotheses, or extracting factors
for entry into models for simulations. High-quality scientific (not
"marketing") charts are important in presenting findings because in the
consulting area much of my work is problem management where I often
find I have to disprove current pet theories and later convince people
about real causes and solutions. You could say I am oriented to
language and diagrams rather than to mathematics and formulae.
Currently, Stata 8 has better graphing (publication quality)
capabilities than either DD6 or JMP. Of course DD6 denies even trying
to do publication quality graphs. Their graphs are for inquiry and
publication quality graphs are slow. Again, you will not find rotating
plots in Stata 8, though I think for everything else you mention Stata
8 is for you. But I think it best if you give JMP and Stata 8 a trial
run in OS X with demo versions for a month and see what you think.
That's what I did.
My impression of Stata 8 is that it is considerably more powerful than
I might exploit and also that it may not have the ease of use for a
semi-casual user. I would expect to use menus rather than a language
simply so I do not have to learn by rote a product which is used in
bursts rather than daily or weekly. I want something which does not
unduly limit what I can achieve but is exploitable without first
requiring I become a high priest of the faith. It should give me the
opportunity to extend my statistical learning rather than make it
a prerequisite for every step.
Stata 8 is now menu and dialog driven, if you prefer that.
Would anyone like to comment on Stata 8 in relation to my needs as I
have expressed them? Can you offer comparative experience with other
packages (especially Data Desk, JMP or SPSS)?
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/