I believe that SAS's dominance in the pharmaceutical industry stems from
its advantage in "reporting": the majority of the effort in preparing a
clinical trial report is expended by nonstatistician "clinical SAS
programmers" in preparing tables, listings and figures/graphs (TLFs or TLGs,
in industry parlance).
For an idea of what I'm talking about, take a look at pp. 345--372 from
www.fda.gov/ohrms/dockets/AC/07/briefing/2007-4308b1-02-fda-backgrounder.pdf
, where the U.S. Food and Drug Administration (FDA) has appended a portion
of a drug company's submission (complete with the company's "CONFIDENTIAL"
stamp) onto one of the agency's position papers. A clinical study report
can have thousands of pages of such listings (note the page numbers there).
Recent FDA policies and the advent of industry standards such as CDISC
have reduced the amount of reporting, at least insofar as so-called case
report tabulations (CRTs) go, but the lion's share of effort in assembling
an industrial clinical trial report is still in TLF production.
SAS Institute offers PROC REPORT and PROC TABULATE for such reporting tasks.
The company has enhanced its product with the Output Delivery System (ODS),
which can render these tables and listings in rich text format (RTF) files
for ready incorporation as needed into the clinical study report writer's
Microsoft Word document, or render them in portable document format (PDF)
files for direct incorporation into an electronic dossier sent to government
regulators. (Just look at the thread "table output and -tabout-" that has
run contemporaneously with this thread to see how Stata fares in
analogously rendering the results of a reporting-type activity for an
external software package.)
Once a pharmaceutical company or contract research organization (CRO) has
made the investment in SAS for TLFs, it doesn't make sense to license
another package to do the analysis tasks, especially if the other package
doesn't offer more than (or other than) what SAS/STAT already does. Note
that the pharmaceutical industry does routinely license other packages for
specialized analysis, for example, NONMEM (and S-Plus) for fitting nonlinear
mixed models in population pharmacokinetics. And such supplemental software
packages can include Stata when the analysis calls for it: see
www.fda.gov/ohrms/dockets/AC/05/briefing/2005-4095B2_02_03-Novartis-Zometa-App-2.doc
, Page 5, Section 4.2.4 "Analysis" for an example from a pharmaceutical
company. And see Page 250 of the PDF document cited above for an example
where the FDA, itself, uses a (user-written!) Stata module.
So, disreputable marketing practices by SAS Institute, mindless inertia,
ignorance of alternative packages' existence, and fear of FDA retribution
don't seem to be explanations for SAS's sway in the pharmaceutical industry.
Rather, it seems as if a business decision has been made in the industry
based upon such factors as meeting the most acute needs in putting together
a clinical trial report (reporting versus analysis); reducing support
requirements by keeping the number of software packages to a minimum
(perhaps a lesser consideration); taking advantage, as Gabi Huiber alluded
to in this thread, of the availability of a trained labor pool for
these reporting activities (clinical SAS programming is a specialization in
the industry; SAS Press even puts out a how-to book for the nonstatistician
SAS programmer in the pharmaceutical industry); and preserving the
investment--often made over a period of decades--in writing, validating and
maintaining a library of SAS Macros and custom PROC TABULATE and PROC REPORT
programs.
On the one hand, Stata's ease of use for data management gives it a natural
advantage in getting the data into the form needed for creating TLFs.
Moreover, SAS's capability in handling enormous datasets doesn't lend it any
advantage in industrial clinical trial reports: the data are kept in
database applications such as Oracle's Oracle Clinical and Phase Forward's
Clintrial, from which views are exported as "domain"-specific SAS datasets
that are of readily digestible sizes even with studies considered "large" by
industrial clinical trial standards. And Stata's graphs are much more than
adequate for industrial clinical trial reports and are easier to program
than SAS/Graph-produced figures. On the other hand, Stata doesn't have the
table-and-listing reporting capabilities that SAS has. Correspondingly, the
kind of people in the biological and medical sciences who learn Stata are
more likely to be using it for analysis and don't intend to pursue a
career in "clinical Stata programming" to produce tomes of TLFs for
industrial clinical trial reports.
FDA's Patient Profile Viewer initiative aside, reporting in industrial
clinical trials isn't the same thing as putting a dashboard on the monitor
in the corner office. Nevertheless, it's fun to speculate how the
much-anticipated consolidation in the business intelligence arena that took
place last year has changed the playing field for SAS Institute, not that
the company is taking it lying down. Assuming that the market isn't too
niche for them, could the new mergered-and-acquired business-intelligence
software players begin offering products that prompt pharmaceutical
companies and CROs to re-evaluate their investments in SAS libraries for
clinical trials reporting, make the availability of clinical SAS programmers
less important, and focus comparisons between SAS and alternatives on
analysis capabilities alone?
This is not to say that Stata is really ready to go head-to-head with
SAS/STAT either: to paraphrase what one of the other participants in this
thread mentioned during a wishes and grumbles session a couple of years ago
in Boston, StataCorp could stand to flesh out Stata a bit more on the
biostatistics side.
Joseph Coveney