st: RE: Towards publication quality output

From	"Webb.Bayard" <[email protected]>
To	"'[email protected]'" <[email protected]>
Subject	st: RE: Towards publication quality output
Date	Fri, 22 Aug 2003 08:23:17 -0700
As a new Stata user and a novice LaTeX user I have a related suggestion. I
have hunted down various utilities for producing a variety of output
suitable for inclusion in LaTeX documents, but the whole arrangement is less
than coherent. Combined with my amateur status, the prospect of publishing
the results of my work in Stata becomes overwhelming. A manual or book on
creating LaTeX documents with Stata output would be more than welcome.

My dream would be to have full support for LaTeX output and publication
included in Stata, fully documented and integrated into the menu and help
system. Second choice would be the availability of a book at the Stata
Bookstore that included a CD with a variety of ado and style files, examples
and other useful tidbits.

Am I the only one who feels overwhelmed? Is there wide support for this?
Personally, I would pay up to US$100 for a robust book with lots of goodies
on a CD.

===================
    Bayard Webb



-----Original Message-----
From: Marcello Pagano [mailto:[email protected]]
Sent: Friday, August 22, 2003 5:52 AM
To: [email protected]
Subject: st: Towards publication quality output


For Fred, m.p.
____________________


During the reign of Stata 7, the most common complaint was about the 
absence of publication quality graphics. With Stata 8, there are often 
complaints about publication quality tables and regression outputs, with 
some people suggesting SPSS as a Stata alternative for tables.

I want to suggest to the Stata community extensions to the way Stata 
handles variable labels, as I think such additions can lead to better 
looking tables and other outputs.

I want to suggest 3 new labels as optional additions to Stata's 
-variable label-. Many of these comments may reflect my personal usage, 
but I suspect that they generalize in a way that may be useful 
to all.

-label- may have many uses, only one of which is the production of 
publication quality labels. In the 2 examples below, the variable names 
are retained for historical compatibility reasons and the labels 
provide different types of information:

variable name   type   format    variable label
-------------------------------------------------
fatigue_        float  %9.0g     Sx-fatigue
haq_disa        float  %9.0g     Disability Index

Label 1 tells me that the fatigue variable came from the symptoms 
section and label 2 that the variable is one of many disability indexes. 
I can't use either of these labels for publication, however. So, if I 
copy a table using either varname or label, I have to reformat it in my 
word processor.

What I think is needed is a publication or "table label." This can be 
done, for example, by using variable characteristics:

. char list haq[tlabel]
haq_disa[tlabel]:           HAQ (0-3)

. char list fatigue[tlabel]
fatigue_[tlabel]:           Fatigue (0-10)
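
As a minimal sketch of how such characteristics might be set in the first 
place, the lines below use Stata's -char define- on a toy dataset. The 
variable names and label text follow the listing above; the toy data 
themselves are an assumption made only so that the lines can be run on 
their own.

```stata
* Toy dataset (assumption): two empty variables named as in the post.
* Note that -clear- drops any data currently in memory.
clear
set obs 3
generate float haq_disa = .
generate float fatigue_ = .

* Store a publication-style "table label" in the characteristic [tlabel]
char define haq_disa[tlabel] "HAQ (0-3)"
char define fatigue_[tlabel] "Fatigue (0-10)"

* Show what was stored
char list haq_disa[tlabel]
char list fatigue_[tlabel]
```

Characteristics are saved with the dataset, so a label set this way 
should only need to be defined once per variable.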

I used this in a program (on SSC) called -fsum-:

       Variable |        N     Mean       SD
----------------+---------------------------
 Fatigue (0-10) |     6309     4.33     2.88 
      HAQ (0-3) |     6270     1.08     0.72

The output is publication ready.

These table labels, however, are not of much use for column labels in 
tables, as they are much too long. In Stata, the -list- command has 
an option to display as a column label the text in char 
varname[varname]. With the help of Nick Cox, I wrote a program called 
-corrtab- that will be placed on the SSC Archives on Kit Baum's return 
next week. This program, which displays correlations, is an example of 
the use of tlabels (table labels) and clabels (column labels) together. 
Here are some examples.


No labels
. corrtab haq pain glb age sleep,v(3)

    Pearson correlations

  +-------------------------------------------+
  | Variable   haq_disa   pain_sca   glb_seve |
  |-------------------------------------------|
  | haq_disa     1.000      0.609      0.591  |
  | pain_sca     0.609      1.000      0.664  |
  | glb_seve     0.591      0.664      1.000  |
  |      age     0.123     -0.042      0.028  |
  | sleep_sc     0.411      0.505      0.507  |
  +-------------------------------------------+

clabels in columns
. corrtab haq pain glb age sleep,v(3) c

    Pearson correlations

  +--------------------------------------+
  | Variable     HAQ      Pain    Global |
  |--------------------------------------|
  | haq_disa   1.000     0.609    0.591  |
  | pain_sca   0.609     1.000    0.664  |
  | glb_seve   0.591     0.664    1.000  |
  |      age   0.123    -0.042    0.028  |
  | sleep_sc   0.411     0.505    0.507  |
  +--------------------------------------+

clabels in columns and rows
. corrtab haq pain glb age sleep,v(3) all

    Pearson correlations

  +--------------------------------------+
  | Variable     HAQ      Pain    Global |
  |--------------------------------------|
  |      HAQ   1.000     0.609    0.591  |
  |     Pain   0.609     1.000    0.664  |
  |   Global   0.591     0.664    1.000  |
  |      Age   0.123    -0.042    0.028  |
  |    Sleep   0.411     0.505    0.507  |
  +--------------------------------------+

tlabels in rows, clabels in columns
. corrtab haq pain glb age sleep,v(3) t c

    Pearson correlations

  +------------------------------------------------------+
  |                 Variable     HAQ      Pain    Global |
  |------------------------------------------------------|
  |                HAQ (0-3)   1.000     0.609    0.591  |
  |              Pain (0-10)   0.609     1.000    0.664  |
  |   Global severity (0-10)   0.591     0.664    1.000  |
  |              Age (years)   0.123    -0.042    0.028  |
  | Sleep disturbance (0-10)   0.411     0.505    0.507  |
  +------------------------------------------------------+

Notice that the clabel is short and serves as an identifier rather than 
being very informative.
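
The -list- behaviour mentioned earlier can be tried directly. The sketch 
below assumes that the short column text is stored in char 
varname[varname], which is where -list- looks when its -subvarname- 
option is given. The two observations are made-up toy values, and 
whether -corrtab- reads the same characteristic is not stated here.

```stata
* Toy data (assumption): values are invented purely for illustration.
* Note that -clear- drops any data currently in memory.
clear
input haq_disa pain_sca
1.25 4
0.5  2
end

* Short column labels, stored where -list, subvarname- expects them
char define haq_disa[varname] "HAQ"
char define pain_sca[varname] "Pain"

* Column headers should now read HAQ and Pain instead of the variable names
list, subvarname
```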

A third type of label is a graphics label. It usually differs from other 
labels for a variety of reasons.

So, I suggest the following label extensions, implemented as variable 
characteristics:

tlabel
clabel
glabel

The labels suggested are useful for people who repeatedly work with the 
same set of variables. In addition, they give control over the output. 
They would not replace Stata variable labels, but would be extensions.
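
One way a program could treat these as extensions rather than 
replacements is to fall back from the tlabel to the ordinary variable 
label, and from there to the variable name. The sketch below is only an 
illustration of that idea: -showtlabel- is a hypothetical helper, not 
part of Stata, -fsum- or -corrtab-.

```stata
* Hypothetical helper (assumption): prints each variable with the best
* available label: [tlabel], else the variable label, else the name.
capture program drop showtlabel
program define showtlabel
    syntax varlist
    foreach v of local varlist {
        local lab : char `v'[tlabel]
        if `"`lab'"' == "" local lab : variable label `v'
        if `"`lab'"' == "" local lab "`v'"
        display as text %-12s "`v'" as result `"  `lab'"'
    }
end

* Toy data (assumption) exercising all three cases
clear
set obs 1
generate haq_disa = .
generate age = .
generate id = .
char define haq_disa[tlabel] "HAQ (0-3)"
label variable age "Age in years"
showtlabel haq_disa age id
```

With the toy data above, haq_disa should be shown with its tlabel, age 
with its variable label, and id with its own name.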

If Stata were to adopt extensions like these, it might be an additional 
step toward better output throughout its many programs. I could see them 
being used in various regression commands.

I don't know if Statalisters think this is a good idea, but if they do 
it might be useful to develop a consensus regarding what kind of 
extensions there should be and into which char they should be placed.

Perhaps these comments might stimulate discussion on the issues of 
publication quality output and how it might be accomplished.

Fred Wolfe


Fred Wolfe                                             
National Data Bank for Rheumatic Diseases        
Wichita, Kansas                                    
Tel (316) 263-2125     Fax (316) 263-0761   
[email protected]

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*