Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: combining tables
From: Rebecca Pope <[email protected]>
To: [email protected]
Subject: Re: st: combining tables
Date: Tue, 15 Jan 2013 08:38:14 -0600
Two corrections are due here. The first, and more serious, is an error
of attribution. At the end of Roger's article, he notes in the
Acknowledgements section that the term "resultssets" is due to Nick
Cox. The second error results from similarly poor attention to detail.
-mat summstat = J(6,4,.)- should instead be -mat summstat = J(6,2,.)-.
The 6 x 4 definition was left over from a first attempt. It doesn't
produce an error, but it does necessitate the extra blanks in ctitles()
between "Domestic" and "Foreign" and between "N=52" and "N=22".
However, if you like blank columns between your statistics, no change
is necessary.
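
A minimal sketch of the corrected matrix definition, restricted to the domestic cars for brevity and using only built-in commands, so it runs without -frmttable- or -outreg- installed. The fill loop mirrors the posted code; inspecting the result with -matrix list- is an addition for illustration only.

```stata
sysuse auto, clear
* 6 rows: a heading row plus a statistics row for each of the 3 variables
* 2 columns: mean and standard deviation
matrix summstat = J(6,2,.)
local i = 2
foreach v in length weight headroom {
    quietly summarize `v' if foreign==0
    matrix summstat[`i',1] = r(mean)
    matrix summstat[`i',2] = r(sd)
    local i = `i' + 2
}
* Rows 2, 4 and 6 are filled; rows 1, 3 and 5 stay missing
matrix list summstat
```

With the two-column definition, the empty strings in ctitles() should no longer be needed, i.e. ctitles("", "Domestic", "Foreign" \ "Variable", "N=52", "N=22"). This follows from the correction above but has not been rerun here.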
Rebecca
On Mon, Jan 14, 2013 at 11:03 AM, Rebecca Pope <[email protected]> wrote:
> All,
> This is an addendum to my post from November 6, 2012. When I
> originally asked my question, John Luke Gallup's -outreg- was
> suggested as a possible solution. I was not able to get it to work in
> isolation. However, John recently published an article in The Stata
> Journal (1) about -frmttable-, which he terms the "engine" of
> -outreg-. Combining -frmttable- and -outreg-, I was able to create
> exactly the sort of table I wanted without reducing my data to what
> Roger Newson terms a "resultsset" (2) or taxing my limited RTF
> capabilities. For those Stata users who, like me, work in an MS
> Word-dominated world, I hope that this is helpful.
>
> In my field of medical and health services research, articles often
> include a description of the study population and as often as not the
> statistics are a mix of percentages and means with standard deviations
> "stacked" in columns representing different treatment, diagnosis, etc.
> groups. A good example of such a table that is freely available is
> linked in the post below.
>
> Creating such a table with -frmttable- and -outreg- requires the
> annotate() & asymbol() options to add percent signs and closing
> parentheses in the appropriate places, and the doubles() & "obscure"
> dbldiv() options to place the standard deviations next to
> rather than beneath the means. Different groups may be combined with
> replay(), merge(), and append(). The code, applied rather
> trivially to the auto data, appears below. I'm not claiming this
> example code is the "best" method, especially w.r.t. hardcoding row
> names & number of rows, but it should illustrate the point. I highly
> recommend John's article and the help for -frmttable- for those
> seeking to create a similar table in Word.
>
> *** Code ***
> sysuse auto.dta, clear
> mata: mata clear
> * Recode missing rep78 to 9 so that it is tabulated as its own category
> label def reprec 1 "Poor" 2 "Fair" 3 "Good" 4 "Very Good" ///
>     5 "Excellent" 9 "Missing"
> replace rep78=9 if missing(rep78)
> label val rep78 reprec
> * Percentage in each rep78 category, by foreign (tables col0 and col1)
> qui {
>     tabulate rep78, generate(rec)
>     mat sumstat = J(`r(r)'+1,1,.)
>     mat pcts = 0 \ J(`r(r)',1,1)
>     unab reprec: rec?
>     foreach f in 0 1 {
>         local i=2
>         foreach rec of local reprec {
>             local lbl: var label `rec'
>             local lbl = subinstr("`lbl'","rep78==","",.)
>             label var `rec' " `lbl'"
>             sum `rec' if foreign==`f', meanonly
>             mat sumstat[`i',1] = r(mean)*100
>             local i = `i'+1
>         }
>         matrix rownames sumstat = rep78 `reprec'
>         frmttable, statmat(sumstat) replace ///
>             varlabels sdec(1) annotate(pcts) asymbol("%") merge(col`f')
>     }
> }
> * Mean (s.d.) of the continuous variables, by foreign (tables colc0 and colc1)
> mat dmat=(0,1,0,1)
> mat summstat = J(6,4,.)
> foreach f in 0 1 {
>     local i = 2
>     foreach v in length weight headroom {
>         qui summarize `v' if foreign==`f'
>         tempvar `v'
>         clonevar ``v'' = `v'
>         label var ``v'' " Average `v' (s.d.)"
>         mat summstat[`i',1] = r(mean)
>         mat summstat[`i',2] = r(sd)
>         local i = `i'+2
>     }
>     matrix rownames summstat = length `length' weight `weight' headroom `headroom'
>     matrix pars = (0 \ 1 \ 0 \ 1 \ 0 \ 1 \ 0 \ 1 \ 0 \ 1 \ 0 \ 1)
>     frmttable, statmat(summstat) varlabels doubles(dmat) sdec(1) ///
>         dbldiv(" (") annotate(pars) asymbol(")") merge(colc`f')
> }
> * Append the means below the percentages within each group, then merge the groups
> outreg, replay(col0) append(colc0) store(sum0)
> outreg, replay(col1) append(colc1) store(sum1)
> outreg, replay(sum0) merge(sum1) ///
>     title("Table 1. Sample Descriptive Statistics") ///
>     ctitles("" , "Domestic", "", "Foreign" \ "Variable" , "N=52", "", "N=22")
>
> ***end example***
>
> Citations:
> (1) Gallup, JL. (2012) "A programmer's command to build formatted
> statistical tables". The Stata Journal. 12(4):655-673. Available from:
> http://www.stata-journal.com/article.html?article=sg97_5
>
> (2) Newson, RB. (2012) "From resultssets to resultstables in Stata".
> The Stata Journal. 12(2):191-213. Available from:
> http://www.stata-journal.com/article.html?article=st0254
>
> Rebecca
>
> On Tue, Nov 6, 2012 at 11:20 AM, Rebecca Pope <[email protected]> wrote:
>> Thanks again to Nick, Daniel K., and Roger for Stata suggestions. Here
>> is a summary of my experience with your proposed solutions. It is
>> pretty dense, but I've tried to be comprehensive in case anyone else
>> runs into a similar situation. On balance, every approach had
>> strengths & weaknesses that different users might weigh differently
>> than I do.
>>
>> Nick, you were right about -tabout- coming closest to what I want to
>> accomplish in terms of a single command. Particular strengths in my
>> context are the ability to have it treat the list of supplied
>> variables as -tab1- would, rather than producing cross-tabulations or
>> grouping over all possible combinations of the variables in the list.
>> This was the problem I ran into with -collapse-. The survey extensions
>> are also very helpful and, while not applicable in this particular
>> context, often are in my work. If I'm working with strictly
>> categorical variables, this would be my command of choice. The
>> limitation that I have encountered is that I cannot combine continuous
>> and categorical variables into one table with mean (sd) for continuous
>> & N (%) for categorical. For an example, see Table 1 of Pyne et al.'s
>> "How Bad is Depression?", available at
>> <http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2739035/pdf/hesr0044-1406.pdf>,
>> which is a good representation of the sort of table I routinely make
>> (I can't claim credit for this one, but it is from my group & freely
>> accessible).
>>
>> Next, to Daniel Klein's suggestion about -outreg-. Yes, I know about
>> John Gallup's -outreg-, but I have almost exclusively used Ben Jann's
>> -estout- for regression output (Jann B. (2005) Making regression
>> tables from stored estimates. Stata Journal 5(3): 288–308). I'm not
>> going to make any claims about one's superiority over the other, just
>> the relative benefits of familiarity. However, the suggestion about
>> -outreg-'s extended capabilities prompted me to look a little closer
>> to home and think more creatively about using -estout- since I'm more
>> familiar with it. This worked moderately well. The ability to set
>> numeric formats cell-by-cell is especially nice. Unfortunately, as far
>> as I can tell, you also lose a lot of options when the input is a
>> user-created matrix rather than Stata-stored estimation results. For
>> example, it does not appear that you can add parentheses or place one
>> element under another. All I can say for certain is that I couldn't
>> when I tried. After this immediate project, I'll look closer at
>> -outreg- and potentially write a program that saves the matrix in
>> r(), which if I'm reading the -estout- documentation correctly should
>> restore some of the functionality.
>>
>> Then there is the suite of commands that appear in Roger's article...
>> If I'd known what I was getting into, I would have waited on this past
>> my present project too, but once started, I was too stubborn to stop.
>> The level of control is pretty spectacular and it seems, thus far,
>> that the main limitations of Roger's approach are the user's & RTF
>> capabilities (for those who must work in MS Word, etc). Depending on
>> the user, this could be quite significant. Two days of trial and
>> error and another full day spent on many additional readings and I
>> _finally_ managed to get the RTF output to work correctly. In
>> fairness, Roger's article does disclose this: "RTF tables are less
>> simple to produce..." I have a much greater appreciation now for the
>> commands that do this automatically. In the end though, I wound up
>> with a table that only requires right indents for the data columns,
>> the addition of top & bottom borders, and an empty row (personal
>> preference) above each "gap row" (see -gaprow- in Roger's article).
>> This is substantially less formatting than I did before, so I'm quite
>> happy. Still, I'd say the initial time investment to understand all
>> the intermediate steps was quite high & creating the "resultssets" is
>> going to require project-specific code rather than invoking a single
>> command. Given the time that it takes to paste in values & format
>> tables (especially large ones) in Word, I nevertheless think it is
>> worth it.
>>
>> Once again, thank you to all. I've learned a lot from this exercise.
>>
>> Regards,
>> Rebecca
>>