The experts have spoken. Still, I want to point to a useful command
that was left out of the discussion: one for saving descriptive
statistics (stats for short) for more than one variable in local
macros. I had a problem similar to Ashim's, where I wanted the mean
of several variables, and I dealt with it using SUMMARIZE in a
loop. But I kept wondering why SUMMARIZE saves results only for the
last variable when a varlist is specified. Doesn't StataCorp think
that one might want returned results for all the variables in the
varlist? I found the answer in TABSTAT, which offers a lot more
flexibility. If one wants to save several stats for several
variables in local macros, TABSTAT with the SAVE option is not to be
overlooked. TABSTAT saves its results in matrices, so to get the
most out of it, some background on macro and matrix manipulation is
a big plus.
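For comparison, the SUMMARIZE-in-a-loop approach I mentioned looks
roughly like this. It is only a sketch: it uses the auto dataset that
ships with Stata, and the macro names m_price, m_mpg, and m_weight are
my own choice for illustration.

```stata
sysuse auto, clear
* summarize keeps results only for the last variable, so loop over them
foreach v of varlist price mpg weight {
    * meanonly skips the variance calculation; r(mean) is still returned
    qui summarize `v', meanonly
    loc m_`v' = r(mean)
}
di "mean_price=" `m_price', "mean_mpg=" `m_mpg', "mean_wght=" `m_weight'
```

This works, but it calls SUMMARIZE once per variable and needs a new
line for every additional statistic.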
Let me give some examples using the auto dataset:
Suppose I want to save in locals the mean for price, mpg, and weight; I type in:
sysuse auto
qui tabstat price mpg weight, save // mean is the default
The results are saved in a 1x3 matrix called r(StatTotal). I put each
mean in a local macro and then display them:
**********
* I don't know why I cannot work with r(StatTotal) itself for some
* operations, so I copy it to a regular matrix first
mat mym=r(StatTotal)
loc i=1
foreach v in `: colnames mym' {
    * Here, for example, r(StatTotal) fails to behave as a regular matrix
    loc m_`v'=mym[1,`i']
    loc ++i
}
di "mean_price=" `m_price', "mean_mpg="`m_mpg', "mean_wght=" `m_weight'
********
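If one prefers not to keep a counter, the column can be looked up by
name with the matrix function colnumb(). This is just a variation on
the loop above under the same assumptions (auto dataset, mean as the
only statistic); the matrix name mym and the m_ prefix are arbitrary.

```stata
sysuse auto, clear
qui tabstat price mpg weight, save
* Copy the returned matrix before another r-class command overwrites r()
mat mym = r(StatTotal)
foreach v in `: colnames mym' {
    * colnumb() returns the column number that carries the name `v'
    loc m_`v' = mym[1, colnumb(mym, "`v'")]
}
di "mean_price=" `m_price', "mean_mpg=" `m_mpg', "mean_wght=" `m_weight'
```

Looking columns up by name also guards against picking the wrong
column if the order of the varlist changes.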
Now suppose I want more than one statistic, say mean and max:
qui tabstat price mpg weight, stat(mean max) save
Now TABSTAT returns a 2x3 matrix (rows are the stats and columns are
the variables) from which I can pick out the mean and/or the max of
any variable. First I put each element in a local macro named
stat_variable, so I can easily pick out any stat I want. For
illustration, I then display a few of them. As above, I try to be as
general as possible.
******
mat mstat=r(StatTotal)
loc i=1
foreach s in `: rownames mstat' {
    loc j=1
    foreach v in `: colnames mstat' {
        loc `s'_`v'=mstat[`i',`j']
        loc ++j
    }
    loc ++i
}
di "mean_price=" `mean_price', "mean_mpg="`mean_mpg', "max_wght=" `max_weight'
*****
For verification, I display the contents of the matrix:
mat list mstat
If I want these stats by car type, I type in:
tabstat price mpg weight, by(foreign) stat(mean max) save
Now three 2x3 matrices are returned: one for each car type and one
for the overall (unconditional) mean and max. I can choose a matrix
and pick out any element, row, or column that I want.
To display the names of the matrices, type in: ret list
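Here is a sketch of picking things out of the by-group matrices. As
far as I can tell, TABSTAT returns them as r(Stat1) and r(Stat2), with
the group labels in r(name1) and r(name2); ret list will confirm the
names on your system. The names dom, frn, lab1, lab2, maxrow, and
mpgcol are mine.

```stata
sysuse auto, clear
qui tabstat price mpg weight, by(foreign) stat(mean max) save
* Copy what is needed before another r-class command overwrites r()
mat dom = r(Stat1)
mat frn = r(Stat2)
loc lab1 "`r(name1)'"
loc lab2 "`r(name2)'"
* A single element, located by row and column name
loc mean_price_dom = dom[rownumb(dom, "mean"), colnumb(dom, "price")]
* A whole row (all max values) and a whole column (all mpg stats)
mat maxrow = frn[rownumb(frn, "max"), 1...]
mat mpgcol = frn[1..., colnumb(frn, "mpg")]
di "`lab1' mean price = " `mean_price_dom'
di "`lab2' max row and mpg column:"
mat list maxrow
mat list mpgcol
```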
For more on tabstat, type in: help tabstat
--
P. Wilner Jeanty, Post-Doctoral Researcher
Dept. of Agricultural, Environmental, and Development Economics
The Ohio State University
2120 Fyffe Road
Columbus, Ohio 43210
(614) 292-6382 (Office)