I wasn't paying attention to the implications of the fact that you were
doing this under -by:-, but a very little experimentation and analysis
is enough to clarify what is happening.
Your speculation is the wrong way round. It's not a question of what
-matrix- will "accept"; if there were a problem with that there would be
an error message. It's a question of what is left behind earlier that
-matrix- can get hold of.
Forget Rose Ann Medeiros' -mdesc- and my hack -mmdesc- and focus on
something accessible to all.
. sysuse auto
(1978 Automobile Data)
. bysort foreign : su mpg
------------------------------------------------------------------------
---------
-> foreign = Domestic
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
mpg | 52 19.82692 4.743297 12 34
------------------------------------------------------------------------
-------
-> foreign = Foreign
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
mpg | 22 24.77273 6.611187 14 41
. ret li
scalars:
r(N) = 22
r(sum_w) = 22
r(mean) = 24.77272727272727
r(Var) = 43.70779220779221
r(sd) = 6.611186898567625
r(min) = 14
r(max) = 41
r(sum) = 545
This is typical (although not universal) of how -by:- works with r-class
results. What happens is that
1. -summarize- does the work for Domestic cars, showing the results and
leaving them in r-class results, such as r(N), r(mean), etc.
2. -summarize- does the work for Foreign cars, showing the results and
leaving them in r-class results, such as r(N), r(mean), etc.
What's left behind -- and what you can access therefore -- is just the
_last_ set of r-class results. As it goes through the subsets, Stata
overwrites each set of r-class results with the next. This is generic of
byable commands (in Stata's jargon) written as -recall- commands not
-onecall- commands.
Look at the code for -mdesc- and -mmdesc- and you will see that they are
-recall- commands.
What's to be done? Here are three solutions; there are probably others
too.
1. Rewrite -mdesc- or -mmdesc- to be -onecall- commands. That's a job
for a Stata programmer.
2. Use -statsby-, say with -mdesc-. (If you don't know about -statsby-,
find out anyway.)
3. Call -mmdesc- in a loop and accumulate results. There's more on
technique at
FAQ . . . . . . . . . . Making foreach go through all values of a
variable
8/05 Is there a way to tell Stata to try all values of a
particular variable in a foreach statement without
specifying them?
http://www.stata.com/support/faqs/data/foreach.html
This could be something like
egen which = group(treatment timpoint), label
su which, meanonly
local ng = r(max)
forval i = 1/`ng' {
di "`: label (group) `i''"
mmdesc use1 use2
matrix all = nullmat(all) \ r(table)
}
I am not paying attention to your extra -if- conditions which impart
another level of complexity. You would need to call -mmdesc- for each
distinct set of conditions. After each call to -mmdesc- you would need
to copy the resulting matrix before it was overwritten.
Nick
[email protected]
Shehzad Ali
Thanks, Nick.
For some reason, I am having no joy using the -mat- option. This may be
because I am using -by- option with -mmdes- and -mat- is not willing to
accept (at least in my case) multiple rows that represented different
-by-
categories. I am still getting a single row matrix even for the first
step
before I join two matrices.
I think I will do this in Excel.
On Feb 23 2009, Nick Cox wrote:
>-findit mdesc- reveals that it is
>a program by Rose Anne Medeiros, who once wrote a program called -njc-,
>for which I am not responsible. Here is a matrix version of her
program,
>for which she is not responsible.
>
>*! mmdesc 1.0 NJC 23 February 2009
>* mdesc 1.0 RAM 18 JULY 2008
>* Returns a table with number missing, total, and missing/total
>program mmdesc, rclass byable(recall)
>version 8
>syntax [varlist] [if] [in]
>tempvar touse
>mark `touse' `if' `in'
>local nvars : word count `varlist'
>tempname matrix
>matrix `matrix' = J(`nvars', 3, .)
>local i = 1
>quietly foreach var of local varlist {
> count if missing(`var') & `touse'
> matrix `matrix'[`i', 1] = r(N)
> count if `touse'
> matrix `matrix'[`i', 2] = r(N)
> matrix `matrix'[`i', 3] = `matrix'[`i',1] / `matrix'[`i',2]
> local ++i
>}
>matrix rownames `matrix' = `varlist'
>matrix colnames `matrix' = Missing Total Missing/Total
>matrix list `matrix', noheader
>return matrix table = `matrix'
>end
>
>Once you run this, you can save its results by e.g.
>
>mat A = r(table)
>
>...
>
>mat B = r(table)
>
>...
>
>mat C = A + B
>
>
>Nick
>[email protected]
>
>Shehzad Ali
>
>Another (perhaps) simple question: is it possible to add different
>frequency tables together to form a combined single table in Excel?
>
>I have two group variables (say, var1: type of treatment; and var2:
time
>
>point of observation) and 5 use variables. So an observation would be
>for
>treatment A at time point 1. I want to know the number of missing
values
>
>for each use variable for each treatment group and each time point.
Also
>
>for each use variable there are 'if' conditions, so I can't do the
>frequency count in one go. Here is what I am doing using a user written
>programme -mdesc- or -missing-:
>
>by treatment timpoint: mdesc use1 if (condition1 condition2)
>
>by treatment timpoint: mdesc use2 if (condition3 condition4)
>
>and so on.
>
>
>Now I want to combine all these counts for all timepoints and treatment
>groups to get a single table that I can export to Excel. Is there any
>way I
>could do that?
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/