Jacob,
2008/10/8 Jacob Wegelin <[email protected]>:
Given any dataset of all numeric variables, I want to generate a new
variable called myMean, which is the arithmetic mean (the average) across
all the variables. The program below solves this problem. But surely there
is a one-line command that will perform this task in Stata?
The post http://www.stata.com/statalist/archive/2008-09/msg00597.html
appears to contain a bug, in the sense that the row total computed is not
corrected as in my code below.
This should be done in a general manner:
(1) As in the current dataset, the variables will not necessarily be in a
form like a1 to a100.
(2) The number of variables is arbitrary, so I cannot hard-code the
denominator as when myMeanByHand is computed below.
(3) If any value in a row is missing (.), the mean computed must also be
missing, since then the mean across all variables is not defined. (Thus
egen rowtotal is not the answer.)
I might be misunderstanding you, but wouldn't
qui ds
local varlist `r(varlist)'
egen miss = rowmiss(_all)
egen mean = rowmean(`varlist') if miss==0
be a solution? It calculates the mean across whatever variables are in
the dataset, and leaves it missing for any observation that has a
missing value in any variable.
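Here is a minimal sketch of the same idea on a toy dataset. The variables a, b and c and the names nmiss and myMean are made up for illustration, and -unab- is used in place of -ds- only to show an alternative way of capturing the variable names. The list has to be captured before any helper variable is created, or the helper would be averaged in as well.

```stata
clear
input a b c
1 2 3
4 . 6
7 8 9
end

* Capture the variable names first, before creating any helper
* variables, so that the helpers are not included in the mean.
unab vars : _all

* Count missing values per observation, then compute the row mean
* only where nothing is missing; other rows are left missing.
egen nmiss  = rowmiss(`vars')
egen myMean = rowmean(`vars') if nmiss == 0
drop nmiss

list
```

The second observation has a missing b, so its myMean is left missing rather than being averaged over the two non-missing values.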
/* A related question: The following gives an incorrect answer. What in the
world is it doing? */
egen junk=rowmean(*)
list
Interesting one. I did a test, with the auto data:
sysuse auto
drop make
egen junk = rowmean(_all)
sum junk
drop junk
egen junk2 = rowmean(*)
sum junk2
and also got different results. Setting -trace- on suggests that in the
second case Stata is including a temporary variable (the sort order?)
in the calculation. It does look like a bug to me.
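If the discrepancy really does come from the wildcard being expanded after egen has created its own temporary variable (an assumption based on the trace output, not something I have confirmed), one possible workaround is to expand the wildcard yourself with -unab- before calling egen, so that egen only ever sees explicit variable names. A sketch, with junk2 and the local macro vars as illustrative names:

```stata
sysuse auto, clear
drop make

* Expand the wildcard ourselves, before egen creates anything.
unab vars : *

egen junk  = rowmean(_all)
egen junk2 = rowmean(`vars')

* Count the observations where the two versions disagree.
count if junk != junk2
```

Since `vars' is filled in before junk exists, junk is not part of the second calculation either.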
HTH,
Eva