Jacob,
2008/10/8 Jacob Wegelin <[email protected]>:
> Given any dataset of all numeric variables, I want to generate a new
> variable called myMean, which is the arithmetic mean (the average) across
> all the variables. The program below solves this problem. But surely there
> is a one-line command that will perform this task in Stata?
>
> The post http://www.stata.com/statalist/archive/2008-09/msg00597.html
> appears to contain a bug, in the sense that the row total computed is not
> corrected as in my code below.
>
> This should be done in a general manner:
>
> (1) As in the current dataset, the variables will not necessarily be in a
> form like a1 to a100.
>
> (2) The number of variables is arbitrary, so I cannot hard-code the
> denominator as when myMeanByHand is computed below.
>
> (3) If any value in a row is missing (.), the mean computed must also be
> missing, since then the mean across all variables is not defined. (Thus egen
> rowtotal is not the answer.)
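I might be misunderstanding you, but wouldn't

    * store the variable names before any new variable is created
    qui ds
    local varlist `r(varlist)'
    * count the missing values in each observation
    egen miss = rowmiss(_all)
    egen mean = rowmean(`varlist') if miss==0

be a solution? It calculates the average you want for any set of
variables, but only for observations with no missing values anywhere;
in all other observations mean is left missing. Because the variable
list is stored before miss is created, miss itself does not enter the
average.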
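
To see the behaviour on a small case, here is a sketch with made-up
data. The variables a, b and c and the names nmiss and myMean are only
for illustration and are not taken from your dataset:

    clear
    input a b c
    1 2 3
    4 . 6
    7 8 9
    end
    * grab the variable names before adding anything new
    quietly ds
    local vars `r(varlist)'
    * number of missing values per observation
    egen nmiss = rowmiss(`vars')
    * mean across a, b, c, only where nothing is missing
    egen myMean = rowmean(`vars') if nmiss == 0
    list

myMean should be 2 and 8 in the first and third observations and
missing in the second, which has a missing value for b.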
>
> /* A related question: The following gives an incorrect answer. What in the
> world is it doing? */
> egen junk=rowmean(*)
> list
>
Interesting one. I ran a test with the auto data:

    sysuse auto
    drop make
    egen junk = rowmean(_all)
    sum junk
    drop junk
    egen junk2 = rowmean(*)
    sum junk2

and the two summaries differed for me as well. Setting -trace- on
revealed that in the second case Stata seems to include a temporary
variable (sort order?) in the calculation. It does look like a bug to me.
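
A rough way to check this without reading the trace, offered as an
untested sketch rather than a verified recipe: compute the wildcard
version first, then compare it with a mean built from a variable list
stored beforehand. The names m_star and m_explicit are made up for the
example.

    sysuse auto, clear
    drop make
    * store the names of the real variables first
    quietly ds
    local vars `r(varlist)'
    * wildcard version, as in the question
    egen m_star = rowmean(*)
    * version restricted to the stored variable list
    egen m_explicit = rowmean(`vars')
    * report how many observations differ between the two
    compare m_star m_explicit

If -compare- reports differences, the wildcard is picking up something
beyond the variables returned by -ds-.
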
HTH,
Eva