This is just to spell out -- although it is detailed in the FAQ I
alluded to -- that the row median of three variables p1, p2, p3, and
thus the row mode, is just
. gen rowmode = p1 + p2 + p3 - min(p1, p2, p3) - max(p1, p2, p3)
In words: find the sum and subtract the minimum and maximum. Whatever
remains is the value in the middle. This also works if there are any
ties.
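As a quick check, here is a minimal sketch that applies the same
expression to the five example rows from the original question (the
data are typed in with -input- purely for illustration):

```stata
* Example data copied from the original question
clear
input subject p1 p2 p3
1 2 2 4
2 1 1 3
3 2 2 4
4 1 1 1
5 3 4 4
end

* Sum minus minimum minus maximum leaves the middle value
gen rowmode = p1 + p2 + p3 - min(p1, p2, p3) - max(p1, p2, p3)

list subject p1 p2 p3 rowmode, noobs
```

This gives 2, 1, 2, 1, 4 for subjects 1 to 5, the values asked for.
Note that a missing value in any of p1, p2, p3 makes the sum, and so
the result, missing.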
I still remember seeing this done in code bundled with Larry Hamilton,
Statistics with Stata (1990), and very likely written by Bill Gould, and
thinking "Yes, of course, I should have seen that for myself when I
needed it earlier". So, some list members may enjoy the trick even if
they never want this beast.
The only extra details arise if you wanted to insist that a mode could
only be a value that occurred at least twice, or if you wanted some rule
for determining a mode from three distinct values, but the code is not
difficult in either case.
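For the first of those cases, one possible sketch, assuming that rows
with three distinct values should get a missing mode. The variable
name rowmode2 and the small data set are hypothetical, made up for
illustration:

```stata
* Illustrative data: row 3 has three distinct values
clear
input p1 p2 p3
2 2 4
3 4 4
1 2 3
5 5 5
end

* Return a value only if it occurs at least twice; otherwise missing.
* If p1 matches either other value, p1 is the mode; failing that,
* the only remaining possible pair is p2 and p3.
gen rowmode2 = cond(p1 == p2 | p1 == p3, p1, cond(p2 == p3, p2, .))

list, noobs
```

Here the third row is left missing because no value is repeated.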
Nick Cox
I believe you are correct. There is no obvious row mode function and
-collapse- does not support modes.
However, in your specific problem, there is an easy trick. The mode of 3
might as well be the median of 3. If any value occurs twice or three
times in 3 values, it will also be the median. If each distinct value
occurs just once, the mode might as well be the median too, unless for
some reason you want to insist that a mode is a value that is repeated.
Thus,
. search row median
for a nauseatingly long-winded FAQ on that problem with a pointer to an
efficient user-written -egen- function.
For rows with more than 3 values, do something like this. Example for
variables p1-p4. This is crude code, and will be slow, but I think it
generalises without pain.
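* needs at least 4 observations, as foo is filled in rows 1 to 4
gen foo = .
gen mode = .
qui forval i = 1/`=_N' {
    * copy the 4 values for observation `i' into rows 1 to 4 of foo
    forval j = 1/4 {
        replace foo = p`j'[`i'] in `j'
    }
    * mode of those 4 values; with tied modes -egen, mode()- returns
    * missing unless minmode or maxmode is specified
    egen bar = mode(foo) in 1/4
    replace mode = bar[1] in `i'
    drop bar
}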
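A different route, not the loop above, is the -reshape- idea from the
original question combined with -egen, mode()- under -by:-. This is
only a sketch: the data are made up, the names item and rowmode are
hypothetical, and minmode is one arbitrary way of breaking ties (the
lowest of the tied values is used).

```stata
* Illustrative data with an identifier and variables p1-p4
clear
input subject p1 p2 p3 p4
1 2 2 4 4
2 1 1 3 1
3 5 6 6 7
end

* One observation per subject and item
reshape long p, i(subject) j(item)

* Mode within subject; minmode picks the lowest value when modes tie
bysort subject: egen rowmode = mode(p), minmode

* Back to one observation per subject; rowmode is constant within
* subject, so it is carried along unchanged
reshape wide p, i(subject) j(item)

list, noobs
```

Subject 1 has tied modes 2 and 4, so minmode returns 2 there.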
Nick
Fernando H. Andrade
I am trying to compute the equivalent of a rowmode, a function similar
to rowmean in the egen command. I think the ado for a rowmode function
(which would compute the mode across several variables) does not exist,
does it? As an alternative I thought of reshaping the data and
collapsing it using the mode, but the mode is not a function that
collapse supports.
Is there a way to compute the mode across a set of variables for each
subject?
to be more concrete:
this is the structure of the data
subject p1 p2 p3
1 2 2 4
2 1 1 3
3 2 2 4
4 1 1 1
5 3 4 4
I would like to generate a new variable containing the mode across p1
to p3 for each of the subjects, such that for subject 1 the generated
value would be 2, and 1, 2, 1, 4 for subjects 2, 3, 4, 5 respectively.