Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: egen rowmean, loops and if
From: Nick Cox <[email protected]>
To: [email protected]
Date: Wed, 6 Apr 2011 09:08:51 +0100

I left that open. In many ways the message is that you might well be
better off with the -reshape-d structure. Just as the problem you
posed is awkward with your data structure, so would many other
problems be.
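
One possible shape for a temporary -reshape- is sketched below. It is not code from this thread but an illustration under stated assumptions: only the identifier and the occ_met* variables are reshaped, the means are taken within each original observation (by id and the second index), and the results are merged back onto the untouched wide dataset. The names which, means, and the seed are hypothetical, and -merge 1:1- assumes Stata 11 or later.

```stata
* Sketch only: work on a temporary long copy, then merge the results back.
clear
set obs 10
set seed 2011                      // assumed seed, for reproducibility only
gen id = _n
forval j = 1/3 {
    forval i = 1/8 {
        gen occ_met`j'_`i' = runiform()
    }
}

preserve
keep id occ_met*                   // the other (say, 150) variables stay out of the reshape
reshape long occ_met, i(id) j(which) string
split which, parse(_) destring
rename which1 j                    // first index, 1/3
rename which2 i                    // second index, 1/8

* cond() keeps the condition inside the mean, so every row of a group
* receives the result; "& occ_met < ." guards against missing values
egen mean1_ = mean(cond(occ_met > 0.5 & occ_met < ., occ_met, .)), by(id i)
egen mean2_ = mean(cond(occ_met <= 0.5, occ_met, .)), by(id i)

* reduce to one row per id and i, then return to wide layout
bysort id i: keep if _n == 1
keep id i mean1_ mean2_
reshape wide mean1_ mean2_, i(id) j(i)

tempfile means
save `means'
restore

* original wide data, now with mean1_1-mean1_8 and mean2_1-mean2_8
merge 1:1 id using `means', nogenerate
```

With this layout the full dataset is never reshaped back and forth; only the variables involved in the calculation make the round trip.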
Two footnotes:
1. Note that

    if x >= 1.5

also selects any missing values, because Stata treats numeric missing
as larger than any nonmissing number. Safer code would be

    if x >= 1.5 & x < .

or any equivalent.
2. There is a broader discussion at

SJ-9-1  pr0046.  Speaking Stata: Rowwise.  N. J. Cox.
        (help rowsort, rowranks if installed)
        Q1/09  SJ 9(1):137--157
        Shows how to exploit functions, egen functions, and Mata
        for working rowwise; rowsort and rowranks are introduced.

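Footnote 1 can be checked directly with a toy variable. The data below are made up purely for illustration; the -assert- lines stop with an error if a count is not what is claimed.

```stata
* Toy data: three nonmissing values and one missing value
clear
input x
1
1.5
2
.
end

* The naive condition also counts the missing value
count if x >= 1.5
assert r(N) == 3

* Excluding missings explicitly
count if x >= 1.5 & x < .
assert r(N) == 2

* An equivalent guard using missing()
count if x >= 1.5 & !missing(x)
assert r(N) == 2
```

The guard x < . also excludes the extended missing values .a to .z, since system missing . is the smallest of the missing values.
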
On Wed, Apr 6, 2011 at 3:33 AM, Thomas Speidel <[email protected]> wrote:
> Thanks Nick. I think I will choose the reshape option: much more appealing.
> You mentioned the word "temporarily" earlier. I am aware of
> preserve/restore. In general, what advice would you give (I have some 150
> variables): reshape the whole dataset back and forth?
>
>> Nick Cox <[email protected]> wrote (April-05-11 18:09):
>>
>> Here is example code for a -reshape- solution.
>>
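>> clear
>> set obs 10
>> forval j = 1/3 {
>>     forval i = 1/8 {
>>         gen occ_met`j'_`i' = runiform()
>>     }
>> }
>>
>> gen id = _n
>> * _j is a string such as "2_5": first index, underscore, second index
>> reshape long occ_met, i(id) string
>> split _j, parse(_) destring
>> * note: _j1 holds the first index (1/3) and _j2 the second (1/8),
>> * so after these renames j runs over 1/8
>> rename _j1 i
>> rename _j2 j
>> * by(j) pools over all observations; use by(id j) for means
>> * within each original row
>> egen mean1 = mean(occ_met) if occ_met > 0.5 , by(j)
>> egen mean2 = mean(occ_met) if occ_met <= 0.5 , by(j)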
>>
>> Nick Cox <[email protected]> wrote (April-05-11 17:59):
>>
>> Here is example code for a long-winded solution:
>>
>> clear
>> set obs 10
>> forval j = 1/3 {
>>     forval i = 1/8 {
>>         gen occ_met`j'_`i' = runiform()
>>     }
>> }
>> ds
>>
>> forval i = 1/8 {
>>     * running sums (mean*) and counts (n*), initialised at zero
>>     gen mean1_`i' = 0
>>     gen mean2_`i' = 0
>>     gen n1_`i' = 0
>>     gen n2_`i' = 0
>>     qui forval j = 1/3 {
>>         replace mean1_`i' = mean1_`i' + occ_met`j'_`i' if occ_met`j'_`i' > 0.5
>>         replace n1_`i' = n1_`i' + (occ_met`j'_`i' > 0.5)
>>         replace mean2_`i' = mean2_`i' + occ_met`j'_`i' if occ_met`j'_`i' <= 0.5
>>         replace n2_`i' = n2_`i' + (occ_met`j'_`i' <= 0.5)
>>     }
>>     * sum / count; a count of zero yields a missing mean
>>     replace mean1_`i' = mean1_`i' / n1_`i'
>>     replace mean2_`i' = mean2_`i' / n2_`i'
>> }
>>
>>
>> Nick Cox <[email protected]> wrote (April-05-11 17:37):
>>
>> This would be a lot easier if you -reshape-d, even temporarily.
>>
>> Otherwise, with this data structure: -egen, rowmean()- is a
>> non-starter here and I think you need to work at a lower level,
>> building up sums and counts and deriving means.
>>
>> A side-detail is that -foreach- is not needed here: use -forval- instead.
>>