Re: st: Egen to sum across rows (with an if across rows)
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: Egen to sum across rows (with an if across rows)
Date: Mon, 29 Apr 2013 09:35:25 +0100

Replacing

    replace wardtime`cat' = wardtime`cat' + t`j' if cat`j' == "`cat'"

with

    replace wardtime`cat' = wardtime`cat' + t`j' if t`j' < . & cat`j' == "`cat'"

or

    replace wardtime`cat' = wardtime`cat' + max(t`j', 0) if cat`j' == "`cat'"

are other fixes.
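
For reference, here is a minimal sketch of the whole loop with the first fix applied. Adding a missing t`j' would turn the running total missing; the extra condition t`j' < . skips such values, so the data need not be recoded first. The toy dataset (3 time/category pairs rather than 157) and the local macro n are hypothetical and only there to make the sketch runnable as a do-file; the variable names t1, t2, ... and cat1, cat2, ... follow the thread.

```stata
* Hypothetical toy data: 3 time/category pairs instead of 157
clear
input t1 str1 cat1 t2 str1 cat2 t3 str1 cat3
2 "A" 3 "B" 1 "A"
. "A" 4 "C" 5 "C"
1 "D" . ""  2 "D"
end

* Number of time/category pairs (157 in the thread); n is an assumed name
local n 3

foreach cat in A B C D {
    gen wardtime`cat' = 0
    quietly forvalues j = 1/`n' {
        * Add t`j' only when it is non-missing and its category matches
        replace wardtime`cat' = wardtime`cat' + t`j' if t`j' < . & cat`j' == "`cat'"
    }
}

list wardtimeA-wardtimeD
```

With this version an observation whose times are all missing for a category ends up with a total of 0 rather than missing, which is the same convention as recoding missing times to 0 beforehand.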
Nick
[email protected]
On 29 April 2013 09:29, Lucy GELDER <[email protected]> wrote:
> Many thanks Nick. The code you suggested worked after I tweaked the data to accommodate the missing values I had in both the time and category columns. I did this by:
>
> forvalues i = 1/157 {
>     replace t`i' = 0 if t`i' == .
>     replace cat`i' = "Z" if cat`i' == ""
> }
>
> Lucy
> ________________________________________
> From: [email protected] [[email protected]] On Behalf Of Nick Cox [[email protected]]
> Sent: Monday, 29 April 2013 3:38 PM
> To: [email protected]
> Subject: Re: st: Egen to sum across rows (with an if across rows)
>
> You are correct. Wildcards cannot be used in -if- qualifiers (or -if-
> commands for that matter).
>
> Your syntax needs fixing in other ways. You use -cat- in the -foreach-
> statement but don't refer to it in the loop. That's not illegal in
> itself, but the code couldn't do what you want.
>
> You state different variable names in different places, but the spirit
> of what you want seems clear.
>
> Try this:
>
> foreach cat in A B C D {
>
> gen wardtime`cat' = 0
>
> qui forval j = 1/157 {
> replace wardtime`cat' = wardtime`cat' + t`j' if cat`j' == "`cat'"
> }
>
> }
>
> That's assuming variables -t1-t157- and -cat1-cat157-.
>
> There is a general review of technique in this territory in
>
> SJ-9-1 pr0046 . . . . . . . . . . . . . . . . . . . Speaking Stata: Rowwise
> (help rowsort, rowranks if installed) . . . . . . . . . . . N. J. Cox
> Q1/09 SJ 9(1):137--157
> shows how to exploit functions, egen functions, and Mata
> for working rowwise; rowsort and rowranks are introduced
>
> .pdf at http://www.stata-journal.com/sjpdf.html?articlenum=pr0046
>
> Nick
> [email protected]
>
>
> On 29 April 2013 08:18, Lucy GELDER <[email protected]> wrote:
>
>> I have a dataset which includes 157 columns of times in hours (t1-t157) and 157 columns of categories, with values A - D (cat1-cat157).
>>
>> I want to sum across the columns by category, so that I end up with four columns timeA-timeD containing the total times for each category.
>>
>> I have tried:
>>
>> foreach cat in A B C D{
>>
>> egen wardtimeA= rowtotal(wardtime*) if (wardcat*)=="A"
>>
>> }
>>
>> and get the error "wardcat* invalid name". I presume this means I can't use the wildcard in the if statement?
>>
>> Does anyone know of a way I can do this without reshaping the data to long? This is a very large dataset and I would prefer to keep it wide if possible.