Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: summarize by different levels/groups with -egen- ?
From: Joerg Luedicke <[email protected]>
To: [email protected]
Subject: Re: st: RE: summarize by different levels/groups with -egen- ?
Date: Fri, 11 Jan 2013 12:25:00 -0500
Consider the following:
// Data
clear
input str2 Class str1 Pathogen
A1 H
A1 S
A1 T
A2 S
A2 K
A3 H
A3 D
B1 H
B1 S
end
// Flagging classes with at least one H
bys Class: egen pat2=max(Pathogen=="H")
// To analyze that at class level
bys Class: gen tag=_n==1
keep if tag
Joerg
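The same idea can be carried up to the school level that the original question asks about. What follows is only a sketch, assuming the three-column layout (School, Class, Pathogen) from that question; the variable names hasH, classtag, nClassesH, nClasses and schooltag are made up for illustration.

```stata
// Example data, copied from the original question
clear
input str1 School str2 Class str1 Pathogen
A A1 H
A A1 T
A A1 H
A A2 S
A A2 H
A A3 K
A A3 I
B B1 S
B B1 T
B B2 H
end

// 1 on every row of a class in which H occurs at least once, else 0
bysort School Class: egen hasH = max(Pathogen == "H")

// Tag exactly one row per class so that each class is counted once
egen classtag = tag(School Class)

// Per school: number of classes with H and number of classes visited
bysort School: egen nClassesH = total(classtag * hasH)
bysort School: egen nClasses  = total(classtag)

// Show one line per school
egen schooltag = tag(School)
list School nClassesH nClasses if schooltag, noobs
```

With these data, school A should show 2 of 3 classes with H and school B 1 of 2. Multiplying the class tag by the flag keeps all observations in memory, in contrast to -keep if tag-, which may be preferable if the child-level rows are needed later.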
On Fri, Jan 11, 2013 at 11:39 AM, Patricia Biedermann
<[email protected]> wrote:
> Hello,
> Thank you Lovisa & Nick.
> I've tried your commands, but they don't seem to work the way I want
> them to (pathogen is a string variable).
>
> The issue is that when I create the dummy variable at the end (as
> described by Lovisa), I get a "1" for each H in a class. When I
> then summarize it, I get the total number of H. But I want the
> total number of classes that are affected by H (regardless of how
> many children in the class were affected by the pathogen).
>
> e.g.
> Class Pathogen
> A1 H
> A1 S
> A1 T
> A2 S
> A2 K
> A3 H
> A3 D
> B1 H
> B1 S
>
> Finally --> 3 (out of 4) classes are affected by "H". (I don't care
> about how many individuals in one class!).
>
> Maybe I have to think about it and approach it differently.
> Cheers.
>
> On Fri, Jan 11, 2013 at 1:46 PM, Nick Cox <[email protected]> wrote:
>> You don't need a dummy or indicator variable. Assuming that -pathogen-
>> is a string variable,
>>
>> ... mean(pathogen == "H")
>>
>> will work fine as the -mean()- function of -egen- takes expressions.
>> If it's a numeric variable, the same principle applies, but you need a
>> different expression.
>>
>> Nick
>>
>> On Fri, Jan 11, 2013 at 12:01 PM, Lovisa Persson
>> <[email protected]> wrote:
>>
>>> First create a dummy variable for each pathogen, pathogeni.
>>> Then generate the mean for each class and each pathogen(i) by writing:
>>>
>>> egen meanpathogeni=mean(pathogeni), by(class)
>>>
>>> Every class that has a certain pathogen in it will now have a value of
>>> meanpathogeni higher than zero, and every class that does not have that
>>> pathogen in it will have a value of zero.
>>> The value will be the same for all observations within a class: it is the
>>> proportion of observations in that class with the pathogen.
>>>
>>> So now you generate a new dummy variable that equals 1 if the value of
>>> meanpathogeni is higher than zero.
>>> Each class will then have the same value on all its observations, 1 or 0,
>>> depending on whether the class had at least one observation of this
>>> particular pathogen in it.
>>
>> Patricia Biedermann
>>
>>> I want to summarize following:
>>>
>>> School Class Pathogen
>>> A A1 H
>>> A A1 T
>>> A A1 H
>>> A A2 S
>>> A A2 H
>>> A A3 K
>>> A A3 I
>>> B B1 S
>>> B B1 T
>>> B B2 H
>>>
>>> I've visited different classes in different schools. In each class I checked
>>> if the children were infected with some kind of pathogen.
>>> - I found, e.g., that in class A1 two children were infected with
>>> pathogen H.
>>> - Now I want to record only that pathogen H was found in class A1,
>>> WITHOUT the number of times it was found (2 in this case).
>>> Basically, "Was pathogen H found in class A1?" = yes or no. Finally, the
>>> information should be presented at school level ("In how many classes in
>>> school A was pathogen H found?").
>>>
>>> So far I have tried egen and bysort with _n==_N. I also created dummy
>>> variables for each pathogen. It never worked out the right way.
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/