Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: summarize by different levels/groups with -egen- ?
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: RE: summarize by different levels/groups with -egen- ?
Date: Fri, 11 Jan 2013 12:46:17 +0000
You don't need a dummy or indicator variable. Assuming that -pathogen-
is a string variable,
... mean(pathogen == "H")
will work fine as the -mean()- function of -egen- takes expressions.
If it's a numeric variable, the same principle applies, but you need a
different expression.
Nick
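As a minimal sketch of the suggestion above, the following enters the example data from the original question and applies -egen-'s -mean()- function to an expression directly. The variable names (school, class, pathogen) and the new variable names (propH, anyH) are assumptions chosen for illustration, and pathogen is assumed to be a string variable.

```stata
* Example data from the original question; variable names are assumed
clear
input str1 school str2 class str1 pathogen
A A1 H
A A1 T
A A1 H
A A2 S
A A2 H
A A3 K
A A3 I
B B1 S
B B1 T
B B2 H
end

* Proportion of observations in each class with pathogen "H";
* the expression is 1 when true and 0 when false, so no dummy is needed
egen propH = mean(pathogen == "H"), by(class)

* 1 if "H" was found at least once in the class, 0 otherwise
generate byte anyH = propH > 0

list school class pathogen propH anyH, sepby(class)
```

If pathogen were numeric with value labels, the expression would compare against the numeric code instead, for example pathogen == 1 (the code 1 is hypothetical).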
On Fri, Jan 11, 2013 at 12:01 PM, Lovisa Persson
<[email protected]> wrote:
> First create a dummy variable for each pathogen, pathogeni.
> Then generate the mean for each class and each pathogen(i) by writing:
>
> egen meanpathogeni=mean(pathogeni), by(class)
>
> Every class that has a given pathogen in it will now have a value of
> meanpathogeni higher than zero, and every class that does not have that
> pathogen in it will have a value of zero.
> The value will be the same for all observations within a class; it is the
> proportion of observations in that class with the pathogen.
>
> So now you generate a new dummy variable that equals 1 if the value of
> meanpathogeni is higher than zero.
> Now each class will have the same observation value which will be 1 or 0
> depending on whether this class had at least one observation of this
> particular pathogen in it.
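The dummy-variable route described above might look like the following sketch. It uses -tabulate- with the -generate()- option to create one indicator per pathogen; the stub p_ and the names mean_* and found_* are assumptions for illustration, as are the variable names in the example data.

```stata
* Example data from the original question; variable names are assumed
clear
input str1 school str2 class str1 pathogen
A A1 H
A A1 T
A A1 H
A A2 S
A A2 H
A A3 K
A A3 I
B B1 S
B B1 T
B B2 H
end

* One indicator per pathogen: p_1, p_2, ... in sorted order of pathogen
tabulate pathogen, generate(p_)

foreach v of varlist p_* {
    * class-level mean of the indicator (a proportion between 0 and 1)
    egen mean_`v' = mean(`v'), by(class)
    * 1 if the pathogen occurs at least once in the class, 0 otherwise
    generate byte found_`v' = mean_`v' > 0
}
```

The variable labels that -tabulate- attaches to p_1, p_2, ... show which pathogen each indicator stands for.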
Patricia Biedermann
> I want to summarize the following:
>
> School  Class  Pathogen
> A       A1     H
> A       A1     T
> A       A1     H
> A       A2     S
> A       A2     H
> A       A3     K
> A       A3     I
> B       B1     S
> B       B1     T
> B       B2     H
>
> I've visited different classes in different schools. In each class I checked
> if the children were infected with some kind of pathogen.
> - I found e.g that in class A1 two children were infected with
> pathogen H.
> - Now, I want to record only that pathogen H was found in class A1,
> WITHOUT the number of times it was found (2 in this case);
> basically "Was pathogen H found in class A1?" = yes or no. Finally, the
> information should be presented at school level ("In how many classes in
> school A was pathogen H found?").
>
> So far I have tried -egen-, -bysort- with _n == _N, and other commands. I
> also created dummy variables for each pathogen. It never worked out the
> right way.
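For the school-level count asked about here, one possible sketch combines a class-level indicator with -egen-'s -tag()- function so that each class is counted only once. The names anyH, oneperclass, and nclassH are hypothetical, and the variable names in the example data are assumed.

```stata
* Example data from the original question; variable names are assumed
clear
input str1 school str2 class str1 pathogen
A A1 H
A A1 T
A A1 H
A A2 S
A A2 H
A A3 K
A A3 I
B B1 S
B B1 T
B B2 H
end

* 1 if pathogen "H" occurs anywhere in the class, 0 otherwise
egen byte anyH = max(pathogen == "H"), by(school class)

* Tag exactly one observation per school-class combination
egen byte oneperclass = tag(school class)

* Number of classes per school in which "H" was found
egen nclassH = total(oneperclass * anyH), by(school)

* One row per school: A has 2 such classes (A1, A2), B has 1 (B2)
tabulate school if oneperclass & anyH
```

Tagging one observation per class avoids counting class A1 twice, even though pathogen H appears there in two observations.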