Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: creating variables using 'by' for subsets of records
From: John Westbury <[email protected]>
To: [email protected]
Subject: Re: st: RE: creating variables using 'by' for subsets of records
Date: Tue, 23 Feb 2010 20:45:15 -0600
Thanks much for the feedback. Here is an example of what the data I am
using look like:

Individual  region  Indicator
A           1       0
B           1       1
C           2       1
D           2       1

I have encoded the regions, and the ratio I am attempting to create would be
intuitively expressed as:
by region: (count of Indicator==1)/(count of individuals).
I am trying to create a variable for the numerator by region (call it y) and
denominator by region (call it x) and then use gen ratio=y/x.
I can create a variable (x) for the denominator using:
bys region: egen x=count(Indicator)
I am having trouble creating a variable for the numerator. I have attempted
to use
bys region: egen y=count if Indicator==1
but receive an invalid syntax error. If someone has a suggestion on how to
specify a variable for a count of Indicator==1 by region, I would be very
appreciative.
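
A possible way to build the numerator, sketched on the four example rows above and assuming Indicator is coded 0/1: egen's count() expects an expression in parentheses, which is the likely reason "egen y=count if ..." is rejected. Either keep count() and restrict it with an if qualifier, or sum a true/false expression with total(). The name y_if is only illustrative.

```stata
* Example data from the post
clear
input str1 Individual byte region byte Indicator
"A" 1 0
"B" 1 1
"C" 2 1
"D" 2 1
end

* Denominator: number of nonmissing Indicator values per region
bysort region: egen x = count(Indicator)

* Numerator, option 1: count() with an if qualifier.
* y_if is left missing in rows where Indicator is not 1.
bysort region: egen y_if = count(Indicator) if Indicator == 1

* Numerator, option 2: total() of a true/false expression.
* Indicator == 1 evaluates to 1 or 0, so the sum is the count,
* and every row in the region receives the value.
bysort region: egen y = total(Indicator == 1)

gen ratio = y / x
list, sepby(region)
```

Option 2 is usually the more convenient one here, because gen ratio = y / x then works in every row of the region rather than only in the rows where Indicator is 1.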
As an aside, is there a way to specify the variable y/x without specifying y
and x?
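
On the aside: the share of Indicator==1 within a region is the mean of a 0/1 expression, so one egen call may be enough. This is a minimal sketch on the same four example rows, again assuming a 0/1 Indicator; the if qualifier is an added assumption so that rows with missing Indicator stay out of the denominator, as they do with count(Indicator).

```stata
* Example data from the post
clear
input str1 Individual byte region byte Indicator
"A" 1 0
"B" 1 1
"C" 2 1
"D" 2 1
end

* Share of individuals with Indicator == 1 in each region.
* Rows with missing Indicator are excluded and receive a missing ratio.
bysort region: egen ratio = mean(Indicator == 1) if !missing(Indicator)

list, sepby(region)
```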
thanks
John
On Tue, Feb 23, 2010 at 2:29 PM, Martin Weiss <[email protected]> wrote:
>
> In the absence of example data, it is hard to give you advice. Look at this
> calculation of regional unemployment rates:
>
>
> *******
> clear*
>
> //10 regions
> set obs 10
> gen byte region=_n
>
> //50 indiv per region
> expand 50
> bys region: gen byte id=_n
> gen byte unemployed=runiform()>.9
>
> bys region: gen number=_N
> by region: egen numofunempl=total(unemployed)
>
> gen unemprate=numofunempl/number
> *******
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of John Westbury
> Sent: Tuesday, 23 February 2010 20:55
> To: [email protected]
> Subject: st: creating variables using 'by' for subsets of records
>
> Hello,
>
> I have records for individuals by geographic region and wish to aggregate
> the records for individuals to records for geographic regions. I believe I
> should create variables for those regions using 'by'. Ex: by Region gen x =
> argument for variable. I am having difficulty with arguments for variable
> x. For example I wish to create a region variable that expresses a ratio of
> count of indicator values for individuals in a region to a count of
> individuals in the region and am unsure how to code this.
>
> thanks
>
> John
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/