Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Programming Repetition for categories
From: Tim <[email protected]>
To: [email protected]
Subject: Re: st: Programming Repetition for categories
Date: Thu, 03 Oct 2013 13:06:30 +1000
I would probably make a variable for the categories, then use -by-:
. egen cat_SHARE_DEP = cut(SHARE_DEP), at(0, 10000000, 20000000, 50000000,
100000000, 250000000, 1000000000, 10000000000) label
. foreach avg in Q BRANCH A TYPE P MEMB_TOT {
. bys cat_SHARE_DEP: egen avg_`avg' = mean(`avg')
. }
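To check that the bins come out as intended, a sketch along these lines may help. The four SHARE_DEP values are invented for illustration, and the cutpoints are ascending breaks matching the seven categories in the question. As far as I understand -egen, cut()-, each interval includes its left end and excludes its right end, and anything at or above the last cutpoint is set to missing. In a do-file, three slashes at the end of a line continue the command onto the next line.

```stata
* Toy data: SHARE_DEP values invented for illustration only
clear
input double SHARE_DEP
5000000
10000000
45000000
20000000000
end

* Cutpoints must be in ascending order
egen cat_SHARE_DEP = cut(SHARE_DEP), ///
    at(0, 10000000, 20000000, 50000000, 100000000, 250000000, 1000000000, 10000000000) ///
    label

* 10000000 should land in the second bin; 20000000000 is at or above the
* last cutpoint, so it should come out missing
list SHARE_DEP cat_SHARE_DEP
tabulate cat_SHARE_DEP, missing
```

If any real observation can exceed the last cutpoint, raising that cutpoint keeps it in the top category instead of dropping it to missing.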
Then, if you really want separate variables for the different means, you
can separate them later, but it's probably not necessary. It will
probably be easier to work with -by- and/or -if- to select the category
means you want.
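As a rough sketch of both routes, with made-up data and only three bins to keep it short: -tabstat- and -summarize ... if- pick out category means directly, and -separate- splits one variable into one variable per category if that is really wanted. The variable values here are assumptions for illustration, not the poster's data.

```stata
* Toy data: values invented for illustration only
clear
input double SHARE_DEP Q
5000000  1
8000000  3
15000000 2
30000000 4
end

egen cat_SHARE_DEP = cut(SHARE_DEP), at(0, 10000000, 20000000, 50000000) label
bysort cat_SHARE_DEP: egen avg_Q = mean(Q)

* One line of output per category
tabstat Q, by(cat_SHARE_DEP) statistics(mean n)

* Pick out one category with -if-; the lowest bin is coded 0
summarize avg_Q if cat_SHARE_DEP == 0

* Only if separate variables are really needed
separate avg_Q, by(cat_SHARE_DEP)
describe avg_Q*
```

With the -label- option the category variable holds integer codes starting at 0, so -separate- should name the new variables by appending those codes to avg_Q.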
As for your code, -if SHARE_DEP ...- at the start of a line is the -if-
command, not the -if- qualifier. It is evaluated once, using the value of
SHARE_DEP in the first observation, so only one of your if blocks will
ever run. When it runs, it operates on the whole dataset, because you
have not used a subsetting -if- in the -egen- command.
See [U] 11.1.3 if exp and [P] if
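A small sketch of the difference, using invented data (the values and the variable name avg_all_Q are assumptions for illustration):

```stata
* Toy data: values invented for illustration only
clear
input double SHARE_DEP Q
5000000  1
15000000 2
8000000  3
30000000 4
end

* -if- command: evaluated once, as if written SHARE_DEP[1] < 10000000.
* Here that is true, so the block runs, but -egen- uses all observations.
if SHARE_DEP < 10000000 {
    egen avg_all_Q = mean(Q)
}

* -if- qualifier: evaluated for each observation
egen avg010_Q = mean(Q) if SHARE_DEP < 10000000

list SHARE_DEP Q avg_all_Q avg010_Q
```

Here avg_all_Q is 2.5 in every observation (the mean of all four Q values), while avg010_Q is 2 in the two observations below 10 million and missing elsewhere.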
Tim BP
On 3/10/2013 12:45, Andrew Hovel wrote:
I am trying to program repeated calculation of means for a set of
variables categorized in bins. I am using Stata 12 for Windows.
I am new to Stata programming, so I'm guessing there is a better way
to do this than I am attempting, but here goes:
I am calculating means of six variables (Q BRANCH A TYPE P MEMB_TOT)
in my data across 7 different categories of another variable,
SHARE_DEP (represents a value of total shares and deposits held by
credit unions)
The categories I use are 0-10million, 10-20million, 20-50 million,
50-100million, 100-250million, 250m-1billion, and >1billion
The code I am using is:
***average <10m
if SHARE_DEP < 10000000 {
foreach average in Q BRANCH A TYPE P MEMB_TOT {
egen avg010_`average' = mean(`average')
}
}
***average 20-50m
if SHARE_DEP >= 20000000 & SHARE_DEP < 50000000 {
foreach avg in Q BRANCH A TYPE P MEMB_TOT {
egen avg2050_`avg' = mean(`avg')
}
}
***
and so forth through those >1billion.
The problem here is that the means generated for the first step are
equivalent to the whole population mean, not the mean for observations
where SHARE_DEP < 10000000. (I checked this separately using -sum- for
the variables after dropping all observations where SHARE_DEP >
10000000.)
The subsequent if programs don't even execute.
Any help or suggestions for resolving this would be great.
-AH
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/