As I understand it, your structure is
day year stock return
with one value of -return- for each -day- and
-stock-. -day- is naturally nested within -year-.
If so, the number of days with -return- less than 5 is
. bysort stock year : gen low_days = sum(return < 5)
. by stock year : replace low_days = low_days[_N]
and the total number of days for each combination
is
. by stock year : gen no_days = _N
and so
. gen prop_low_days = low_days / no_days
except that we should be able to telescope this to
. bysort stock year :
      gen prop_low_days = sum(return < 5)
. by stock year :
      replace prop_low_days = prop_low_days[_N] / _N
Note my continuation lines. Also, I cut down on
the number of variables, so the name -prop_low_days-
doesn't match the contents until the second command
has run.
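To see the telescoped version at work, here is a minimal sketch on a made-up dataset (the eight observations are invented purely for illustration, and -(day)- in the -bysort- prefix is my addition so that days are in order within each group). In a do-file each command goes on one line, or is continued with -///-.

```stata
clear
input stock year day return
1 2001 1 3
1 2001 2 7
1 2001 3 4
1 2001 4 9
2 2001 1 6
2 2001 2 2
2 2001 3 1
2 2001 4 4
end

* running count of days with return < 5 within each stock-year
bysort stock year (day) : gen prop_low_days = sum(return < 5)

* the last running count is the group total; divide by the group size
by stock year : replace prop_low_days = prop_low_days[_N] / _N

list, sepby(stock year)
```

With these toy values, stock 1 has 2 low days out of 4 and stock 2 has 3 out of 4, so -prop_low_days- should be constant within each group at those proportions.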
If there are missing values of -return-
we would need to be more circumspect, because
-_N- counts every observation, missing or not.
. bysort stock year : gen low_days = sum(return < 5)
. by stock year : gen prop_low_days = sum(return < .)
. by stock year :
      replace prop_low_days = low_days[_N] / prop_low_days[_N]
Also, if you wanted to count proportions of high
values of -return- you would need to
watch that (e.g.) -sum(return > 10)- will catch
any missings as well.
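The reason is that Stata treats numeric missing as larger than any number, so -return > 10- is true when -return- is missing. A minimal sketch, again on invented data, of the naive count next to a guarded one (the variable names -high_naive-, -high_days-, -n_obs- and -prop_high_days- are mine, chosen for illustration):

```stata
clear
input stock year day return
1 2001 1 12
1 2001 2 .
1 2001 3 8
1 2001 4 15
end

* naive count: the missing value on day 2 is counted as "high"
bysort stock year (day) : gen high_naive = sum(return > 10)

* guarded count: exclude missings explicitly
by stock year : gen high_days = sum(return > 10 & return < .)

* denominator: number of nonmissing returns in the group
by stock year : gen n_obs = sum(return < .)

by stock year : gen prop_high_days = high_days[_N] / n_obs[_N]

list, sepby(stock year)
```

On the last observation the naive running count reaches 3 while the guarded one reaches 2, out of 3 nonmissing days.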
What about -egen-? Clearly you can do it that way.
Sometimes, indeed often, drilling down one level
to get the elementary building blocks is in
fact easier. I know one extremely advanced
user of Stata who hates -egen-, I think because
by the time he has looked up the syntax he
could have ground it all out from first
principles with some -by:- footwork. But he
is very fast with Stata, having used it
since the beginning.
Note that your
gen lo = 0
replace lo = 1 if ret < -5
egen temp = count(lo), by(stock year)
egen temp2 = sum(lo), by(stock year)
could be done this way:
egen temp = count(1), by(stock year)
egen temp2 = sum(ret < 5), by(stock year)
(I don't understand why you have -5.)
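For comparison, a minimal sketch of the -egen- route on invented data. It relies on two facts about -egen-: -count()- counts the nonmissing values of its argument, and -sum()- adds them up (later releases of Stata document the latter as -total()-). The names -n_days-, -n_low- and -prop- are mine, for illustration only.

```stata
clear
input stock year ret
1 2001 3
1 2001 7
1 2001 4
2 2001 6
2 2001 2
end

* number of nonmissing returns in each stock-year
egen n_days = count(ret), by(stock year)

* number of days with ret < 5: the sum of a 0/1 indicator
egen n_low = sum(ret < 5), by(stock year)

gen prop = n_low / n_days

list, sepby(stock year)
```

Using -count(ret)- rather than -count(1)- for the denominator means that any missing returns would be left out of the total automatically.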
There was a tutorial on -by:- in Stata Journal
2(1) 2002.
Nick
Yvonne Capstick
> I have a hopefully simple question on calculating proportions.
>
> I have daily returns (ret) for different stocks (stock) and I would like to
> calculate the proportion of days for which a firm's daily stock return was
> below 5% over the last 3 calendar years.
>
> If all I needed was the proportion of trading days for which the return was
> below 5% over the last 1 calendar year, I could calculate this by the
> following long-winded method:
>
> gen lo = 0
> replace lo = 1 if ret < -5
> egen temp = count(lo), by (stock year)
> egen temp2 = sum(lo), by (stock year)
> gen prop = temp2/temp
> gen temp3 = prop[_n-1] if month == 1 & month[_n-1] == 12 & year == year[_n-1]+1
> egen lastprop = sum(temp3), by (stock year)
>
> a) There must be a faster way of doing the above - I tried something like
> egen prop = count(lo)/sum(lo), by (stock year) but it said
> "varlist not allowed". Please could you advise me of any faster way?
> b) How do I modify the above to calculate the proportion of trading days
> where the return was < 5% over the last 3 calendar years?
>