One way to do this is to loop over possibilities.
The criterion appears to be
this year and the two previous years.
For each year, mod(year, 3) is 0, 1 or 2.
There are correspondingly three ways
of dividing your period into three-
year blocks that do not overlap,
according to whether the beginning
year has mod(year,3) of 0, 1, 2.
The end year has mod(year,3) of 2, 0, 1
respectively.
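
To make the begin-year and end-year remainders concrete, here is a
small sketch on made-up years. The range 1999-2004 and the names
m, endyear and m_end are assumptions for illustration only, not
anything from your data.

```stata
clear
set obs 6
* made-up years 1999, ..., 2004
gen year = 1998 + _n
* remainder of each year: 1 2 0 1 2 0
gen m = mod(year, 3)
* a three-year block beginning in this year ends two years later
gen endyear = year + 2
gen m_end = mod(endyear, 3)
* begin remainders 0, 1, 2 pair with end remainders 2, 0, 1
assert m_end == mod(m + 2, 3)
list year m endyear m_end, noobs
```

The -assert- just confirms that the end year of a block has
remainder mod(m + 2, 3), which is the condition used when the
results are put back.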
Let's initialise our proportion for
this year and the two previous.
gen previous = .
Now we loop over the three possibilities,
setting up a begin year, and copying
it downwards. Then the proportion
is just the mean of an expression
yielding a Boolean result,
as I should have remembered earlier,
and we put the results where they belong.
qui forval m = 0/2 {
    gen begin = year if mod(year,3) == `m'
    bysort stock (year) : replace begin = begin[_n-1] if mi(begin)
    egen work = mean(return < 5), by(stock begin)
    replace previous = work if mod(year, 3) == mod(`m' + 2, 3)
    drop begin work
}
This is all rather ad hoc and not really tested!
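
One rough way to check it is to run the loop on made-up data and
compare with a brute-force calculation, year by year. This is only
a sketch: the toy data, the generated returns and the names day,
low and check are all assumptions of mine, and it presumes no
missing values of -return- and no gaps in the years.

```stata
clear
set obs 2
gen stock = _n
expand 6
bysort stock : gen year = 1998 + _n
* unequal numbers of days per year: 2 in 1999, ..., 7 in 2004
expand year - 1997
bysort stock year : gen day = _n
* deterministic made-up returns between 0 and 10
gen return = mod(7 * day + 3 * year + stock, 11)

* the loop from above
gen previous = .
qui forval m = 0/2 {
    gen begin = year if mod(year,3) == `m'
    bysort stock (year) : replace begin = begin[_n-1] if mi(begin)
    egen work = mean(return < 5), by(stock begin)
    replace previous = work if mod(year, 3) == mod(`m' + 2, 3)
    drop begin work
}

* brute force: for each year, average over that year and the two before
gen low = return < 5
gen check = .
qui levelsof year, local(years)
qui foreach y of local years {
    egen work = mean(cond(inrange(year, `y' - 2, `y'), low, .)), by(stock)
    replace check = work if year == `y'
    drop work
}
assert abs(previous - check) < 1e-6
```

Note that for the first two years of data the window is
necessarily shorter than three years, under either calculation.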
Nick
[email protected]
Yvonne Capstick
> Thanks very much for that reply. I've now been able to calculate the
> proportion of days with returns below 5 efficiently, via your method.
>
> My structure is now:
>
> day year stock return prop_low_days
>
> where prop_low_days is constructed via your method, and therefore
> gives the proportion of days in that year where the stock returned
> less than 5%. It is thus the same figure for every day in a
> particular year for a particular stock.
>
> Now I would like to construct a very similar measure to the above,
> but giving the proportion of days in the past three years where the
> stock returned less than 5%. So for 24 Feb 02 I would like the
> newvar to give the proportion of days in 00-02 where the stock
> returned < 5%; for 7 Jul 01 I would like newvar to give the
> proportion of days in 99-01 where the stock returned < 5%.
>
> I am not sure of a simple way to do this, because the "by year"
> structure doesn't work easily. I'm not sure how to "pick out" the
> 00, 01 and 02 values of prop_low_days to take a simple average of
> these (and I don't think a simple average would work because of the
> different number of days in each year).
>
> Thanks,
> Yvonne
>
> >From: "Nick Cox" <[email protected]>
> >Reply-To: [email protected]
> >To: <[email protected]>
> >Subject: st: RE: Proportions
> >Date: Tue, 25 Jan 2005 11:04:05 -0000
> >
> >As I understand it, your structure is
> >
> >day year stock return
> >
> >with one value of -return- for each -day- and
> >-stock-. -day- is naturally nested within -year-.
> >
> >If so, the number of days with -return- less than 5 is
> >
> >. bysort stock year : gen low_days = sum(return < 5)
> >. by stock year : replace low_days = low_days[_N]
> >
> >and the total number of days for each combination
> >is
> >
> >. by stock year : gen no_days = _N
> >
> >and so
> >
> >. gen prop_low_days = low_days / no_days
> >
> >except that we should be able to telescope this to
> >
> >. bysort stock year :
> > gen prop_low_days = sum(return < 5)
> >. by stock year :
> > replace prop_low_days = prop_low_days[_N] / _N
> >
> >Note my continuation lines. Also, I cut down on
> >the number of variables, and the name doesn't
> >match the contents until I'm done.
> >
> >If there are missing values of -return-
> >we would need to be more circumspect.
> >
> >. bysort stock year : gen low_days = sum(return < 5)
> >. by stock year : gen prop_low_days = sum(return < .)
> >. by stock year :
> > replace prop_low_days = low_days[_N] / prop_low_days[_N]
> >
> >Also, if you wanted to count proportions of high
> >values of -return- you would need to
> >watch that (e.g.) -sum(return > 10)- will catch
> >any missings as well.
> >
> >What about -egen-? Clearly you can do it that way.
> >Sometimes, indeed often, drilling down one level
> >to get the elementary building blocks is in
> >fact easier. I know one extremely advanced
> >user of Stata who hates -egen-, I think because
> >by the time he has looked up the syntax he
> >could have ground it all out from first
> >principles with some -by:- footwork. But he
> >is very fast with Stata, having used it
> >since the beginning.
> >
> >Note that your
> >
> >gen lo = 0
> >replace lo = 1 if ret < -5
> >egen temp = count(lo), by(stock year)
> >egen temp2 = sum(lo), by(stock year)
> >
> >could be done this way:
> >
> >egen temp = sum(1), by(stock year)
> >egen temp2 = sum(ret < 5), by(stock year)
> >
> >(I don't understand why you have -5.)
> >
> >There was a tutorial on -by:- in Stata Journal
> >2(1) 2002.
> >
> >Nick
> >[email protected]
> >
> >Yvonne Capstick
> >
> > > I have a hopefully simple question on calculating proportions.
> > >
> > > I have daily returns (ret) for different stocks (stock) and I
> > > would like to calculate the proportion of days for which a
> > > firm's daily stock return was below 5% over the last 3 calendar
> > > years.
> > >
> > > If all I needed was the proportion of trading days for which
> > > the return was below 5% over the last 1 calendar year, I could
> > > calculate this by the following long-winded method:
> > >
> > > gen lo = 0
> > > replace lo = 1 if ret < -5
> > > egen temp = count(lo), by (stock year)
> > > egen temp2 = sum(lo), by (stock year)
> > > gen prop = temp2/temp
> > > gen temp3 = prop[_n-1] if month == 1 & month[_n-1] == 12 & year ==
> > > year[_n-1]+1
> > > egen lastprop = sum(temp3), by (stock year)
> > >
> > > a) There must be a faster way of doing the above - I tried
> > > something like egen prop = count(lo)/sum(lo), by (stock year)
> > > but it said "varlist not allowed". Please could you advise me of
> > > any faster way?
> > > b) How do I modify the above to calculate the proportion of
> > > trading days where the return was < 5% over the last 3 calendar
> > > years?