The Stata listserver

st: RE: Proportions


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: Proportions
Date   Tue, 25 Jan 2005 11:04:05 -0000

As I understand it, your structure is 

day year stock return 

with one value of -return- for each -day- and
-stock-. -day- is naturally nested within -year-. 

If so, the number of days with -return- less than 5 is

. bysort stock year : gen low_days = sum(return < 5) 
. by stock year : replace low_days = low_days[_N] 

and the total number of days for each combination 
is 

. by stock year : gen no_days = _N 

and so 

. gen prop_low_days = low_days / no_days 

except that we should be able to telescope this to 

. bysort stock year : 
	gen prop_low_days = sum(return < 5) 
. by stock year : 
	replace prop_low_days = prop_low_days[_N] / _N 

Note my continuation lines. Also, I cut down on 
the number of variables, and the name doesn't 
match the contents until I'm done. 
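For anyone who wants to try the telescoped version, here is a minimal do-file sketch. The toy data are invented purely for illustration, and I have written the continuation with -///-, which is what a do-file needs.

```stata
* Toy data (hypothetical values): two stocks, one year each
clear
input stock year day return
1 2004 1 2
1 2004 2 7
1 2004 3 4
1 2004 4 9
2 2004 1 1
2 2004 2 3
2 2004 3 8
end

* Running count of days with return < 5 within each stock-year;
* /// continues a command onto the next line in a do-file
bysort stock year (day) : ///
	gen prop_low_days = sum(return < 5)

* The last running count is the group total; divide by group size
by stock year : replace prop_low_days = prop_low_days[_N] / _N

list, sepby(stock year)
```

With these made-up values, stock 1 has 2 low days out of 4 and stock 2 has 2 out of 3.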

If there are missing values of -return- 
we would need to be more circumspect. 

. bysort stock year : gen low_days = sum(return < 5) 
. by stock year : gen prop_low_days = sum(return < .) 
. by stock year : 
	replace prop_low_days = low_days[_N] / prop_low_days[_N]  
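To see why the denominator matters, here is a small sketch with one missing -return-. The data and the variable name -n_valid- are my own inventions for illustration; the point is only that dividing by -_N- would count the missing day.

```stata
* Toy data (hypothetical values) with one missing return
clear
input stock year day return
1 2004 1 2
1 2004 2 .
1 2004 3 7
1 2004 4 4
end

* Missing is not less than 5, so it is never counted as a low day
bysort stock year (day) : gen low_days = sum(return < 5)

* return < . is true only for nonmissing values
by stock year : gen n_valid = sum(return < .)

* Proportion among days with a known return
by stock year : gen prop_low_days = low_days[_N] / n_valid[_N]

* For comparison: dividing by _N treats the missing day as a valid day
by stock year : gen prop_naive = low_days[_N] / _N

list, sepby(stock year)
```

Here 2 of the 3 known returns are below 5, whereas the naive version reports 2 out of 4.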

Also, if you wanted to count proportions of high
values of -return- you would need to 
watch that (e.g.) -sum(return > 10)- will catch 
any missings as well. 
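A sketch of that trap, again with invented data. Stata treats numeric missing as larger than any number, so the guard -& return < .- is one way to keep missings out of the count; -!missing(return)- would do the same job.

```stata
* Toy data (hypothetical values) with one missing return
clear
input stock year day return
1 2004 1 12
1 2004 2 .
1 2004 3 3
1 2004 4 15
end

* Unguarded: the missing value satisfies return > 10 and is counted
bysort stock year (day) : gen naive_high = sum(return > 10)
by stock year : replace naive_high = naive_high[_N]

* Guarded: only nonmissing returns above 10 are counted
by stock year : gen high_days = sum(return > 10 & return < .)
by stock year : replace high_days = high_days[_N]

list, sepby(stock year)
```

The unguarded count comes out one too high because of the single missing value.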

What about -egen-? Clearly you can do it that way. 
Sometimes, indeed often, drilling down one level
to get the elementary building blocks is in 
fact easier. I know one extremely advanced 
user of Stata who hates -egen-, I think because
by the time he has looked up the syntax he 
could have ground it all out from first 
principles with some -by:- footwork. But he
is very fast with Stata, having used it 
since the beginning. 

Note that your 

gen lo = 0
replace lo = 1 if ret < -5
egen temp = count(lo), by(stock year) 
egen temp2 = sum(lo), by(stock year) 

could be done this way: 

egen temp = count(1), by(stock year) 
egen temp2 = sum(ret < 5), by(stock year) 

(I don't understand why you have -5.) 
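For completeness, a minimal sketch of the -egen- route on invented data. I use -total()-, which is the name given to the -egen- function -sum()- from Stata 9 onwards; the variable names -n_days-, -n_low- and -prop- are just for illustration, and I have assumed a threshold of 5.

```stata
* Toy data (hypothetical values): two stocks, one year each
clear
input stock year day ret
1 2004 1 2
1 2004 2 7
1 2004 3 4
1 2004 4 9
2 2004 1 1
2 2004 2 3
2 2004 3 8
end

* Number of days with a nonmissing return in each stock-year
egen n_days = count(ret), by(stock year)

* Number of those days with ret < 5 (the expression is 1 or 0)
egen n_low = total(ret < 5), by(stock year)

gen prop = n_low / n_days

list, sepby(stock year)
```

Because -count(ret)- ignores missings and -ret < 5- is false for missings, this version also copes with missing returns.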

There was a tutorial on -by:- in Stata Journal 
2(1) 2002. 

Nick 
[email protected] 

Yvonne Capstick
 
> I have a hopefully simple question on calculating proportions.
> 
> I have daily returns (ret) for different stocks (stock) and I 
> would like to 
> calculate the proportion of days for which a firm's daily 
> stock return was 
> below 5% over the last 3 calendar years.
> 
> If all I needed was the proportion of trading days for which 
> the return was 
> below 5% over the last 1 calendar year, I could calculate this by the 
> following long-winded method:
> 
> gen lo = 0
> replace lo = 1 if ret < -5
> egen temp = count(lo), by (stock year)
> egen temp2 = sum(lo), by (stock year)
> gen prop = temp2/temp
> gen temp3 = prop[_n-1] if month == 1 & month[_n-1] == 12 & year == 
> year[_n-1]+1
> egen lastprop = sum(temp3), by (stock year)
> 
> a) There must be a faster way of doing the above - I tried 
> something like 
> egen prop = count(lo)/sum(lo), by (stock year) but it said 
> 'varlist not 
> allowed". Please could you advise me of any faster way?
> b) How do I modify the above to calculate the proportion of 
> trading days 
> where the return was < 5% over the last 3 calendar years?
> 

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


