RE: st: RE: Calculate variances of subsamples
From
"Martin Weiss" <[email protected]>
To
<[email protected]>
Subject
RE: st: RE: Calculate variances of subsamples
Date
Sat, 5 Jun 2010 22:50:57 +0200
So, Lars, this is getting quite challenging, not least because we do not
have your data. The code you posted assumes that your data are present
(what is "pricemsci", for instance?). The -reshape- we did a couple of
hours ago increasingly looks like a red herring, and it certainly does not
bring you closer to a solution for your problem.
Here is code that builds on our state of affairs before we -reshape-d. Note
I am creating a fake return in there, so the variances calculated earlier
have no connection to it. Everything else would require substantial
restructuring of the solution. The code generates three portfolios that
contain "low", "middle" and "high" variance stocks:
***********
//create resultsfile
cap erase myfile.dta
di in red _rc
clear*
gen start=.
gen end=.
gen _stat_1=.
gen stock=.
gen str15 kindofreturn=""
save myfile, replace
//get "3105.dta"
clear*
//8 stocks
set obs 8
gen byte stock=_n
//5 time periods
expand 5
bys stock: gen byte time=_n
gen double exret=rnormal()
gen double msciret=rnormal()
gen double msftret=rnormal()
gen double appret=rnormal()
gen double geret=rnormal()
gen double pgret=rnormal()
gen double jnjret=rnormal()
gen double bpret=rnormal()
save 3105, replace
//-use- "3105"
u 3105, clear
//Return calculation
gen double grexret=ex[_n]/ex[_n-1]-1 if _n>1
gen double grmsciret=msci[_n]/msci[_n-1]-1 if _n>1
gen double grmsftret=msft[_n]/msft[_n-1]-1 if _n>1
gen double grappret=app[_n]/app[_n-1]-1 if _n>1
gen double grgeret=ge[_n]/ge[_n-1]-1 if _n>1
gen double grpgret=pg[_n]/pg[_n-1]-1 if _n>1
gen double grjnjret=jnj[_n]/jnj[_n-1]-1 if _n>1
gen double grbpret=bp[_n]/bp[_n-1]-1 if _n>1
//loop to get -rolling- results for each stock
//and each return
foreach ret in exret msciret msftret{
//start inner loop
su stock, mean
qui forv i=1/`r(max)'{
preserve
keep if stock==`i'
tsset time
rolling r(Var), window(2) clear: su `ret'
gen stock=`i'
gen kindofreturn="`ret'"
append using myfile
save myfile, replace
restore
}
//end inner loop
}
u myfile, clear
ren _stat_1 Variance
sort stock kindofreturn start
la def sto 1 "Firm 1" 2 "Firm 2" 3 "Firm 3" ///
4 "Firm 4" 5 "Firm 5" 6 "Firm 6" 7 "Firm 7" ///
8 "Firm 8"
la val stock sto
//get "low"/"middle"/"high" volatility portfolios
bys start kindofreturn (Variance): gen byte lowvar=_n<=3
bys start kindofreturn (Variance): gen byte middlevar=inlist(_n,4,5)
bys start kindofreturn (Variance): gen byte highvar=_n>5
l, h(30) noo sepby(start kindofreturn lowvar middlevar highvar)
//generate very fake return
gen myreturn=rnormal(.1,.05)
bys start kindofreturn (lowvar): ///
egen averagereturnlow=mean(myreturn) if lowvar
bys start kindofreturn (middlevar): ///
egen averagereturnmiddle=mean(myreturn) if middlevar
bys start kindofreturn (highvar): ///
egen averagereturnhigh=mean(myreturn) if highvar
sort start kindofreturn lowvar middlevar highvar
l start Variance myreturn stock lowvar averagereturn*, ///
noo sepby(start kindofreturn low middle high)
***********
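For the ten portfolios asked about below, the fixed rank cutoffs above could
be swapped for -xtile-. What follows is only a sketch on made-up data: the 50
stocks and 4 periods are invented, and the names "myreturn", "portfolio",
"portret" and "tmp" are assumptions for illustration, not part of Lars's
files. -xtile- cannot be prefixed with -by:-, hence the loop over periods:
***********
clear*
set seed 12345
//50 fake stocks, 4 fake periods
set obs 50
gen int stock=_n
expand 4
bys stock: gen int start=_n
//fake variance and fake return
gen double Variance=runiform()
gen double myreturn=rnormal(.1,.05)
//decile of Variance within each period
gen byte portfolio=.
levelsof start, local(periods)
foreach t of local periods{
xtile tmp=Variance if start==`t', nq(10)
replace portfolio=tmp if start==`t'
drop tmp
}
//equally weighted return per period and portfolio
collapse (mean) portret=myreturn, by(start portfolio)
l, noo sepby(start)
***********
Portfolio 1 then holds the lowest-variance decile of each period and
portfolio 10 the highest.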
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Lars Knuth
Sent: Samstag, 5. Juni 2010 21:16
To: [email protected]
Subject: Re: st: RE: Calculate variances of subsamples
Dear Statalisters,
I want to share the results in case someone has the same problem to
solve in the future. The input came almost completely from Martin.
The program takes price data for a large number of stocks, calculates
returns, and then calculates the variance of each stock repeatedly
over a rolling window. The variances are arranged so that they can
be compared in the cross-section.
This brings me to my next and last problem:
I have all the variances. At every point in time, I now need to compare
the variances of the different stocks (Exxon, Microsoft etc.), rank
them from lowest to highest, and then build 10 portfolios (there will
be more than 1000 stocks): the first portfolio consists of the stocks
with the 10% lowest variances, the second of the next ten percent, etc.
For those portfolios, the return has to be calculated (so the datasets
have to be merged again). A t-test can then show whether the
low-variance portfolio has a statistically significantly lower
return than the high-variance portfolio.
As I said, if someone (Martin?) has an idea for that, I would be more
than thankful, since I could then finish my Stata work for the
moment.
Thanks in advance!
clear*
set more off
*create resultsfile
cap erase resultsfile.dta
di in red _rc
clear*
gen start=.
gen end=.
gen _stat_1=.
gen str15 kindofreturn=""
save resultsfile, replace
use "\3105.dta", clear
gen int time=_n
*Return calculation
foreach price in pricemsci priceex pricemsft priceapp pricege pricepg pricejnj pricebp {
gen double `price'ret=`price'[_n]/`price'[_n-1]-1 if _n>1
}
renpfix price
*loop to get -rolling- results for each stock
foreach ret in exret msciret msftret appret geret pgret jnjret bpret{
preserve
tsset time
rolling r(Var), window(60) clear: su `ret'
gen kindofreturn="`ret'"
append using resultsfile
save resultsfile, replace
restore
}
u resultsfile, clear
ren _stat_1 Variance
sort kindofreturn start
l, sepby(kindofreturn) noo
bys kindofreturn: gen int time=_n
reshape wide Variance, i(time) j(kindofreturn) string
renpfix Variance
list, noo
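The t-test described above might look as follows once one return series
per portfolio is available. This is a sketch only, on fabricated numbers:
"retlow" and "rethigh" are hypothetical names for the per-period returns of
the lowest- and highest-variance portfolios, and the paired form is an
assumption (adding -unpaired- would treat the two series as independent
samples).
***********
clear*
set seed 2010
*120 fake periods
set obs 120
gen int start=_n
*fake portfolio returns
gen double retlow=rnormal(.08,.05)
gen double rethigh=rnormal(.10,.05)
*paired t-test of equal mean returns
ttest retlow==rethigh
*one-sided p-value for Ha: mean(retlow-rethigh)<0
di r(p_l)
***********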
2010/6/5 Martin Weiss <[email protected]>:
>
>
> ***********
> clear*
>
> input Var str16 stock
> 0.00234 exxon
> .05654 exxon
> 0.13444 exxon
> 0.99388 microsoft
> .4342 microsoft
> 0.42445 microsoft
> 0.42444 intel
> 0.32443 intel
> 0.23434 intel
> end
>
> bys stock: gen int time=_n
> reshape wide Var, i(time) j(stock) string
> renpfix Var
> list, noo
> ***********
>
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Lars Knuth
> Sent: Samstag, 5. Juni 2010 17:49
> To: [email protected]
> Subject: Re: st: RE: Calculate variances of subsamples
>
> Oh, my explanation was probably confusing. It was just for
> illustration. I have two columns, one with the numbers, the other
> with the strings. What I need in a new file is just the
> numbers.
>
> 0.00234(exxon) means that the value 0.00234 belongs to
> exxon etc. It would of course be nice if the new variable holding the
> exxon numbers were named exxon, the one holding the numbers for
> microsoft were named microsoft etc. However, the important part
> is just the numbers.
>
> 2010/6/5 Martin Weiss <[email protected]>:
>>
>>
>> Your problem may turn out to be easily solved with a -reshape-. It is not a
>> good idea, though, to have "0.00234(exxon)" in a cell of your data, as this
>> would have to be stored as a string, precluding any further processing of
>> the number. Did you write it as an illustration, or do you really want the
>> cell to contain the string?
>>
>>
>> HTH
>> Martin
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Lars Knuth
>> Sent: Samstag, 5. Juni 2010 17:28
>> To: [email protected]
>> Subject: Re: st: RE: Calculate variances of subsamples
>>
>> Ok, great, it took some time, but I finally understood Martin's
>> code... this is a great way of learning more about Stata.
>> My next problem is that I have a variable with the variances for 536
>> -rolling- steps for each of the stocks.
>> It looks like this:
>> Variance stock name
>> 0.00234 exxon
>> ........... exxon
>> 0.13444 exxon
>> 0.99388 microsoft
>> ........... microsoft
>> 0.42445 microsoft
>> 0.42444 intel
>> ........ intel
>> 0.23434 intel
>>
>> What I would like to have is the following:
>>
>> 0.00234(exxon)   0.99388(microsoft)   0.42444(intel)
>> ...........      ............         ............
>> 0.13444          0.42445              0.23434(intel)
>>
>> I could do gen varexxon=Variance if stockname=="exxon"
>> and likewise for all the stocks. But even then I get variables with
>> a lot of missings, and I cannot write the variances horizontally next
>> to each other.
>>
>> But they are from the same time (because of -rolling-) and I need them
>> to be horizontally ordered without the missings.
>>
>> I hope my problem becomes clear. I guess what I miss is just a small
>> command.
>> Thank you in advance for any hint!
>>
>> 2010/6/2 Martin Weiss <[email protected]>:
>>>
>>>
>>> You could of course issue the -rolling- call with -clear- present, -save-
>>> the result to a new file and reload your "3105.dta" to start anew for the
>>> next stock. The datasets thus -saved- could be -append-ed to form one big
>>> dataset afterwards. -postfile- is also an option, as always.
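>>>
>>> The -postfile- route might be sketched as below. This is an illustration
>>> on fake data, not a drop-in replacement for -rolling-: the window length
>>> of 3, the file name "myresults" and the variable names in the results
>>> file are all assumptions.
>>>
>>> ***********
>>> clear*
>>> set seed 1
>>> set obs 10
>>> gen int time=_n
>>> //fake returns
>>> gen double exret=rnormal()
>>> gen double msciret=rnormal()
>>> //open the results file
>>> tempname h
>>> postfile `h' str15 kindofreturn int start int end double Variance ///
>>> using myresults, replace
>>> //one variance per return and window of 3 periods
>>> foreach ret in exret msciret{
>>> forv s=1/8{
>>> local e=`s'+2
>>> qui su `ret' if inrange(time,`s',`e')
>>> post `h' ("`ret'") (`s') (`e') (r(Var))
>>> }
>>> }
>>> postclose `h'
>>> u myresults, clear
>>> l, noo sepby(kindofreturn)
>>> ***********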
>>>
>>> BTW, you may be better off with the lag operator "L." for your return
>>> calculations.
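>>>
>>> A minimal sketch of the lag operator in a loop, on fake prices (the
>>> values are made up; "ex" and "msci" stand in for the price variables).
>>> Unlike [_n-1], "L." follows the time variable declared by -tsset-, so
>>> it respects gaps in the series and, with a panel declared, stock
>>> boundaries:
>>>
>>> ***********
>>> clear*
>>> set obs 6
>>> gen int time=_n
>>> //fake prices
>>> gen double ex=100+_n
>>> gen double msci=50+2*_n
>>> tsset time
>>> //simple return for each price series
>>> foreach p in ex msci{
>>> gen double `p'ret=`p'/L.`p'-1
>>> }
>>> l, noo
>>> ***********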
>>>
>>>
>>> HTH
>>> Martin
>>>
>>> -----Original Message-----
>>> From: [email protected]
>>> [mailto:[email protected]] On Behalf Of Lars Knuth
>>> Sent: Mittwoch, 2. Juni 2010 20:22
>>> To: statalist
>>> Subject: st: Calculate variances of subsamples
>>>
>>> Dear listers,
>>>
>>> I have to say thanks to Martin, the recommendation of rolling was
>>> great. Unfortunately, I have now a few problems with the
>>> implementation.
>>> 1. -rolling- works with the "clear" option, but without it does not
>>> ("rolling r(Var), window(60) clear: summarize exret" works)
>>> 2. I need the data to calculate and store the variances for more than
>>> 1000 stock price returns in the end, so can I somehow keep all the
>>> data and then perform -rolling- in a loop?
>>> 3. Is there also a way to perform the return calculation in a loop?
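>>>
>>> On points 1 and 2, one possibility is the saving() option of -rolling-
>>> in place of -clear-. A sketch on fake data, assuming that saving()
>>> leaves the data in memory untouched (worth checking in -help rolling-);
>>> the file names "var_exret" and "var_msciret" are made up:
>>>
>>> ***********
>>> clear*
>>> set seed 3105
>>> set obs 100
>>> gen int time=_n
>>> * fake returns
>>> gen double exret=rnormal()
>>> gen double msciret=rnormal()
>>> tsset time
>>> * one results file per return series
>>> foreach ret in exret msciret {
>>> rolling r(Var), window(60) saving(var_`ret', replace): summarize `ret'
>>> }
>>> ***********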
>>>
>>> I am attaching parts of the code I have so far. Any ideas would be of
>>> great help to me.
>>> Thanks in advance!
>>>
>>> clear*
>>> use "C:\...\3105.dta", clear
>>>
>>> gen int time=_n
>>> * Return calculation
>>> gen double exret=ex[_n]/ex[_n-1]-1 if _n>1
>>> gen double msciret=msci[_n]/msci[_n-1]-1 if _n>1
>>> gen double msftret=msft[_n]/msft[_n-1]-1 if _n>1
>>> gen double appret=app[_n]/app[_n-1]-1 if _n>1
>>> gen double geret=ge[_n]/ge[_n-1]-1 if _n>1
>>> gen double pgret=pg[_n]/pg[_n-1]-1 if _n>1
>>> gen double jnjret=jnj[_n]/jnj[_n-1]-1 if _n>1
>>> gen double bpret=bp[_n]/bp[_n-1]-1 if _n>1
>>>
>>> tsset time
>>>
>>> * Rolling
>>> rolling r(Var), window(60): summarize exret
>>> rolling r(Var), window(60): summarize msciret
>>> rolling r(Var), window(60): summarize msftret
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/statalist/faq
>>> * http://www.ats.ucla.edu/stat/stata/