Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: generating annualized standard deviation of returns from monthly data.
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: RE: generating annualized standard deviation of returns from monthly data.
Date: Thu, 27 Feb 2014 17:13:21 +0000
You are using the name -year- but that name is wildly misleading here.
The values of year are, it seems, individual daily dates.
The key point of -collapse, by(firm year)- is how many observations
there are for each _distinct_ combination of -firm- and -year-. In
your sample data shown here every group consists of a _single_
observation, so, as explained earlier, the SD is returned as
missing (because sample size - 1 is 0).
You have to produce a true "year" variable for what you want to work,
e.g. by using -yofd()-.
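
A minimal sketch of that fix, assuming the firm identifier is -tic-, the
returns are in -return-, and the dates are held as a numeric daily date.
The names -sdate-, -date-, -sd_bydate- and -sd_return- are made up for
illustration, and the few rows are copied from the listing quoted below.

```stata
clear
input str5 tic str9 sdate return
"0183B" "31jan2000" -10.71428571
"0183B" "29feb2000" 48
"0183B" "31mar2000" -29.72972973
"0183B" "31jan2001" 37.5
"0183B" "28feb2001" -27.27272727
"0223B" "31jan2000" 0
"0223B" "29feb2000" 5.551515152
end

* Convert the string date to a numeric daily date (assumed storage format)
gen date = daily(sdate, "DMY")
format date %td

* Grouping on the daily date: each group holds one observation, so SD is missing
bysort tic date: egen sd_bydate = sd(return)

* Grouping on a true calendar year derived with yofd()
gen year = yofd(date)
bysort tic year: egen sd_return = sd(return)

list tic date year return sd_bydate sd_return, sepby(tic year)
```

Here -sd_bydate- should be missing in every row, while -sd_return- is
filled in for each firm-year with at least two returns. If one row per
firm and year is wanted, -collapse (sd) sd_return=return, by(tic year)-
on the same -year- variable should do it.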
Nick
[email protected]
On 27 February 2014 16:56, Ikechukwu M. <[email protected]> wrote:
> Thank you.
>
> here is what I get when I perform either of the two commands.
>
> I agree that without the year grouping variable there should be one sd
> returned per firm. It is including the year grouping variable that
> messes things up.
>
>
>            year     tic        return  sd_return
>  78.  31jan2000   0183B  -10.71428571          .
>  79.  29feb2000   0183B            48          .
>  80.  31mar2000   0183B  -29.72972973          .
> ------------------------------------------------
>  81.  30apr2000   0183B   7.692307692          .
>  82.  31may2000   0183B  -17.85714286          .
>  83.  30jun2000   0183B   39.13043478          .
>  84.  31jul2000   0183B        -18.75          .
>  85.  31aug2000   0183B   61.53846154          .
> ------------------------------------------------
>  86.  30sep2000   0183B  -33.33333333          .
>  87.  31oct2000   0183B   14.28571429          .
>  88.  30nov2000   0183B        -18.75          .
>  89.  31dec2000   0183B  -7.692307692          .
>  90.  31jan2001   0183B          37.5          .
> ------------------------------------------------
>  91.  28feb2001   0183B  -27.27272727          .
>  92.  31mar2001   0183B            50          .
>  93.  30apr2001   0183B  -18.22222222          .
>  94.  31may2001   0183B            25          .
>  95.  30jun2001   0183B  -6.086956522          .
> ------------------------------------------------
>  96.  31jul2001   0183B  -20.83333333          .
>  97.  31aug2001   0183B   2.339181287          .
>  98.  30sep2001   0183B  -22.85714286          .
>  99.  31oct2001   0183B   39.25925926          .
> 100.  30nov2001   0183B  -20.21276596          .
> ------------------------------------------------
> 101.  31dec2001   0183B  -.6666666667          .
> 102.  31jan2002   0183B   9.395973154          .
> 103.  28feb2002   0183B             0          .
> 104.  31jan2000   0223B             0          .
> 105.  29feb2000   0223B   5.551515152          .
> ------------------------------------------------
> 106.  31mar2000   0223B   1.447178003          .
> 107.  30apr2000   0223B   .4279600571          .
> 108.  31may2000   0223B             0          .
> 109.  31jan2000   0226B             0          .
> 110.  29feb2000   0226B             0          .
> ------------------------------------------------
> 111.  31mar2000   0226B             0          .
> 112.  30apr2000   0226B             0          .
> 113.  31may2000   0226B           800          .
> 114.  30jun2000   0226B  -33.33333333          .
> 115.  31jul2000   0226B             0          .
> ------------------------------------------------
> 116.  31aug2000   0226B             0          .
>
>
> This result is obtained from bysort firm year: egen SD=sd(return)
>
> Thanks again.
>
> IK
>
> On Thu, Feb 27, 2014 at 10:47 AM, Nick Cox <[email protected]> wrote:
>> If you don't specify the year as a grouping variable, then values for
>> different years are lumped together; that is precisely as it should
>> be.
>>
>> Otherwise, I can't make sense of the claim that you get missing for SD
>> with (e.g.) 6 non-missing values. -collapse- produces a missing SD if
>> all values (or all but one) are missing in a group, but not
>> otherwise. (The "all but one" follows from the use of (n - 1) rather
>> than n in the formula for SD, n being sample size as usual.)
>>
>> If you were expecting that missing values would be omitted from the
>> -collapse- results, that expectation was incorrect.
>>
>> To make clear your perceived problem, we need to see data and output,
>> e.g. for examples like that below.
>>
>> . clear
>>
>> . input firm year return
>>
>> firm year return
>> 1. 1 2000 0.875
>> 2. 1 2000 1.2
>> 3. 1 2000 0.9
>> 4. 1 2000 0.35
>> 5. 1 2000 0.98
>> 6. 1 2000 1.4
>> 7. 1 2000 .
>> 8. 1 2000 .
>> 9. 1 2000 .
>> 10. 1 2000 .
>> 11. 1 2000 .
>> 12. 1 2000 .
>> 13. 1 2001 .
>> 14. 1 2001 .
>> 15. end
>>
>> . collapse (sd) return, by(firm year)
>>
>> . list
>>
>> +------------------------+
>> | firm year return |
>> |------------------------|
>> 1. | 1 2000 .3560957 |
>> 2. | 1 2001 . |
>> +------------------------+
>>
>> Nick
>> [email protected]
>>
>>
>> On 27 February 2014 15:28, Ikechukwu M. <[email protected]> wrote:
>>> Thanks. Apologies for incorrect attribution to Nick Cox. What I meant
>>> to say is that occurrence of missing values collapses to a missing,
>>> even though I expected the missings to be ignored.
>>> Thanks for the input - I have implemented what you both suggest and
>>> the good news is that it resolves to the same thing so it is working
>>> but not producing the desired output. I am ending up with missing
>>> values even for firms that have 6 monthly observations for the year.
>>>
>>> The collapse code I used is this:
>>> collapse (sd) sd_return=return, by(firm year)
>>>
>>> using bysort firm year: egen SD=sd(return)
>>>
>>> but when I omit the year, sd is appropriately computed but for all 10
>>> years of the data, not partitioned into years.
>>>
>>> When I include the year, I end up with lots of missing observations.
>>>
>>> Thanks
>>>
>>> On Thu, Feb 27, 2014 at 4:21 AM, Nick Cox <[email protected]> wrote:
>>>> There are various "Nick"s around here. In my case, I wouldn't offer
>>>> the explanation that the occurrence of missings will imply zero
>>>> standard deviations with -collapse-, because it isn't true. More
>>>> importantly, as you don't give the -collapse- code you used, we are
>>>> reduced to speculation that somehow your -collapse- produced a
>>>> collapse to constants, which have 0 SD.
>>>> Nick
>>>> [email protected]
>>>>
>>>>
>>>> On 27 February 2014 05:53, Ikechukwu M. <[email protected]> wrote:
>>>>> Thanks Kieran for your response. I tried that and it gives me all
>>>>> zeros. I think it has to do with how Stata treats missing values in
>>>>> the collapse command. I had seen an earlier post by Nick regarding
>>>>> this.
>>>>>
>>>>> I used bys firm : egen sd=sd(return) and I get values but they are not
>>>>> partitioned by year. It gives me one SD for all the datapoints for the
>>>>> firm.
>>>>>
>>>>> thanks
>>>>>
>>>>> On Wed, Feb 26, 2014 at 11:23 PM, Kieran McCaul
>>>>> <[email protected]> wrote:
>>>>>> ...
>>>>>>
>>>>>> Like this?
>>>>>>
>>>>>> clear *
>>>>>>
>>>>>> input firm str7 date return
>>>>>> 1 "Jan2000" 0.875
>>>>>> 1 "Feb2000" 1.2
>>>>>> 1 "Mar2000" 0.9
>>>>>> 1 "Jan2001" 0.35
>>>>>> 1 "Feb2001" 0.98
>>>>>> 2 "Jan2000" 1.4
>>>>>> 2 "Feb2000" .76
>>>>>> 2 "Mar2000" 1.34
>>>>>> end
>>>>>>
>>>>>> gen year = substr(date, 4,.)
>>>>>>
>>>>>> preserve
>>>>>>
>>>>>> collapse (sd) sd_return=return, by(firm year)
>>>>>> tempfile ttt
>>>>>> save `ttt', replace
>>>>>>
>>>>>> restore
>>>>>>
>>>>>> merge m:1 firm year using `ttt'
>>>>>> list
>>>>>> bysort firm year: summ return
>>>>>>
>>>>>> From: [email protected] [mailto:[email protected]] On Behalf Of Ikechukwu M.
>>>>>> Sent: Thursday, 27 February 2014 9:33 AM
>>>>>> To: [email protected]
>>>>>> Subject: st: generating annualized standard deviation of returns from monthly data.
>>>>>>
>>>>>> I am trying to compute standard deviation of returns for a panel data set and I am having a little difficulty.
>>>>>>
>>>>>> My data looks like this
>>>>>>
>>>>>> Firm date return
>>>>>> 1 Jan2000 0.875
>>>>>> 1 Feb2000 1.2
>>>>>> 1 Mar2000 0.9
>>>>>> 1 Jan2001 0.35
>>>>>> 1 Feb2001 0.98
>>>>>> 2 Jan2000 1.4
>>>>>> 2 Feb2000 .76
>>>>>> 2 Mar2000 1.34
>>>>>>
>>>>>>
>>>>>> I would like to compute the annualized standard deviation of returns for each firm and return one number for each firm in each year.
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/