Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: st: How to aggregate daily data in Stata
From
"Christopher A. Hartwell" <[email protected]>
To
<[email protected]>
Subject
RE: st: How to aggregate daily data in Stata
Date
Mon, 17 Dec 2012 11:28:42 +0400
Thanks Nick, that looks like it will be useful - it helped on my constituent
data sheets in collapsing the data, so many thanks for your assistance.
This unfortunately leads to a second question regarding Stata's use of dates
and how it imports dates from Excel - I'm encountering a problem getting the
data in correctly into Stata.
As before, I have a large unbalanced panel dataset of monthly data, with not
every year/month represented (i.e. Belarus starts at May 1998, Czech
Republic starts in May 1994, etc.). Using Stata 11IC and stat-transferring
from Excel, I get the problem of the monthly date coming in as a string
variable or as a "long" d/m/y variable. The problems, in their order:
1. When originally imported via stat-transfer, the date comes in
appearing normal (it shows in data editor as a d/m/y variable, the first of
each month, although, not the just month-year that I want). However, when I
tsset it as a monthly variable, it appears to have Islamic timing. For
example, Belarus from May 1998 to January 1999 translates as:
Country Month
Belarus 3126m9
Belarus 3129m4
Belarus 3131m10
Belarus 3134m5
Belarus 3136m12
Belarus 3139m6
Belarus 3142m1
Belarus 3144m7
Belarus 3147m2
If I keep it as "daily" data, I can't use any operators such as lags or
differences, because stata thinks the previous period is missing.
2. Alternately, if I attempt to just paste the data into the data
editor, the date shows as a string variable. I try . generate
Date=date(month,"MY") on the month variable, and it generates all missing.
The same thing happens if I
. generate Date=date(month,"DMY")
. gen dm = mofd(Date)
All are blank.
Finally, if I encode month, gen(Month) off of this, I end up with a series
that LOOKS correct - May-98, Jun-98, but when I click on the cell in data
editor, it shows as a number that is non-sequential. That is, May 98 is 214,
June-98 is 166, Jul-98 is 142... and if I tsset on this variable, it
a) says it has tsset "Month, 1960m2 to 1983m12, but with gaps"
b) sorts the months all together and then the years, so it goes Apr-00,
Apr-01, Apr-02... then Aug-00, Aug-01... for each country
Can anyone help with this? Is this an excel problem or a stata one? I just
need to get the monthly/yearly data imported correctly, or else I can't
really do anything.
Christopher A. Hartwell, PhD
Senior Research Fellow
Institute for Emerging Market Studies (IEMS)
http://iems.skolkovo.ru/
In Moscow: +7 916 777 1260 (m)
In the US: +1 202-415-6030 (m)
+1 773-724-2310 (t/f)
Skype: chartwel
[email protected]
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Sunday, December 16, 2012 2:15 AM
To: [email protected]
Subject: Re: st: How to aggregate daily data in Stata
. help collapse
Nick
On 15 Dec 2012, at 20:46, "Christopher A. Hartwell"
<[email protected] > wrote:
>
> Have a question that should be easy to figure out for the experts, but
> I can't wrap my mind around it conceptually.
>
> I have daily stock market data for 28 countries over a 20-odd year
> period.
> Some countries have 6 years of data, some have 20 years, etc.,
> depending on when they started their stock markets. I want to
> calculate a volatility metric by using the sum of squares of daily
> changes for each month (that is, aggregate the daily squared changes
> for each month). Given that the panel data has different time lengths
> and there are different monthly periods (i.e. some months have 30
> days, some countries have holidays in the month so there's no data),
> how can I use stata to sum the squares of the daily changes by month?
>
> For example, I have
>
> Belarus March-1-05 1.37
> Belarus March-2-05 0.69
> .
> .
> .
> Belarus March 31-05 17.33
>
> And I want to generate one variable per country per month that is the
> sum of these numbers, so that I have
>
> Belarus Mar-05 37.20
> Belarus Apr-05 18.99
>
> Etc.
>
> Is there an easy "by x:" command that will let me do it? Or some
> iteration that will allow for this?
>
> The problem I see is that I need to sum the data first by day over a
> specific month, for each month, for each country, on a dataset that is
> severely unbalanced.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/