Re: st: Restructuring the time dimension in a dataset
From: Maarten Buis <[email protected]>
To: [email protected]
Subject: Re: st: Restructuring the time dimension in a dataset
Date: Fri, 11 Oct 2013 21:03:59 +0200
Question 1: see -help stsplit-, which splits time-span records into shorter episodes.
Question 2: that depends on so many things....
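
For question 1, if you would rather not -stset- the data first, a minimal sketch with -expand- (swapped in here instead of -stsplit-) might look like the following. The variable names (id, start, stop, value, spell, day, month) are made up for illustration, and the two rows are your example for individual 4115111. I replaced 31-09-2011 with 30-09-2011, because September has only 30 days and -daily()- would return missing for that date. The sketch assumes each period covers its start day up to, but not including, its end day, as in your fragmentation.

```stata
clear
* hypothetical variable names; the data are the two example rows from the question
input long id str10 start str10 stop value
4115111 "01-01-2010" "05-01-2010" 0.574
4115111 "05-01-2010" "30-09-2011" 0.321
end

* convert the day-month-year strings to Stata daily dates
gen start_d = daily(start, "DMY")
gen stop_d  = daily(stop, "DMY")
format start_d stop_d %td

* number the original periods so that each one can be expanded separately
gen long spell = _n

* one row per day in [start_d, stop_d);
* note that -expand- keeps a single row when ndays is 0, 1, or missing
gen ndays = stop_d - start_d
expand ndays
bysort spell: gen day = start_d + _n - 1
format day %td

* average the daily values within each calendar month
gen month = mofd(day)
format month %tm
collapse (mean) value, by(id month)
list, sepby(id)
```

With these two rows, January 2010 contains 4 days at 0.574 and 27 days at 0.321, so its monthly value is a day-weighted average of roughly 0.354. Whether a day-weighted mean is the right monthly summary is part of question 2.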
Hope this helps,
Maarten
On Fri, Oct 11, 2013 at 8:47 PM, Tunga Kantarcı <[email protected]> wrote:
> Hello,
>
> I have a dataset where ‘variable one’ indicates a unique
> identification number for each individual in the data. Then there is
> ‘variable two’ which indicates a date (like 01-01-2010) which is the
> start date of a period and ‘variable three’ indicates a date (like
> 05-01-2010) which is the end date of the same period. Then there is
> ‘variable four’ which indicates a number between 0 and 1 (like 0.574)
> that has been realised during the period 01-01-2010 - 05-01-2010.
>
> A snapshot of the data sheet for individual 4115111 looks like this:
>
> 4115111 01-01-2010 05-01-2010 0.574
> 4115111 05-01-2010 31-09-2011 0.321
>
> In this dataset, as the snapshot also shows, the length of a period is
> irregular. It can be as short as a day (like 01-01-2010 – 02-01-2010)
> or as long as a year (like 01-01-2010 - 01-01-2011), or even longer.
> Hence it is not clear how I should treat the time dimension of the
> data. The cases of variable four are not observed on a monthly or
> yearly basis. I plan to restructure the data. That is, I plan to
> fragment each period into multiple periods with a length of one day
> and then aggregate them to, say, a month. This means that the first
> period, which is
>
> 4115111 01-01-2010 05-01-2010 0.574,
>
> would be fragmented into
>
> 4115111 01-01-2010 02-01-2010 0.574
> 4115111 02-01-2010 03-01-2010 0.574
> 4115111 03-01-2010 04-01-2010 0.574
> 4115111 04-01-2010 05-01-2010 0.574,
>
> and the second period, which is
>
> 4115111 05-01-2010 31-09-2011 0.321,
>
> would be fragmented into
>
> 4115111 05-01-2010 06-01-2010 0.321
> .
> .
> 4115111 30-09-2011 31-09-2011 0.321.
>
> After this fragmentation, I plan to collapse the daily series to
> monthly series which would mean that variable four will be averaged
> over the days of a month to make up a monthly number, perhaps using
> the “collapse variable four, by(variable two)” command. In the end I
> would like to have monthly data.
>
> Given this explanation, I would like to ask two questions.
>
> Question one: In Stata, how can I fragment each case (that is each row
> in the data) into multiple cases (multiple rows) with respect to
> variable two and variable three as explained above?
>
> Question two: If it was your own data, how would you treat it? Would
> your approach be the same as mine?
>
> Tunga
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
--
---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany
http://www.maartenbuis.nl
---------------------------------