Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: Create Timeline based on Dates
From
Nick Cox <[email protected]>
To
[email protected]
Subject
Re: st: Create Timeline based on Dates
Date
Mon, 29 Oct 2012 18:30:53 +0000
Thanks for the closure!
On Mon, Oct 29, 2012 at 5:58 PM, Lisa Wang <[email protected]> wrote:
> Dear Nick,
>
> I did a combination of the two suggestions (dow and then also the
> forward/backward time differences) and it worked brilliantly. Exactly
> as I wanted!!
>
> Thank you so much for your time and effort. You are my rescuer!
>
> Best regards,
> Lisa
>
> On Fri, Oct 26, 2012 at 8:18 PM, Nick Cox <[email protected]> wrote:
>> On weekends:
>>
>> It seems that you have Mondays to Fridays only in your data, and you
>> want time to run in sequence so that Monday follows Friday.
>>
>> The -dow()- function returns 1 to 5 for Mondays to Fridays. A bit of
>> messing around yields a mapping to sequential "dates" that omit
>> Saturdays and Sundays. (If you want another origin, you can just shift
>> it.)
>>
>> Again, I need a sandpit.
>>
>> clear
>> set obs 14
>> gen date = mdy(10, 20, 2012) + _n
>> format date %td
>> gen dow = dow(date)
>>
>> . l
>>
>> +-----------------+
>> | date dow |
>> |-----------------|
>> 1. | 21oct2012 0 |
>> 2. | 22oct2012 1 |
>> 3. | 23oct2012 2 |
>> 4. | 24oct2012 3 |
>> 5. | 25oct2012 4 |
>> |-----------------|
>> 6. | 26oct2012 5 |
>> 7. | 27oct2012 6 |
>> 8. | 28oct2012 0 |
>> 9. | 29oct2012 1 |
>> 10. | 30oct2012 2 |
>> |-----------------|
>> 11. | 31oct2012 3 |
>> 12. | 01nov2012 4 |
>> 13. | 02nov2012 5 |
>> 14. | 03nov2012 6 |
>> +-----------------+
>>
>> gen seqdate = (5 * (date - dow(date) - 2) / 7) + dow(date)
>> replace seqdate = . if inlist(dow(date), 0, 6)
>>
>> . l
>>
>> +---------------------------+
>> | date dow seqdate |
>> |---------------------------|
>> 1. | 21oct2012 0 . |
>> 2. | 22oct2012 1 13776 |
>> 3. | 23oct2012 2 13777 |
>> 4. | 24oct2012 3 13778 |
>> 5. | 25oct2012 4 13779 |
>> |---------------------------|
>> 6. | 26oct2012 5 13780 |
>> 7. | 27oct2012 6 . |
>> 8. | 28oct2012 0 . |
>> 9. | 29oct2012 1 13781 |
>> 10. | 30oct2012 2 13782 |
>> |---------------------------|
>> 11. | 31oct2012 3 13783 |
>> 12. | 01nov2012 4 13784 |
>> 13. | 02nov2012 5 13785 |
>> 14. | 03nov2012 6 . |
>>
>> If you are now going to tell me that the schools have
>> {holidays|vacations} too, then you really need a business calendar.
>>
>>
>> On Fri, Oct 26, 2012 at 9:36 AM, Nick Cox <[email protected]> wrote:
>>> For problems like this I need a sandpit to play in. Here is one I made:
>>>
>>> list, sepby(id)
>>>
>>> +-------------------+
>>> | id date event |
>>> |-------------------|
>>> 1. | 1 13 0 |
>>> 2. | 1 13 0 |
>>> 3. | 1 14 1 |
>>> 4. | 1 15 0 |
>>> 5. | 1 17 0 |
>>> 6. | 1 18 1 |
>>> 7. | 1 19 0 |
>>> |-------------------|
>>> 8. | 2 14 0 |
>>> 9. | 2 15 0 |
>>> 10. | 2 15 1 |
>>> 11. | 2 17 0 |
>>> 12. | 2 18 0 |
>>> 13. | 2 19 1 |
>>> 14. | 2 20 0 |
>>> +-------------------+
>>>
>>> I see this as follows.
>>>
>>> 1. There is a date looking forward, which is (present date - previous
>>> event date), and is thus zero or positive
>>>
>>> 1'. There is a twist on 1: There can be multiple observations with the
>>> same date.
>>>
>>> 2. There is a date looking backward which is (present date - next
>>> event date), and is thus zero or negative
>>>
>>> 2'. As 1'.
>>>
>>> 3. The wanted date is the smaller in absolute value. If there is a tie
>>> in absolute value, I choose the positive value.
>>>
>>> 4. For dates before the first event, no previous date can be
>>> identified. But this is not a problem, as the backward date will be
>>> the solution for these dates.
>>>
>>> 4. For dates after the last event, no next date can be identified. But
>>> this is not a problem, as the forward date will be the solution for
>>> these dates.
>>>
>>> To get "forward dates", we just copy previous values as needed, after
>>> spreading each event to all dates that are the same:
>>>
>>> gen prev = date if event == 1
>>> bysort id date (prev) : replace prev = prev[1] if prev[1] == 1
>>> bysort id (date) : replace prev = prev[_n-1] if mi(prev)
>>> gen forward = date - prev
>>>
>>> To get "backward dates", we can use the trick of reversing time.
>>>
>>> gen negdate = -date
>>> gen next = date if event == 1
>>> bysort id negdate (next) : replace next = next[1] if next[1] == 1
>>> bysort id (negdate) : replace next = next[_n-1] if mi(next)
>>> gen backward = date - next
>>>
>>> Now can we do the comparison:
>>> .
>>> gen timeline = cond(abs(forward) <= abs(backward), forward, backward)
>>> sort id date
>>>
>>> list id date forw backw timeline, sepby(id)
>>>
>>> +-------------------------------------------+
>>> | id date forward backward timeline |
>>> |-------------------------------------------|
>>> 1. | 1 13 . -1 -1 |
>>> 2. | 1 13 . -1 -1 |
>>> 3. | 1 14 0 0 0 |
>>> 4. | 1 15 1 -3 1 |
>>> 5. | 1 17 3 -1 -1 |
>>> 6. | 1 18 0 0 0 |
>>> 7. | 1 19 1 . 1 |
>>> |-------------------------------------------|
>>> 8. | 2 14 . -1 -1 |
>>> 9. | 2 15 0 0 0 |
>>> 10. | 2 15 0 0 0 |
>>> 11. | 2 17 2 -2 2 |
>>> 12. | 2 18 3 -1 -1 |
>>> 13. | 2 19 0 0 0 |
>>> 14. | 2 20 1 . 1 |
>>> +-------------------------------------------+
>>>
>>> For the complication with weekends, Stata offers business calendars as
>>> a complete solution. I have never used them.
>>>
>>> On Fri, Oct 26, 2012 at 1:02 AM, Lisa Wang <[email protected]> wrote:
>>>
>>>> I would like to create a timeline based on some event date (ie. ...-5,
>>>> -4, -3, -2, -1, 0, +1, +2, +3...etc). I have different students names
>>>> in a variable named "as" (column 1) and also a set of dates (column
>>>> 2) as well as another variable 'edate' (column 3) which has the event
>>>> dates and . everywhere else if it didn't match with column 2. What I
>>>> would like to know is how to create the timeline with the event date
>>>> being 0 for each student.
>>>>
>>>> This is the code I have run so far:
>>>>
>>>> - bysort as: generate rank =_n
>>>>
>>>> . bysort as: generate erank = rank if date==edate
>>>>
>>>> . bysort as: egen erank_pop = min(erank)
>>>>
>>>> . bysort as: generate t = rank -erank_pop -
>>>>
>>>> There are three problems which have me now stuck.
>>>>
>>>> 1. I might have multiple observations for a particular student on the
>>>> same date as well. Therefore, when I run the first line of code, it's
>>>> already erroneous as Stata will treat it as being different dates. I
>>>> tried also -bysort as(date): generate rank =_n - instead but it
>>>> returns an error: "factor variables and time-series operators not
>>>> allowed".
>>>>
>>>> 2. Sometimes I have multiple event dates for a particular student - I
>>>> would like Stata to guess which event date the date is closer to and
>>>> then do the time differences from that.
>>>>
>>>> 3. The dates in column 2 have all weekdays but no weekends (as the
>>>> students don't need to go to school on those days), so if I do a
>>>> timeline then it will skip some dates (eg. -5 then to -2,-1 etc. as a
>>>> result of the weekend). How would I overcome this, so that it actually
>>>> is -3,-2,-1 etc?
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/