Statalist The Stata Listserver


Re: st: Count number of observations in date range


From   "Austin Nichols" <[email protected]>
To   [email protected]
Subject   Re: st: Count number of observations in date range
Date   Thu, 19 Apr 2007 15:49:00 -0400

Sebastian--
I also assume one event per obs. The line

replace in_range = ((start_date >= global_start & start_date <= global_end) | ///
    (end_date >= global_start & end_date <= global_end)) & event != "`el'"

seems to me to ask if the start date and/or end date of each window is
in the window of the current event (`el'), with endpoints global*.
This identifies windows that partially overlap the current event's
window, and windows contained in the current event's window, but not
windows that contain the current event's window.  Right?
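To make the missing case concrete, here is a minimal sketch on made-up data. It assumes start_date <= end_date in every observation, and the locals s and e are hypothetical names holding the current event's dates; they are not from Sebastian's code. The idea is that two closed windows overlap exactly when each one starts no later than the other ends, which also covers a window that contains the current one.

```stata
* Sketch only: toy data, hypothetical locals s and e, start_date <= end_date assumed.
clear
input str1 event start_date end_date
"a" 10 20
"b"  5 30
"c" 21 25
end

* Window of the current event (here "a"), typed in by hand for the illustration.
local el "a"
local s = 10
local e = 20

* Overlap test: the other window starts by the time the current one ends,
* and ends no earlier than the current one starts.
generate byte in_range = start_date <= `e' & end_date >= `s' & event != "`el'"
list, noobs clean
```

Event b (5 to 30) contains the window of a (10 to 20), so neither of its endpoints falls inside a's window; the endpoint-based condition above skips it, while this test flags it.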

Also, the code

      replace global_start = start_date if event == "`el'"
      replace global_end   = end_date if event == "`el'"
      bys global_start: replace global_start = global_start[1]
      by global_start : replace global_end = global_end[1]

is inefficient, in the sense that you only want to pick up two scalars
from the current event (start and end dates).
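One way to pick up just those two values, as a sketch rather than a drop-in replacement: -summarize, meanonly- leaves the value in r(min), which can be stored in a local. This assumes event uniquely identifies observations; the locals s and e are hypothetical names, not part of the original code.

```stata
* Sketch only: toy data; assumes one observation per event.
clear
input str1 event start_date end_date
"a" 10 20
"b"  5 30
end

local el "b"

* With exactly one matching observation, r(min) is that observation's value.
summarize start_date if event == "`el'", meanonly
local s = r(min)
summarize end_date if event == "`el'", meanonly
local e = r(min)

display "window of `el': `s' to `e'"
```

The locals can then be used directly in the overlap condition, with no need for the global_start and global_end variables.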

If I'm wrong, please correct me.  And I'm fairly certain there's a
better solution than the one I proposed...

On 4/19/07, Sebastian F. Büchte <[email protected]> wrote:
Austin,

it would be very helpful if you could point out in more detail
where my code fails to identify an overlap. I admit that I
assume the entries in the "event" variable uniquely identify
observations. If that were not the case, I would miss overlaps. But
under this assumption I cannot figure out which overlaps I would miss.

Regards
Sebastian

On 4/19/07, Austin Nichols <[email protected]> wrote:
> Jarl Svartsjö --
> I don't think that Sebastian's code deals with all overlaps.
> Another approach would be to loop over observations, which I outline below.
>
> clear
> input event start end
>        1  16081 16108
>        2  16081 16109
>        3  16084 16092
>        4  16091 16143
>        5  16100 16105
>        6  16109 16143
>        7  16110 16110
> end
> format st end %d
> li , noo clean
> g long nsimul=.
> forv i=1/`=_N' {
>  loc c "inrange(st,st[`i'],end[`i'])"
>  loc c "`c'|inrange(end,st[`i'],end[`i'])"
>  loc c "`c'|inrange(st[`i'],st,end)"
>  loc c "`c'|inrange(end[`i'],st,end)"
>  g byte s=`c'
>  g long sum=sum(s)
>  replace nsimul=sum[_N] in `i'
>  drop s sum
> }
> g long others=nsimul-1
> compress
> li , noo clean
>
> The conditions for overlap with the current obs are collected in a
> local `c' and
>  g byte s=`c'
> determines which windows overlap with the current obs, but all that
> could be in one -gen- command (which might break awkwardly across
> lines in an email).
>
> On 4/19/07, Sebastian F. Büchte <[email protected]> wrote:
> > I do not think there exists a _simple_ Stata command that would help
> > you achieve what you are asking for. But you could try to program it
> > yourself. I prepared some example code to show how I would do it.
> > As always, I expect the code can be improved, or maybe
> > some expert knows a Stata command that can do the trick.
> >
> > On 4/19/07, Jarl Svartsjö <[email protected]> wrote:
> > > I have two date variables in my dataset, which define
> > > the start and end dates for a certain event. For each
> > > observation in the dataset, I'd like to count the
> > > number of other observations in the dataset that
> > > overlap with this event window.