Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Relative Comparision between Observations
From: [email protected]
To: [email protected]
Subject: Re: st: Relative Comparision between Observations
Date: Thu, 25 Aug 2011 16:43:34 +0200
Hi Nick,
thanks a lot.
The dataset contains 500,000 transactions (in addition to the 7 million spreads), but I will use your approach as a starting point for an algorithm that can cope with a dataset of this size.
Any suggestion to get this done quickly is still very welcome.
Best regards and thanks again,
Jens
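
A minimal sketch of one possible long-form alternative to one indicator variable per transaction, using the four-observation example quoted below. It pairs every spread with every transaction via -cross-, keeps the pairs where the transaction date lies in [start, end], and reshapes to the tr1, tr2 layout asked for. The names tr_id, j and the tempfile trans are introduced here for illustration only and are not from the thread. Note the assumption that all pairs fit in memory: -cross- creates (number of spreads) x (number of transactions) observations, so with 7 million spreads and 500,000 transactions it would have to be run on chunks of spreads rather than on the whole dataset at once.

```stata
* Toy data copied from the example in the quoted message
clear
input id isspread start end transaction
1 1 3 6 .
2 0 . . 5
3 1 2 5 .
4 0 . . 5.5
end

* Save the transactions on their own, under a distinct id name
* (tr_id is a hypothetical name chosen for this sketch)
preserve
keep if isspread == 0
keep id transaction
rename id tr_id
tempfile trans
save `trans'
restore

* Keep the spreads and form every spread-transaction pair
keep if isspread == 1
keep id start end
cross using `trans'

* Keep only pairs whose transaction date lies in [start, end];
* inrange() includes both endpoints
keep if inrange(transaction, start, end)

* Number the matches within each spread, then go wide: tr1, tr2, ...
bysort id (tr_id): gen j = _n
keep id tr_id j
rename tr_id tr
reshape wide tr, i(id) j(j)
list
```

The result has one observation per spread with at least one matching transaction; it could be merged back onto the original data by id if the transaction rows are also needed.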
-------- Original Message --------
> Date: Thu, 25 Aug 2011 15:20:58 +0100
> From: Nick Cox <[email protected]>
> To: [email protected]
> Subject: Re: st: Relative Comparision between Observations
> For -transaction[2]- (e.g.) you can generate
>
> . gen within_2 = inrange(transaction[2], start, end) & isspread
>
> Is the number of transactions small enough to allow a variable for
> every one of them?
>
> If so, this is crude but should work
>
> forval i = 1/`=_N' {
>     if isspread[`i'] == 0 {
>         gen within_`i' = inrange(transaction[`i'], start, end) & isspread
>     }
> }
>
> A visceral reaction is that getting the wrong data structure is
> horribly easy here, but people who work with this kind of data may be
> able to advise constructively.
>
> Nick
>
> On Thu, Aug 25, 2011 at 2:55 PM, Jens Kruk <[email protected]> wrote:
> > Hi Nick,
> > let's say the data looks like this:
> >
> > id____isspread____start____end____transaction
> > 1_____1___________3________6______.
> > 2_____0___________.________.______5
> > 3_____1___________2________5______.
> > 4_____0___________.________.______5.5
> >
> >
> >
> > Now what I want Stata to do is to tell me (for example by creating
> > additional variables that contain the ids) that ids 2 and 4 occurred
> > between the start and end date of observation 1 (5 and 5.5 are between
> > 3 and 6) and that id 2 occurred between the start and end date of
> > spread 3 (5 is weakly between 2 and 5).
> > A perfect result of the procedure would look like this:
> >
> > id____isspread____start____end____transaction____tr1___tr2
> > 1_____1___________3________6______.______________2_____4__
> > 2_____0___________.________.______5______________._____.__
> > 3_____1___________2________5______.______________2_____.__
> > 4_____0___________.________.______5.5____________._____.__
> >
> >
> > Best, Jens
> >
> >
> >
> >
> > -------- Original Message --------
> >> Date: Thu, 25 Aug 2011 14:22:19 +0100
> >> From: Nick Cox <[email protected]>
> >> To: [email protected]
> >> Subject: Re: st: Relative Comparision between Observations
> >
> >> Please show a representative chunk of your data so that precisely what
> >> are your variables and your observations becomes clear.
> >>
> >> Nick
> >>
> >> On Thu, Aug 25, 2011 at 2:09 PM, <[email protected]> wrote:
> >>
> >> > I want to perform the following task for a very large dataset (so
> >> > writing a Mata loop is probably not the solution): the dataset
> >> > consists of two sorts of data: spreads and transactions. Spreads have
> >> > a start and an end date, while transactions only have a transaction
> >> > date. Now I want to know whether some transaction happened between
> >> > the start and end date of a spread. Ideally, I would like to have
> >> > variables containing, for each spread, the ids of all transactions
> >> > that occurred between the start and end date of that spread. Is there
> >> > a way to use inexact matching or merging for this?
> >> > This should be a familiar problem; however, I do not have a clue how
> >> > to solve it.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/