Nick and Sergio, Thanks.
Le
On 12/15/06, Nick Cox wrote:
> Thanks.
>
> Actually, there's at least one bug.
>
> Consider
>
> by id : gen indicator = four[_n - 1] > 0 if outcome == 2
>
> and what happens at the first member of each panel, for
> which _n == 1. Thus _n - 1 == 0. -four[0]- will always be
> treated as missing by Stata, and thus is > 0, which
> will be relevant if the first -outcome == 2-.
>
> Well, if it's the first member of the panel, evidently
> we know nothing about anything previous. So the code should
> be
>
> by id : gen indicator = (four[_n - 1] > 0) if outcome == 2 & _n > 1
>
> to ensure that the indicator is always missing for the first
> member.
>
> I parenthesised
>
> (four[_n-1] > 0)
>
> to underline that it's the expression that counts here; that is,
> Stata will evaluate this as 1 or 0 to get the indicator desired,
> although only if the -if- condition is satisfied.
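
A minimal sketch of the -four[0]- point, using a made-up three-row panel (the data values and the variable names -naive- and -guarded- are illustrative assumptions, not part of the original problem). It should show the unguarded line setting the indicator to 1 at the first observation of the panel, and the guarded line leaving it missing.

```stata
* Hypothetical three-row panel, made up for illustration only
clear
input id eventid outcome four
1 1 2 0
1 2 1 1
1 3 2 0
end

* An out-of-range subscript such as four[0] returns missing,
* and Stata treats missing as larger than any number
display four[0] > 0

* Unguarded version: the first observation of the panel gets 1
bysort id (eventid) : gen naive = four[_n - 1] > 0 if outcome == 2

* Guarded version: the first observation of the panel stays missing
by id : gen guarded = (four[_n - 1] > 0) if outcome == 2 & _n > 1

list, noobs
```

The two variables should differ only in the first row, which is the case the -_n > 1- condition is there to handle.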
>
> Nick
>
> Sergio Correia
>
> > Interesting answer. Much more bug-free.
> >
> > By the way, adding -bys id (eventid) : - to each line makes the code
> > panel-ready so it's not that much of a problem.
>
> > On 12/15/06, Nick Cox wrote:
>
> > > I haven't tried understanding Sergio's code, as it
> > > appears to take no account of the fact that this is panel data
> > > and so calculations must be done separately for each
> > > identifier.
> > >
> > > Consider starting with -id-, -eventid- and -outcome-. Then
> > >
> > > bysort id (eventid) : gen order = sum(outcome == 2) * (outcome == 2)
> > >
> > > or
> > >
> > > bysort id (eventid) : gen order = cond(outcome == 2, sum(outcome == 2), 0)
> > >
> > > -outcome == 2- evaluates as 1 or 0 depending on whether it is true
> > > or false, and the -sum()- gives you the cumulative sum.
> > >
> > > Furthermore each occurrence of -outcome == 2- defines a new
> > > "spell":
> > >
> > > by id : gen spell = sum(outcome == 2)
> > >
> > > bysort id spell (eventid) : gen four = sum(outcome == 4)
> > > by id spell : replace four = four[_N]
> > >
> > > by id : gen indicator = four[_n - 1] > 0 if outcome == 2
> > >
> > > The "spell" point of view is simple but gives you a handle
> > > on many problems in this territory. It is really piggy-backing
> > > on -by:-. For spell incantations, see -tsspell- on SSC and
> > > its quite detailed help file. For -by:- explanations, try
> > > -search by- and follow up manual and Stata Journal references.
> > >
> > > As Sergio says, -egen- is another route here but a path from
> > > first principles is always instructive.
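
A runnable sketch that strings the steps above together on the example data from the original question. One assumption goes beyond the posted code: the final -if- uses -order > 1- so that a first success preceded only by non-successes (id 1, eventid 2) stays missing, as in the requested output. With only the -outcome == 2- condition, that observation would get 0 instead.

```stata
* Example data as posted in the original question
clear
input id eventid outcome
1 1 1
1 2 2
1 3 4
1 4 2
2 1 2
2 2 4
2 3 4
2 4 3
2 5 2
3 1 2
3 2 2
3 3 1
3 4 4
3 5 2
end

* Running count of successes within each id, kept only on success rows
bysort id (eventid) : gen order = sum(outcome == 2) * (outcome == 2)

* Each success starts a new spell within its id
by id : gen spell = sum(outcome == 2)

* Number of outcome == 4 events in each spell, copied to every row of the spell
bysort id spell (eventid) : gen four = sum(outcome == 4)
by id spell : replace four = four[_N]

* At a success, the previous row belongs to the previous spell
* Assumption: restrict to order > 1 to match the requested output
by id : gen indicator = (four[_n - 1] > 0) if outcome == 2 & order > 1

list id eventid outcome order indicator, sepby(id) noobs
```

The listing should agree with the indicator column in the requested table: 1 for the second success in ids 1 and 2, 0 and then 1 for the second and third successes in id 3, and missing elsewhere.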
>
> Sergio Correia
>
> > > > This works but I'm pretty sure it's not the best way:
> > > >
> > > > * CODE:
> > > > gen xyz = outcome==4
> > > > replace xyz = (xyz[_n-1]==1 | xyz ==1) & (outcome!=2)
> > > > gen indicator = xyz[_n-1]==1 if order>1
> > > >
> > > >
> > > > Line 1 is straightforward.
> > > > Line 2 is 1 if there has been an outcome of 4 since the last success
> > > > (and we are not on a successful outcome).
> > > > Line 3 is also simple.
> > > >
> > > > The last two lines can be merged but that would make them harder to
> > > > understand. Again, I'm sure there are better answers (maybe egen, sum
> > > > or more complex gens).
> > >
> > > Le Wang
> > >
> > > > > I have a data set containing four variables
> > > > >
> > > > > (1) household id (2) event id (3) event outcome (4) order of success
> > > > >
> > > > > Event outcomes can take on values of 1, 2, 3, 4; if the event outcome is
> > > > > 2, it is successful and ordered according to the timing of occurrence
> > > > > of the success (recorded in the fourth variable "order of success").
> > > > > The data looks like what follows,
> > > > >
> > > > > ------------------------------------------------
> > > > > id   eventid   outcome   order of success
> > > > > 1    1         1         0
> > > > > 1    2         2         1
> > > > > 1    3         4         0
> > > > > 1    4         2         2
> > > > > 2    1         2         1
> > > > > 2    2         4         0
> > > > > 2    3         4         0
> > > > > 2    4         3         0
> > > > > 2    5         2         2
> > > > > 3    1         2         1
> > > > > 3    2         2         2
> > > > > 3    3         1         0
> > > > > 3    4         4         0
> > > > > 3    5         2         3
> > > > > .
> > > > > .
> > > > > .
> > > > > .
> > > > > ------------------------------------------------
> > > > >
> > > > > What I want to do is create a variable for obs with the order of
> > > > > success greater than 1; this variable indicates whether or not there
> > > > > exists an event outcome equal to 4 during the interval between this
> > > > > success and the previous success. The final data for the example
> > > > > should look like the following
> > > > >
> > > > > ------------------------------------------------------------
> > > > > id   eventid   outcome   order of success   indicator
> > > > > 1    1         1         0                  .
> > > > > 1    2         2         1                  .
> > > > > 1    3         4         0                  .
> > > > > 1    4         2         2                  1
> > > > > 2    1         2         1                  .
> > > > > 2    2         4         0                  .
> > > > > 2    3         4         0                  .
> > > > > 2    4         3         0                  .
> > > > > 2    5         2         2                  1
> > > > > 3    1         2         1                  .
> > > > > 3    2         2         2                  0
> > > > > 3    3         1         0                  .
> > > > > 3    4         4         0                  .
> > > > > 3    5         2         3                  1
> > > > > .
> > > > > .
> > > > > .
> > > > > .
>
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Le Wang, Ph.D.
Minnesota Population Center
University of Minnesota
(o) 612-624-5818