Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Finding patterns of consecutive number
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: RE: st: Finding patterns of consecutive number
Date: Thu, 26 Apr 2012 17:10:55 +0100
Nick
[email protected]
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Marshall Garland
Sent: 26 April 2012 17:04
To: [email protected]
Subject: Re: st: Finding patterns of consecutive number
Hi Nick-
Thanks for the simplified code and the links. This was far easier and
more intuitive than the programming somersaults that I was attempting.
I had read this FAQ, but I was conceptually struggling with how to
apply it to my circumstance.
Cheers,
-mwg
On Thu, Apr 26, 2012 at 3:45 AM, Nick Cox <[email protected]> wrote:
> The code will simplify as
>
> if _n == 1 | (test - test[_n-1] != 1)
>
> could be written
>
> if (test - test[_n-1] != 1)
>
> because -test[0]- will be evaluated as missing. But in practice with
> spell problems, the first observation in a panel often needs explicit
> attention as we know nothing about what preceded it. And code that
> deals explicitly with the first observation is often easier to
> understand.
>
> This may also be of interest:
>
> FAQ . . . . . . Identifying runs of consecutive observations in panel data
> . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox and V. Wiggins
> 8/02 How do I identify runs of consecutive observations
> in panel data?
> http://www.stata.com/support/faqs/data/panel.html
>
> On Thu, Apr 26, 2012 at 2:58 AM, Marshall Garland
> <[email protected]> wrote:
>
>> This is exactly what I needed.
>>
>> Thanks so much for your help and prompt reply.
>
> On Wed, Apr 25, 2012 at 8:24 PM, Nick Cox <[email protected]> wrote:
>
>>> I think of your problem as defining spells of consecutive integers, so
>>> that a spell starts with the first observation in each panel or if the
>>> previous value was not one fewer.
>>>
>>> bysort id (year) : gen progress = string(test) if _n == 1 | (test - test[_n-1] != 1)
>>> by id : replace progress = progress[_n-1] + string(test) if missing(progress)
>>>
>>> Dealing with spells: see also -tsspell- (SSC) or
>>>
>>> SJ-7-2 dm0029 . . . . . . . . . . . . . . Speaking Stata: Identifying spells
>>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
>>> Q2/07 SJ 7(2):249--265 (no commands)
>>> shows how to handle spells with complete control over
>>> spell specification
>>>
>>> By the way, as the putative author of -tostring-, I note that it is
>>> overkill here. The -string()- function is all you need.
>
> On Wed, Apr 25, 2012 at 11:47 PM, Marshall Garland
> <[email protected]> wrote:
>>>
>>>> I have panel student testing data spanning six years. Each year, I
>>>> have a unique student and student test level and outcome. Testing
>>>> levels across years are not necessarily consecutive, nor are years.
>>>> For each student, in each year, I'd like to create a variable that
>>>> captures the longitudinal test progression for each student, in each
>>>> year. However, for each year, I'd like the maximum consecutive test
>>>> progression, without disruptions. This maximum test progression should
>>>> only be calculated for consecutive years, too.
>>>>
>>>> I've posted my data at the end of this message, which will help
>>>> describe my objective. For student A, in 2008/09, her test progression
>>>> is 6543, since she had 4 consecutive years of test data. This is
>>>> perfect. Student B, however, in 2008/09, has a test progression of
>>>> 7643. However, I only want to record, for student B, the maximum
>>>> consecutive test progression, which is 76 and ignore the 43. The 43
>>>> progression will be captured in the corresponding year (2006/07).
>>>>
>>>> I can't figure out a way to adjust for this discontinuity. I've tried
>>>> a number of things, including this. But, this still captures repeated
>>>> test levels across years (student C below, in 2008/09).
>>>>
>>>> Thanks for help in advance.
>>>>
>>>> Cheers,
>>>>
>>>> -mwg
>>>>
>>>> ****************************************************
>>>> bys research_id: gen test_t=d.test_level_2
>>>> bys research_id: egen max_test_t=max(test_t)
>>>>
>>>> * group creation for consecutive runs
>>>> forvalues i=0/6 {
>>>> gen group_`i'=.
>>>> bys research_i (sch_yr): replace group_`i'=test_level_2[_n-`i'] if max_test_t==1 & test_t==1
>>>> tostring group_`i', replace
>>>> replace group_`i'="" if group_`i'=="."
>>>> }
>>>>
>>>> egen group_ty_cons=concat(group_0- group_6)
>>>> tab group_ty_cons
>>>> **************************************************************
>>>>
>>>> Here's my data:
>>>> student  year     test_level  progression
>>>> A        2005/06  3
>>>> A        2006/07  4           43
>>>> A        2007/08  5           543
>>>> A        2008/09  6           6543
>>>> B        2005/06  3
>>>> B        2006/07  4           43
>>>> B        2007/08  6           643
>>>> B        2008/09  7           7643
>>>> C        2005/06  6
>>>> C        2006/07  7           76
>>>> C        2007/08  8           876
>>>> C        2008/09  8           8876
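
For readers who want to try the approach from the thread, here is a minimal sketch that applies the two lines from Nick Cox's reply to the posted data. The variable names id, year and test follow his reply. Coding year as a number (the starting year of each school year) is an assumption made here for convenience, and so are the names start_long and start_short, which only check that the long and short forms of the -if- condition mark the same spell starts.

```stata
* Sketch only: id, year, test follow the names used in the reply;
* the numeric year (start of the school year) is an assumption.
clear
input str1 id int year byte test
"A" 2005 3
"A" 2006 4
"A" 2007 5
"A" 2008 6
"B" 2005 3
"B" 2006 4
"B" 2007 6
"B" 2008 7
"C" 2005 6
"C" 2006 7
"C" 2007 8
"C" 2008 8
end

* A spell starts at the first observation of each panel, or whenever
* the previous test level was not exactly one lower.
bysort id (year) : gen progress = string(test) if _n == 1 | (test - test[_n-1] != 1)

* Within a spell, append the current level to the running string.
by id : replace progress = progress[_n-1] + string(test) if missing(progress)

* The shorter condition marks the same spell starts, because
* test[_n-1] is missing in the first observation of each panel.
by id : gen byte start_long  = _n == 1 | (test - test[_n-1] != 1)
by id : gen byte start_short = (test - test[_n-1] != 1)
assert start_long == start_short

list id year test progress, sepby(id) noobs
```

As written, this builds each string oldest level first, so student A ends with 3456 and student B's last two years give 6 and 67. To get the newest-first form shown in the question (6543), the -replace- line can concatenate in the other order, string(test) + progress[_n-1]. Note also that the condition only looks at test levels; it does not check that the years themselves are consecutive.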