Re: st: Coding overlap events in sequence data

From	Nick Cox <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: Coding overlap events in sequence data
Date	Wed, 13 Nov 2013 10:01:01 +0000

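There is an easy thing you can do. With your sample data:

* stack all the begin/end ages; _j holds the suffix ("Bag1", "Eag1", ...)
reshape long sex, i(id) string
rename sex age
drop if age == .
* +1 at each beginning, -1 at each end, accumulated in age order within id
bysort id (age) : gen number = sum((substr(_j, 1, 1) == "B") - (substr(_j, 1, 1) == "E"))
list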

So, you can keep a running tally by adding 1 every time someone starts
a relationship (_j starts with "B") and subtracting 1 every time it
stops (_j starts with "E"). The tally is the number of relationships in
progress at each age, so a value of 2 or more means an overlap. You can
summarize by using -egen-, e.g.

egen max = max(number), by(id)
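
To tie this back to the Time 2 question, here is a minimal sketch that types in the four example cases and takes the maximum of the tally over events falling in ages 18.25 to 18.5 only. The variable names max_t2 and overlap_t2 are made up for illustration, -sext2- is left out of the input because it shares the stub "sex" and would otherwise be swept into the -reshape-, and treating both window endpoints as inclusive is an assumption.

```stata
clear
input id sexBag1 sexEag1 sexBag2 sexEag2 sexBag3 sexEag3 sexBag4 sexEag4 sexBag5 sexEag5
1 18.5   18.75  18.333 18.416 21.416 21.5   19.25  24.083 .    .
2 .      .      18.25  18.333 18.416 21.666 21.583 22.833 22.5 22.50004
3 17.249 18.999 18.499 21.999 22.166 23.249 .      .      .    .
4 16.75  22.75  18.416 18.666 .      .      .      .      .    .
end

* one observation per begin/end age; _j is "Bag1", "Eag1", ...
reshape long sex, i(id) string
rename sex age
drop if age == .

* running count of relationships in progress, in age order within id
bysort id (age) : gen number = sum((substr(_j, 1, 1) == "B") - (substr(_j, 1, 1) == "E"))

* highest count among events inside Time 2 (endpoints assumed inclusive)
egen max_t2 = max(cond(inrange(age, 18.25, 18.5), number, .)), by(id)

* hypothetical recode: 1 = no overlap in Time 2, 2 = overlap in Time 2
gen overlap_t2 = cond(max_t2 >= 2, 2, 1) if !missing(max_t2)

* show one line per respondent
egen tag = tag(id)
list id max_t2 overlap_t2 if tag
```

On these four cases that gives 1 for ids 1 and 2 and 2 for ids 3 and 4. Two caveats: the sketch only looks at begin and end ages inside the window, so two relationships that both span the whole of Time 2 would need the tally carried in from the last earlier event; and if one relationship ends at exactly the age at which another begins, the sort order within the tie decides whether the tally briefly reaches 2.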

Nick
[email protected]


On 13 November 2013 08:56, Nick Cox <[email protected]> wrote:
> This problem would be easier after a -reshape long-. Your present data
> structure makes it really difficult.
>
> After that, check out
>
> -spellutil- (SSC)
>
> -disjoint- (SSC)
>
> which may help.
> Nick
> [email protected]
>
>
> On 12 November 2013 22:26, Cheng, Hsu-Chih <[email protected]> wrote:
>> Dear All:
>>
>> I am coding sequence data for 5013 respondents’ sexual relationship histories.  For each respondent, I have 16 time positions (ages 18~18.25 [sext1], 18.25~18.50 [sext2],…, 21.75~22 [sext16]) and the respondent’s beginning and end ages of up to 48 relationships (most respondents have fewer than 5 relationships; so the beginning and end ages of the other 40+ relationships have missing values). Right now, respondents with multiple relationships in a given time position are coded as 4 (so, for example, sext5 = 4).  I can manually go through all respondents to determine whether the multiple relationships in a given time overlap or not and recode them into different categories, but this is very tedious and time consuming.  Is there a faster way to do this?
>>
>> Here are four examples with multiple relationships in Time 2 (ages 18.25~18.50).  sexBag1 and sexEag1 indicate the beginning and end ages of relationship 1; sexBag2 and sexEag2 indicate the beginning and end ages of relationship 2; and so on.  I want to recode [sext2] for Cases 1 and 2 as 1 to indicate that their relationships in Time 2 do not overlap, and Cases 3 and 4 as 2 to indicate that their relationships in Time 2 overlap.
>>
>> id  sext2   sexBag1  sexEag1  sexBag2  sexEag2  sexBag3 sexEag3  sexBag4  sexEag4  sexBag5    sexEag5
>> 1       4      18.5    18.75   18.333   18.416   21.416    21.5    19.25   24.083        .          .
>> 2       4         .        .   18.250   18.333   18.416  21.666   21.583   22.833     22.5   22.50004
>> 3       4    17.249   18.999   18.499   21.999   22.166  23.249   (missing values after this)
>> 4       4    16.750   22.750   18.416   18.666   (missing values after this)
>>
>> I would really appreciate it if anyone could give me some suggestions. I can provide more information about the data if needed. Thanks again.
>>
>> Best,
>>
>> Simon
>>
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
