Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Creating a contact tracing file (levelsof & nested foreach loops)
From: "Puddicombe, David" <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: Creating a contact tracing file (levelsof & nested foreach loops)
Date: Fri, 28 Sep 2012 08:40:37 -0700
Hi,
I would like to create a file that indicates which cases of an infectious disease outbreak may have been the source of infection for later cases.
The variables in my dataset are
id_n (non-identifying id, values 1 to 83)
date_onset (%td date of onset of disease variable)
city (city where each case resides, numeric variable)
Cases occurred in seven cities:
       City |      Freq.     Percent        Cum.
------------+-----------------------------------
     city_1 |         46       55.42       55.42
     city_2 |          2        2.41       57.83
     city_3 |         12       14.46       72.29
     city_4 |          3        3.61       75.90
     city_5 |         18       21.69       97.59
     city_7 |          1        1.20       98.80
     city_8 |          1        1.20      100.00
------------+-----------------------------------
      Total |         83      100.00
Within each city, I would like to create, for each case, indicator variables (0 = No, 1 = Yes) flagging which of the subsequent cases it may have infected, together with variables holding the id number of each potential subsequent case.
The rules to define whether a case might be the source of infection for a subsequent case are:
they are in the same city; and
7 <= (date_onset[case2] - date_onset[case1]) <= 18, where case1 is the earlier case and case2 the subsequent one
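A note on writing the date rule in Stata: a chained comparison such as 7 <= x <= 18 is evaluated left to right, so the first comparison returns 0 or 1 and the second is then always true. A minimal sketch of the window test using inrange() instead, with two made-up onset dates held in scalars (d1 and d2 are hypothetical names for illustration, not variables in the dataset):

```stata
* Hypothetical onset dates, one day apart, for an earlier and a later case
scalar d1 = td(09mar2010)
scalar d2 = td(10mar2010)

* Chained form: (7 <= 1) evaluates to 0, and 0 <= 18 is true,
* so this displays 1 even though the gap is only 1 day
display 7 <= (d2 - d1) <= 18

* inrange(z, a, b) is 1 only if a <= z <= b, so this displays 0
display inrange(d2 - d1, 7, 18)
```

The same inrange() expression can be used in an if qualifier or in generate once the two onset dates sit on the same row.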
For example, case 1 in city_1 has 45 candidate subsequent cases (the other cases in city_1). I would like to create two variables for each of these 45 potential contacts (potential_contact_n_yn and potential_contact_n_id). For case 2 in city_1, there are 44 candidate subsequent cases, and so on.
I've tried several combinations of levelsof and foreach loops but cannot get my code to work. My aim is to create a wide dataset describing the relationships between cases within each city and then reshape the file from wide to long, with one row for each potential source of infection, for use in a network analysis program (Cytoscape).
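One possible route to the long file that skips the wide step altogether is to pair every case with every other case in the same city using joinby, and then keep only the pairs that satisfy the date rule. What follows is a minimal sketch rather than a tested solution. It assumes the dataset in memory holds id_n, date_onset and city as described above, and that it is run from a do-file. The names source_id, source_onset, contact_id, contact_onset, gap_days and the temporary file later are made up for illustration.

```stata
* Build a copy of the data in which each case plays the "later" role
preserve
keep id_n date_onset city
rename (id_n date_onset) (contact_id contact_onset)
tempfile later
save `later'
restore

* In the original data each case plays the "earlier" (source) role
keep id_n date_onset city
rename (id_n date_onset) (source_id source_onset)

* Form all pairwise combinations of cases within the same city
joinby city using `later'

* Keep pairs whose onset dates are 7 to 18 days apart (inclusive);
* this also drops self-pairs and pairs in the wrong time order
generate gap_days = contact_onset - source_onset
keep if inrange(gap_days, 7, 18)

* One row per potential source-contact pair
sort city source_id contact_id
list city source_id contact_id gap_days, sepby(source_id)
```

The result is already an edge list (source_id, contact_id), which is the general shape a network program expects. A wide layout could still be produced from it with reshape wide if one is needed.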
Any suggestions/help would be greatly appreciated. Some sample data are below.
id_n date_onset city
1 09mar2010 city_1
2 10mar2010 city_1
3 11mar2010 city_1
4 18mar2010 city_1
5 18mar2010 city_1
6 18mar2010 city_1
7 18mar2010 city_1
8 19mar2010 city_1
9 19mar2010 city_1
10 20mar2010 city_1
11 21mar2010 city_1
12 21mar2010 city_1
13 22mar2010 city_1
14 22mar2010 city_1
15 22mar2010 city_1
16 22mar2010 city_1
17 22mar2010 city_2
18 24mar2010 city_1
19 25mar2010 city_3
20 25mar2010 city_7
21 25mar2010 city_1
22 25mar2010 city_1
23 26mar2010 city_1
24 26mar2010 city_1
25 26mar2010 city_3
26 27mar2010 city_3
27 28mar2010 city_5
28 28mar2010 city_3
29 29mar2010 city_1
30 29mar2010 city_1
Best wishes,
David Puddicombe
Communicable Disease Epidemiologist
[email protected]
Tel 604 707 2537
Fax 604 707 2515
Immunization Programs and Vaccine Preventable Diseases Service
BC Centre for Disease Control
655 West 12th Avenue, Vancouver, BC Canada V5Z 4R4
www.bccdc.ca
www.immunizebc.ca