Hello,
I have a data set in wide form. Part of the data concerns work
injuries. After being asked whether they had experienced a work injury during
the past 12 months and, if so, how many, each respondent reported on a
number of potential characteristics of the injury or injuries. The series
of characteristics was asked for up to 5 injury events. Obviously, not
everyone had 5 injuries. I want to convert the file into long form where
each record represents an injury event.
There are 2535 respondents in the wide file. Had everyone experienced at
least 5 injuries, I would expect the number of records in the long file to
equal 12675 (2535 x 5). However, the actual number of records should be
much lower than 12675.
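As a rough check on that target, the expected number of long records could be computed from the wide file before reshaping. This is only a sketch under two assumptions: that ir2 is stored as a numeric count, and that it is missing for respondents who reported no injury. The variable ninj is a hypothetical helper, not part of the original file.

```stata
* Sketch only: assumes ir2 is numeric and missing when no injury was reported.
* ninj is a hypothetical helper variable holding the usable injury count.
* min() ignores missing arguments, so missing ir2 is set to 0 explicitly.
generate byte ninj = cond(missing(ir2), 0, min(ir2, 5))

* summarize leaves the total of ninj in r(sum)
quietly summarize ninj
display "Upper bound (5 per respondent): " _N*5
display "Expected long records:          " r(sum)
```

The cap at 5 reflects that characteristics were collected for at most 5 injury events, even if a respondent reported more.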
This is a partial record for a respondent in wide format:
respnum_ = respondent number
finalwt2 = sampling weight
gender = gender
ir1 = did an injury occur--yes/no
ir2 = number of injuries experienced
ir3_1 = injury characteristic 1 for injury 1 -- yes/no
ir3_2 = injury characteristic 1 for injury 2 -- yes/no
ir3_3 = injury characteristic 1 for injury 3 -- yes/no
ir3_4 = injury characteristic 1 for injury 4 -- yes/no
ir3_5 = injury characteristic 1 for injury 5 -- yes/no
ir4_1_1 = injury characteristic 2 for injury 1 -- yes/no
ir4_1_2 = injury characteristic 2 for injury 2 -- yes/no
ir4_1_3 = injury characteristic 2 for injury 3 -- yes/no
ir4_1_4 = injury characteristic 2 for injury 4 -- yes/no
ir4_1_5 = injury characteristic 2 for injury 5 -- yes/no
respnum_   finalwt2  gender  ir1  ir2
     354  16151.351  male    yes    2

ir3_1  ir3_2  ir3_3  ir3_4  ir3_5
no     yes    .      .      .

ir4_1_1  ir4_1_2  ir4_1_3  ir4_1_4  ir4_1_5
no       no       .        .        .
To get the data in long format, I executed these commands:
sort respnum_;
reshape long
ir3_@ ir4_1_@,i(respnum_) j(injnum);
I obtained data in this form:
respnum_  injnum   finalwt2  gender  ir1  ir2  ir3_  ir4_1_
     354       1  16151.351  male    yes    2  no    no
     354       2  16151.351  male    yes    2  yes   no
     354       3  16151.351  male    yes    2  .     .
     354       4  16151.351  male    yes    2  .     .
     354       5  16151.351  male    yes    2  .     .
However, because this person only experienced 2 injuries, I wanted the
data in this form:
respnum_  injnum   finalwt2  gender  ir1  ir2  ir3_  ir4_1_
     354       1  16151.351  male    yes    2  no    no
     354       2  16151.351  male    yes    2  yes   no
How can I keep the number of records for a given respondent equal to the
number of injuries experienced?
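One possible approach, offered as a sketch rather than a confirmed answer: reshape long creates a row for every value of j found in the variable names, so the surplus rows might simply be dropped afterwards by comparing injnum with ir2. The toy data below are a hypothetical numeric coding (0 = no, 1 = yes) of the record shown above, plus an invented respondent 999 with no injuries; this assumes ir2 is numeric and missing for such respondents.

```stata
* Toy data: respondent 354 as above (hypothetical 0/1 coding),
* and a made-up respondent 999 who reported no injury.
clear
input respnum_ ir1 ir2 ir3_1 ir3_2 ir3_3 ir3_4 ir3_5 ir4_1_1 ir4_1_2 ir4_1_3 ir4_1_4 ir4_1_5
354 1 2 0 1 . . . 0 0 . . .
999 0 . . . . . . . . . . .
end

* One row per respondent per injury slot (5 rows each at this point)
reshape long ir3_ ir4_1_, i(respnum_) j(injnum)

* Keep only rows up to the reported number of injuries.
* Missing ir2 compares as larger than any number, so "injnum > ir2"
* alone would keep those rows; they need their own condition.
drop if injnum > ir2 | missing(ir2)

list, sepby(respnum_)
```

In this toy example only the two rows for respondent 354 should remain. Whether respondents with no injuries ought to be dropped entirely or kept as a single record is a separate design choice.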
Thanks,
Mike Frone