Hello all,
I am using Intercooled Stata, version 8.2.
I have a dataset that I need to expand, but some features of the data
preclude a simple call to -expand-.
The data are currently a mix of "wide" and "long": a subject who
experienced no events has a single record (line), whereas a subject who
reported events has one line per event. For instance, one subject
reported 600 procedures but no events and therefore occupies a single
line, while several subjects reported multiple procedures with multiple
events and therefore occupy multiple lines, one for each event.
My objective is to expand the dataset to one line per procedure (which
I can do with -expand-), but I also need to account for how the
procedures are distributed within each subject. Specifically, each
subject indicated the percentage of time he/she used a particular type
of surgical machine (and machine setting) and the percentage of time
he/she used a particular surgical approach. Ultimately, I want a "long"
dataset that lists every procedure for each subject and reflects those
percentages for surgical approach, machine, and machine setting. I've
pasted a few observations from my dataset in its current form for
illustration:
     +------------------------------------------------------------------------+
     |     name   surger~s   div_con   machine   percent   setting   wound~er |
     |------------------------------------------------------------------------|
 48. | John Doe        210       100     leg_a       100     pul_a          . |
 49. | Jane Doe        300       100     sov_a       100   white_a          2 |
 50. | Jane Doe          .         .                   .                    . |
     +------------------------------------------------------------------------+
where 'name' is the subject's name, 'surger~s' is the number of
procedures the subject reported, 'div_con' is the percentage of time a
given surgical approach was used (I have three of these variables that
I have not listed, for brevity), 'machine' is the surgical machine
used, 'percent' is the percentage of time that machine was used, and
'setting' is the corresponding machine setting. The last variable,
'wound~er', indicates whether the subject had any events; if so, the
subject's name is repeated on additional lines, each reporting the
relevant information for one event (e.g., machine and setting used,
surgical approach used).
What I'm envisioning is a dataset listing name, machine type, machine
setting, and surgical approach, in which each subject has as many lines
as procedures (surgeries), along with an indicator variable, loosely
called 'event', that equals one if the attributes of an event match the
attributes of the procedure.
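To make the target concrete, here is a minimal sketch on made-up data of
the kind of expansion I have in mind. Everything in it is an assumption
for illustration: the variable names -surgeries-, -n_events-, -n_proc-,
-proc_id-, and -event-, the toy values, and above all the layout of one
line per subject-machine combination with an event count on each line,
which may not hold in my actual data.

```stata
* Toy data only: names and values are hypothetical, not my real dataset
clear
input str8 name surgeries str7 machine percent n_events
"Subj A" 10 "leg_a" 100 0
"Subj B"  4 "sov_a"  75 1
"Subj B"  4 "leg_a"  25 0
end

* Number of procedures attributable to each subject-machine line
gen n_proc = round(surgeries * percent / 100, 1)

* Create one observation per procedure
expand n_proc

* Number the procedures within each subject-machine block
bysort name machine: gen proc_id = _n

* Flag as many procedures as there were events on that line;
* the second condition keeps a missing count from flagging every line
gen byte event = (proc_id <= n_events) & (n_events < .)

list name machine proc_id event, sepby(name)
```

Two caveats with this sketch: rounding the allocated counts can make
them sum to something other than the reported number of procedures, and
-expand- leaves a line unchanged when the count is zero or missing, so
such lines would need separate handling.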
I recognize that this problem (and my explanation!) is somewhat
convoluted; nevertheless, I appreciate any and all suggestions.
Much obliged,
Clint Thompson