st: Re: A problem with reshape
From: "Joseph Coveney" <[email protected]>
To: <[email protected]>
Date: Mon, 15 Jul 2013 11:50:33 +0900
ofran almossawi wrote:
I am having some problems with reshape. My data set is rather large,
and I cannot make the changes by hand.
My data have no unique identifier; see the example below:
id disease type
1 A comorbidity
1 B primary
1 C comorbidity
2 D primary
2 E secondary
2 F comorbidity
3 G primary
3 H comorbidity
3 I comorbidity
3 J comorbidity
The problem that I am having is creating comorbidity 1, 2, 3, etc. and
reshaping the data into wide format.
I have followed the example at
http://www.stata.com/support/faqs/data-management/problems-with-reshape/
but when I get to the line "by group, sort: replace no = _n" I get a
"0 replacements made" message.
I am not sure where I am going wrong with this or if there is a better
way of doing it.
Many thanks in advance.
--------------------------------------------------------------------------------
If I understand what you ultimately want, then it would look something like the
listing below. There are other ways of getting there, but the one that I show
should get you started.
Joseph Coveney
. input byte id str1 disease str`=length("comorbidity")' type
id disease type
1. 1 A comorbidity
2. 1 B primary
3. 1 C comorbidity
4. 2 D primary
5. 2 E secondary
6. 2 F comorbidity
7. 3 G primary
8. 3 H comorbidity
9. 3 I comorbidity
10. 3 J comorbidity
11. end
.
. *
. * Begin here
. *
.
. /* Column for primary disease */
. preserve
. quietly keep if type == "primary"
. rename disease primary
. tempfile tmpfil0
. quietly save `tmpfil0'
.
. /* Column for secondary disease */
. restore
. preserve
. quietly keep if type == "secondary"
. rename disease secondary
. merge 1:1 id using `tmpfil0', assert(match using) nogenerate noreport
. quietly save `tmpfil0', replace
.
. /* Columns for comorbidities */
. restore
. quietly keep if type == "comorbidity"
. bysort id: generate byte tally = _n
. rename disease comorbidity
. quietly reshape wide comorbidity, i(id) j(tally)
.
. /* Some assembly required */
. merge 1:1 id using `tmpfil0', assert(match using) nogenerate noreport
.
. drop type
. order id primary secondary
. list, noobs sepby(id) abbreviate(20)
  +-----------------------------------------------------------------------+
  | id   primary   secondary   comorbidity1   comorbidity2   comorbidity3 |
  |-----------------------------------------------------------------------|
  |  1         B                          A              C                |
  |-----------------------------------------------------------------------|
  |  2         D           E              F                               |
  |-----------------------------------------------------------------------|
  |  3         G                          H              I              J |
  +-----------------------------------------------------------------------+
.
. exit
end of do-file
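
The transcript above builds the wide layout in three passes joined by merge.
One of the "other ways" could be sketched as a single reshape with a string j
variable. This is only an illustration, not part of the original reply: the
names seq, tally, slot, and the d_ stub are made up for the example, it assumes
each id has at most one "primary" and one "secondary" row, and the rename
group syntax in the second-to-last step assumes Stata 12 or later.

```stata
clear
input byte id str1 disease str11 type
1 A comorbidity
1 B primary
1 C comorbidity
2 D primary
2 E secondary
2 F comorbidity
3 G primary
3 H comorbidity
3 I comorbidity
3 J comorbidity
end

* Remember the original row order so comorbidities are numbered as entered
generate long seq = _n

* Number the rows within each id-by-type group
bysort id type (seq): generate byte tally = _n

* Assumption check: no id has two primary or two secondary rows
by id type: assert _N == 1 if type != "comorbidity"

* Build the target column name: primary, secondary, comorbidity1, ...
generate str20 slot = type
replace slot = slot + string(tally) if type == "comorbidity"

* Everything left must be constant within id, or be the j variable
drop type tally seq
rename disease d_
reshape wide d_, i(id) j(slot) string

* Strip the hypothetical d_ stub so the columns match the layout above
rename d_* *
order id primary secondary
list, noobs sepby(id) abbreviate(20)
```

The sketch is intended to give the same columns as the listing above. Unlike
the merge-based version, it does not stop with an assertion error when an id
has comorbidity rows but no primary or secondary row; that id simply gets
empty strings in those columns.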