Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: RE: RE: RE: RE: Combining multiple observations by an ID variable
From: Claude Beaty <[email protected]>
To: "[email protected]" <[email protected]>
Subject: RE: st: RE: RE: RE: RE: Combining multiple observations by an ID variable
Date: Wed, 13 Jun 2012 15:33:04 +0000
Steve,
As suggested, I am including more information about my master dataset. My individual ID variable is "trr_id_code" and my follow-up visit ID variable is "trr_fol_id_code"; both are string variables. As previously mentioned, I have about 50,000 distinct "trr_id_code" values and over 350,000 "trr_fol_id_code" observations. Currently, the dataset is in long form, with one observation per "trr_fol_id_code". I would like it to have one observation per "trr_id_code" instead (that is, wide with respect to "trr_fol_id_code"), because I have another dataset organized this way and would like to merge the two files. I am using the following code to accomplish this task:
sort trr_id_code
unab vlist:_all
reshape wide `vlist', i(trr_id_code) j(trr_fol_id_code) string
When this code is applied to the master dataset (approximately 70 variables in the variable list), I receive the error message "too many macros". I have also attempted to -reshape- after merging by "trr_id_code" and paring the dataset down to approximately 13,000 "trr_id_code" values and 30,000 "trr_fol_id_code" observations, but the larger number of variables in my second dataset (460) results in the same error message. Is my code incorrect, or have I reached the limit of Stata's capabilities by having so many variables and/or observations? Any thoughts would be appreciated.
Claude A. Beaty Jr., M.D.
Halsted Surgical Resident
Cardiac Surgery Research Fellow
The Johns Hopkins Hospital
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Steve Nakoneshny
Sent: Tuesday, June 12, 2012 7:00 PM
To: [email protected]
Subject: Re: st: RE: RE: RE: RE: Combining multiple observations by an ID variable
Claude,
It does appear that -reshape- is the way to go here. Without a snippet of your dataset (or an example dataset, if preferred) and without the errant code you are trying to execute, people won't be able to give you particularly valuable feedback.
With that out of the way, I will hazard a guess that you haven't appropriately specified your variables in the reshape command. I might try something of the following form (untested):
bysort patientID: gen visitID=_n
order patientID visitID
reshape wide a-z, i(patientID) j(visitID) // a-z assumes that the common stub vars to be reshaped are in a consecutive order
As Quinn suggests, you may run out of variables depending on which version of Stata you are using. This too will result in an error code being returned. If this is the case, you may have to either drop variables or play around with -collapse- to make it work, I don't know.
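A minimal, untested sketch of that idea using the variable names from Claude's message. It assumes that trr_fol_id_code takes a distinct value on nearly every row, in which case using it as j() asks -reshape- for one set of columns per follow-up ID, and that `vlist' built from _all also contains the i() and j() variables themselves. Both are guesses at what lies behind "too many macros", not a confirmed diagnosis. visit_n is a hypothetical name, and ordering visits by trr_fol_id_code is an assumption (a visit date would be safer if one exists).

```stata
* Sketch only: number the visits within each individual.
* visit_n is a hypothetical name; the sort order within trr_id_code is assumed.
bysort trr_id_code (trr_fol_id_code): gen visit_n = _n

* See how many sets of wide columns the reshape would create.
summarize visit_n, meanonly
display "max visits per individual: " r(max)

* Stubs are every variable except the i() and j() variables.
ds trr_id_code visit_n, not
local stubs `r(varlist)'

* j() is now a small numeric counter, so the string option is not needed.
reshape wide `stubs', i(trr_id_code) j(visit_n)
```

With roughly 70 variables and up to 10 visits per individual this comes to something on the order of 700 variables, so it is worth comparing that figure against the variable limit of your flavor of Stata before running the reshape.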
Steve
On 2012-06-12, at 4:31 PM, Claude Beaty wrote:
> Mr. Swanquist,
>
> Reshape was something I considered as well. Unfortunately, every time I attempt to run this code I get the error "too many macros". I have Stata 12, which I believe is the most recent version. If anyone knows of a way around this, please let me know.
>
> Claude A. Beaty Jr., M.D.
> Halsted Surgical Resident
> Cardiac Surgery Research Fellow
> The Johns Hopkins Hospital
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Swanquist, Quinn Thomas
> Sent: Tuesday, June 12, 2012 5:38 PM
> To: [email protected]
> Subject: st: RE: RE: RE: Combining multiple observations by an ID variable
>
> Fair enough,
>
> If you need the observations to equal the number of visits and you need to keep the data from each visit, you are going to need to use the -reshape wide- command on the master dataset before the merge. Since you said that you have 70 variables for each visit, you will end up with 70 times the maximum number of visits in variables. Depending on your version of Stata you may or may not be able to work with that many variables.
>
> You can get help with this function using:
>
> help reshape
>
> Quinn Swanquist
> [email protected]
>
>
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Claude Beaty
> Sent: Tuesday, June 12, 2012 5:31 PM
> To: [email protected]
> Subject: st: RE: RE: Combining multiple observations by an ID variable
>
> Mr. Swanquist,
>
> It looks like the merger attempt was likely successful, though I'm sure there are some duplicates. However, your suggested code did not help to shift the data so that the total observations equal the number of ID codes instead of the number of visits. I have tried reshaping etc, but there are too many macros to reshape all of the variables. Is there another way? If I can arrange the data in this way, it is easier to compare with my previous file and find duplicate ID codes. As it stands now, it is difficult to tell if duplicate ID codes are due to successive visits or duplications created by the file merger. Thanks
>
> Claude A. Beaty Jr., M.D.
> Halsted Surgical Resident
> Cardiac Surgery Research Fellow
> The Johns Hopkins Hospital
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Swanquist, Quinn Thomas
> Sent: Tuesday, June 12, 2012 5:16 PM
> To: [email protected]
> Subject: st: RE: Combining multiple observations by an ID variable
>
> Do you have an identifier for visit number? (If not, you could use the date.)
>
> Sort as follows:
>
> sort IDcode visit
>
> then merge many to one as follows:
>
> merge m:1 IDcode using "usingfile"
>
>
>
> Quinn Swanquist
> [email protected]
>
>
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Claude Beaty
> Sent: Tuesday, June 12, 2012 5:02 PM
> To: [email protected]
> Subject: st: Combining multiple observations by an ID variable
>
> All,
>
> I have a large dataset of observations in which individuals (~40,000 ID codes) were evaluated multiple times (5-10 visit numbers per individual) on over 70 variables. However, the data has been arranged so that each visit number is an observation, instead of each individual ID code as an observation. I need to merge this file with another file sorted by individual ID codes. How do I rearrange this data so that it is arranged by ID codes with consecutive follow up visits? Thanks
>
> Claude A. Beaty Jr., M.D.
> Halsted Surgical Resident
> Cardiac Surgery Research Fellow
> The Johns Hopkins Hospital
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/