Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: RE: RE: RE: RE: Combining multiple observations by an ID variable

From	Steve Nakoneshny <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: RE: RE: RE: RE: Combining multiple observations by an ID variable
Date	Wed, 13 Jun 2012 09:52:51 -0600

Claude,

The maximum number of variables that Stata can hold depends on which flavor of the software you have, not on the version number. In Stata 12, Stata/IC can hold up to 2,047 variables, whereas Stata/SE and Stata/MP can both hold up to 32,767. If you are using Stata/IC, you will likely run out of variables before too long with this dataset, full stop.
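
As a rough back-of-envelope check (a sketch only, using round figures quoted in this thread), the number of variables after -reshape wide- is about the number of stubs times the number of distinct j values, plus one for the i variable. The local names here are made up for illustration, and the second count assumes that every trr_fol_id_code value is distinct, which I can't confirm from here:

```stata
* Variables after -reshape wide- ~ (stubs) x (distinct values of j) + 1 for i.
* All figures are round numbers taken from this thread, not from the real data.
local stubs      = 70        // approximate number of variables per visit
local max_visits = 10        // if j is a within-patient visit number
local n_codes    = 350000    // if j is trr_fol_id_code and every code is distinct

local n_visit = `stubs'*`max_visits' + 1
local n_code  = `stubs'*`n_codes' + 1

display "j = visit number:       `n_visit' variables"
display "j = follow-up code:     `n_code' variables"
display "limit in this Stata:    " c(maxvar)
```

If the follow-up codes really are distinct across rows, using them directly as j asks for far more variables than any flavor allows, whereas a within-patient visit number keeps the count in the hundreds.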

I'm not familiar with the "too many macros" error, but from the code you provided I suspect that the macro `vlist' in your -reshape- command is causing the problem. Because `vlist' is built from _all, it contains trr_id_code and trr_fol_id_code, so -reshape- is being asked to reshape your i and j variables as well as use them as the i and j identifiers. I hope others will correct me if I'm wrong.

As I suggested yesterday, I would -order trr_id_code trr_fol_id_code- to move your i and j variables to the front of the dataset as its first two variables. Your reshape would then look like:
reshape wide [third varname] - [last varname], i(trr_id_code) j(trr_fol_id_code) string

Assuming a successful reshape, you could verify the uniqueness of your i var with -isid trr_id_code-. If it turns out to be unique, your subsequent -merge- would most likely end up being a 1:1 merge (assuming the ids are unique in the using dataset as well).
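
Putting the pieces together, here is a minimal sketch on toy data (untested against your files). The variables weight and creat, the generated variable visit, and the file name in the commented-out -merge- are all invented for illustration. It also assumes that sorting trr_fol_id_code within a patient gives a sensible visit order; if you have a visit date, sort on that instead.

```stata
* Toy long-format data: one row per follow-up visit (values are made up)
clear
input str4 trr_id_code str6 trr_fol_id_code weight creat
"A001" "F00010" 70 1.1
"A001" "F00025" 72 1.3
"A002" "F00011" 85 0.9
"A002" "F00030" 84 1.0
"A002" "F00041" 83 1.2
end

* Number the visits within each patient so that j takes only a few distinct values
bysort trr_id_code (trr_fol_id_code): generate visit = _n

* Build the stub list: every variable except the i and j variables
unab vlist : _all
local idvars trr_id_code visit
local stubs : list vlist - idvars

* One row per patient; each stub gets suffixes 1, 2, 3, ...
reshape wide `stubs', i(trr_id_code) j(visit)

* Exits with an error if trr_id_code does not uniquely identify rows
isid trr_id_code

* merge 1:1 trr_id_code using "other_file"   // "other_file" is a placeholder name
```

Here trr_fol_id_code is carried along as a stub, so the original follow-up codes survive as trr_fol_id_code1, trr_fol_id_code2, and so on.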

Steve

On 2012-06-13, at 9:33 AM, Claude Beaty wrote:

> Steve,
> 
> As suggested, I am including more information about my master dataset. My individual ID variable is "trr_id_code". My follow-up visit ID variable is "trr_fol_id_code". Both variables are string. As previously mentioned, I have about 50,000 "trr_id_code" observations and over 350,000 "trr_fol_id_code" observations. Currently, the dataset is in the long form by the "trr_fol_id_code" variable. I would like this dataset to be in the long form by the "trr_id_code" variable instead (the wide form of the "trr_fol_id_code" variable), as I currently have another dataset which is organized in this way and would like to merge the two files. I am using the following code to accomplish this task:
> 
> sort trr_id_code
> unab vlist:_all
> reshape wide `vlist', i(trr_id_code) j(trr_fol_id_code) string
> 
> When this code is applied to the master dataset (approximately 70 variables in the variable list), I receive the error code "too many macros". I have attempted to -reshape- after merging by "trr_id_code" and paring down the database to approximately 13,000 "trr_id_code" observations and 30,000 "trr_fol_id_code" observations, but the increased number of variables in my second dataset (460) results in the same error message. Is my code incorrect, or have I reached the limit of Stata's capabilities by having so many variables and/or observations? Any thoughts would be appreciated.
> 
> 
> Claude A. Beaty Jr., M.D.
> Halsted Surgical Resident
> Cardiac Surgery Research Fellow
> The Johns Hopkins Hospital
> 
> 
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Steve Nakoneshny
> Sent: Tuesday, June 12, 2012 7:00 PM
> To: [email protected]
> Subject: Re: st: RE: RE: RE: RE: Combining multiple observations by an ID variable
> 
> Claude,
> 
> It does appear that -reshape- is the way to go here. Without a snippet example of your dataset (or using an example dataset, if preferred) or without the errant code you are trying to execute, people won't be able to provide you particularly valuable feedback.
> 
> With that out of the way, I will hazard a guess that you haven't appropriately specified your variables in the reshape command. I might try something in the form like (untested):
> 
> bysort patientID: gen visitID=_n
> order patientID visitID
> reshape wide a-z, i(patientID) j(visitID)	// a-z assumes that the common stub vars to be reshaped are in a consecutive order
> 
> 
> As Quinn suggests, you may run out of variables depending on which flavor of Stata you are using. This too will result in an error code being returned. If this is the case, you may have to either drop variables or play around with -collapse- to make it work; I'm not sure.
> 
> 
> Steve
> 
> On 2012-06-12, at 4:31 PM, Claude Beaty wrote:
> 
>> Mr. Swanquist,
>> 
>> Reshape was something I considered as well. Unfortunately, every time I attempt to run this code I get the error "too many macros". I have Stata 12, which I believe is the most recent version. If anyone knows of a way around this, please let me know.
>> 
>> Claude A. Beaty Jr., M.D.
>> Halsted Surgical Resident
>> Cardiac Surgery Research Fellow
>> The Johns Hopkins Hospital
>> 
>> 
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Swanquist, Quinn Thomas
>> Sent: Tuesday, June 12, 2012 5:38 PM
>> To: [email protected]
>> Subject: st: RE: RE: RE: Combining multiple observations by an ID variable
>> 
>> Fair enough, 
>> 
>> If you need the number of observations to equal the number of individuals and you need to keep the data from each visit, you are going to need to use the -reshape wide- command on the master dataset before the merge. Since you said that you have 70 variables for each visit, you will now have 70 times the maximum number of visits in variables. Depending on your flavor of Stata, you may or may not be able to work with that many variables.
>> 
>> You can get help with this function using:
>> 
>> help reshape
>> 
>> Quinn Swanquist
>> [email protected]
>> 
>> 
>> 
>> 
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Claude Beaty
>> Sent: Tuesday, June 12, 2012 5:31 PM
>> To: [email protected]
>> Subject: st: RE: RE: Combining multiple observations by an ID variable
>> 
>> Mr. Swanquist,
>> 
>> It looks like the merger attempt was likely successful, though I'm sure there are some duplicates. However, your suggested code did not help to shift the data so that the total observations equal the number of ID codes instead of the number of visits. I have tried reshaping etc, but there are too many macros to reshape all of the variables. Is there another way? If I can arrange the data in this way, it is easier to compare with my previous file and find duplicate ID codes. As it stands now, it is difficult to tell if duplicate ID codes are due to successive visits or duplications created by the file merger. Thanks
>> 
>> Claude A. Beaty Jr., M.D.
>> Halsted Surgical Resident
>> Cardiac Surgery Research Fellow
>> The Johns Hopkins Hospital
>> 
>> 
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Swanquist, Quinn Thomas
>> Sent: Tuesday, June 12, 2012 5:16 PM
>> To: [email protected]
>> Subject: st: RE: Combining multiple observations by an ID variable
>> 
>> Do you have an identifier for visit number? (If not, you could use the date.)
>> 
>> Sort as follows:
>> 
>> sort IDcode visit
>> 
>> then merge many to one as follows:
>> 
>> merge m:1 IDcode using "usingfile"
>> 
>> 
>> 
>> Quinn Swanquist
>> [email protected]
>> 
>> 
>> 
>> 
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Claude Beaty
>> Sent: Tuesday, June 12, 2012 5:02 PM
>> To: [email protected]
>> Subject: st: Combining multiple observations by an ID variable
>> 
>> All,
>> 
>> I have a large dataset of observations in which individuals (~40,000 ID codes) were evaluated multiple times (5-10 visit numbers per individual) on over 70 variables. However, the data has been arranged so that each visit number is an observation, instead of each individual ID code as an observation. I need to merge this file with another file sorted by individual ID codes. How do I rearrange this data so that it is arranged by ID codes with consecutive follow up visits? Thanks
>> 
>> Claude A. Beaty Jr., M.D.
>> Halsted Surgical Resident
>> Cardiac Surgery Research Fellow
>> The Johns Hopkins Hospital
>> 
>> 
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
