From: "David Radwin" <dradwin@mprinc.com>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: collapse is too memory demanding
Date: Mon, 26 Jul 2010 14:11:46 -0700 (PDT)

Oliver,

As a workaround, you might use Roger Newson's -xcollapse-, available from SSC, to collapse two or more subsets of variables to files and then -merge- the collapsed files back together. Use -findit xcollapse- if your Stata version is earlier than 10.

David
--
David Radwin
Research Associate
MPR Associates, Inc.
2150 Shattuck Ave., Suite 800
Berkeley, CA 94704
Phone: 510-849-4942
Fax: 510-849-0794
www.mprinc.com

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Oliver Jones
> Sent: Monday, July 26, 2010 11:43 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: collapse is too memory demanding
>
> Hi everybody,
> this is my first posting and I hope to ask a good question...
> How much additional free memory do I need to perform a -collapse-?
>
> Problem setting:
> I have a dataset containing information about ~20 million people living in ~180 different
> regions and working in ~330 different jobs. The information is given by ~70 zero/one dummy
> variables, like male[yes/no], female[yes/no], age20-25[yes/no], ...
> When I try to collapse it like this I get the error that I need more free memory
>
> **********
> * begin excerpt code
> *
> * m_total is a dummy variable taking the value 1 if the person is male
> * f_total is a dummy variable taking the value 1 if the person is female
> *
> collapse (sum) m_total f_total ...(68 more dummy variables), by(aoaa beruford) fast
> *
> *
> .
> .
> no room to add more variables because of width
> An attempt was made to add a variable that would have increased the memory required to
> store an observation beyond what is currently possible. You have the following alternatives:
>
> 1. Store existing variables more efficiently; see help compress.
>
> 2. Drop some variables or observations; see help drop. (Think of Stata's data area as
> the area of a rectangle; Stata can trade off width and length.)
>
> 3. Increase the amount of memory allocated to the data area using the set memory
> command; see help memory.
> r(902);
> *
> *
> memory
>
> . memory
>                                                  bytes
> --------------------------------------------------------------------
> Details of set memory usage
>     overhead (pointers)            159,493,976         8.45%
>     data                         1,455,382,531        77.11%
>                                ----------------------------
>     data + overhead              1,614,876,507        85.56%
>     free                           272,560,293        14.44%
>                                ----------------------------
>     Total allocated              1,887,436,800       100.00%
> --------------------------------------------------------------------
> Other memory usage
>     set maxvar usage                 2,041,738
>     set matsize usage                1,315,200
>     programs, saved results, etc.       37,424
>                                ---------------
>     Total                            3,394,362
> -------------------------------------------------------
> Grand total                      1,890,831,162
>
> *
> *
> * end code excerpt
> **********
>
> I am grateful for any help.
>
> Kind regards
> Oliver
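
The split-and-merge idea in David's reply at the top of this message might look roughly like the sketch below. This is not his code: it substitutes built-in -preserve-, -keep-, -collapse-, and -merge- for -xcollapse-, it assumes Stata 11 or later for the -merge 1:1- syntax, and the macros part1 and part2 are hypothetical placeholders, since the original post elides 68 of the 70 dummy variables. Whether it frees enough memory on a given machine is untested.

```stata
* Sketch only, meant to be run as a single do-file.
* part1 and part2 are hypothetical placeholders for two subsets of the
* ~70 dummy variables; fill in the real variable names.
local part1 m_total
local part2 f_total

tempfile sums1 sums2

* Collapse the first subset. -keep- drops all other variables first,
* which narrows the dataset before -collapse- adds its sum variables.
preserve
keep aoaa beruford `part1'
collapse (sum) `part1', by(aoaa beruford) fast
save `sums1'
restore

* Collapse the second subset the same way.
preserve
keep aoaa beruford `part2'
collapse (sum) `part2', by(aoaa beruford) fast
save `sums2'
restore

* Combine the collapsed files on the by-variables.
use `sums1', clear
merge 1:1 aoaa beruford using `sums2', nogenerate
```

Each collapsed file holds one row per aoaa/beruford combination, so the final -merge- works on small files; the memory-hungry step is each -collapse-, which here sees only one subset of the dummies at a time. More than two subsets can be handled by repeating the middle block and merging each additional file in turn.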