Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Machine spec for 70GB data
From
William Buchanan <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: st: Machine spec for 70GB data
Date
Sat, 22 Oct 2011 05:41:32 -0700
Gindo,
Contrary to prior responses to your request, the set memory command is unnecessary when using Stata 12. If your dataset is 70GB, you would need at least that much RAM in addition to the RAM necessary for your computer to run.
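As a rough illustration of the RAM arithmetic, here is a minimal sketch in Stata. The observation and variable counts are taken from the insheet output quoted below; the 8 bytes per variable is an assumption (every variable stored as a double), and "mydata.dta" is a hypothetical file name, so treat the result as an upper-bound estimate and not a measurement.

```stata
* Rough RAM estimate: observations x bytes per observation.
* Counts come from the insheet output quoted in this thread.
* ASSUMPTION: every variable is stored as a double (8 bytes).
local obs   = 4086490
local nvars = 56
local width = 8 * `nvars'
display "Approx. data size in GB: " %6.2f `obs' * `width' / 1024^3

* Check observation and variable counts without loading the data.
* "mydata.dta" is a hypothetical file name.
describe using "mydata.dta", short

* Once the data are loaded, demote variables to the smallest storage
* type that holds them without loss, then report memory in use.
use "mydata.dta", clear
compress
memory
```

The same arithmetic applies to the 70GB file: multiply the number of observations by the summed storage widths of the variables, then leave headroom for the operating system and for estimation.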
- Billy
On Oct 22, 2011, at 4:52, Yuval Arbel <[email protected]> wrote:
> Gindo,
>
> Are you sure the data file is 70GB? I'm using the Windows operating system
> and I recently succeeded in running a file of 1.29 GB that includes more than
> 4 million observations. Here are a few lines from the do-file. Just
> make sure to use the "set memory" command:
>
> . do "D:\kingston\public_housing\public_housing_full_20110630.do"
>
> . clear
>
> . clear matrix
>
> . set memory 12500m
> (12800000k)
>
> . insheet using "D:\kingston\public_housing\survivalindexed27-Jun-2011.csv"
> (56 vars, 4086490 obs)
>
> . sort time_index
>
> . stset time_index, id(appt) failure(fail==1)
>
> id: appt
> failure event: fail == 1
> obs. time interval: (time_index[_n-1], time_index]
> exit on or before: failure
>
> ------------------------------------------------------------------------------
> 4086490 total obs.
> 32731 obs. end on or before enter()
> ------------------------------------------------------------------------------
> 4053759 obs. remaining, representing
> 49650 subjects
> 8582 failures in single failure-per-subject data
> 5084887 total analysis time at risk, at risk from t = 0
> earliest observed entry t = 0
> last observed exit t = 114
>
>
>
> On Sat, Oct 22, 2011 at 1:00 PM, Gindo Tampubolon
> <[email protected]> wrote:
>> Dear all,
>>
>> I need to process a large data file [70GB; a few million obs] with Stata 12 MP8, mainly to fit crossed random effects (individuals and hospitals), where the outcome is length of stay [controlling for no more than a handful of covariates to begin with]. As an approximation, the outcome is treated as continuous, i.e. a linear mixed model.
>>
>> What kind of machine spec would be needed? Any ideas, information, experience? Would the operating system make any difference? I'm open to considering Windows, Linux, or OS X.
>>
>> Many thanks,
>> Gindo
>> University of Manchester
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
>>
>
>
>
> --
> Dr. Yuval Arbel
> School of Business
> Carmel Academic Center
> 4 Shaar Palmer Street, Haifa, Israel
> e-mail: [email protected]
>