
Re: st: data size - how big

From	[email protected] (William Gould)
To	[email protected]
Subject	Re: st: data size - how big
Date	Fri, 05 Jul 2002 09:00:37 -0500

Daniel Muller <[email protected]> is curious about how dataset size is 
calculated by -describe-:

> I have two files: <b.dta> with 6,771,434 bytes on disk and <b1.asc>
> 3,385,856 bytes (size according to Win Commander).
>
> Stata however says:
>
> Contains data from b.dta
>   obs:     1,692,789
>  vars:             1                          5 Jul 2002 16:40
>  size:    13,542,312 (87.1% of memory free)
> -------------------------------------------------------------------
>               storage  display     value
> variable name   type   format      label      variable label
> -------------------------------------------------------------------
> b               float  %9.0g
> -------------------------------------------------------------------
> Sorted by:


The size reported by -describe- is computed as follows:


           1,692,789  * ( 4   +    4  )   =    13,542,312
              /           |         \
          # of obs        |          \
                          |           \
                    width of data      plus 4
                   1 float = 4 bytes

What is the "plus 4"?  The size reported by -describe- is the size of the
memory image of the data, not the size of the file on disk.  In transferring
the data from disk to memory, Stata adds 4 bytes to every observation.
Those 4 bytes hold something called an "observation pointer".  Observation
pointers are one of the things that make Stata fast.

When a dataset is written to disk, the observation pointers are not written
because they can be (and need to be) recreated each time the data is used.
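
For anyone who wants to check the numbers, the arithmetic above can be
sketched in a few lines of Python.  This is only an illustration: the
function and constant names are made up for the example and are not part
of Stata, and the 4-byte pointer width is simply the figure given above.

```python
# Illustrative sketch only; names are hypothetical, not Stata internals.
FLOAT_WIDTH = 4      # bytes used by one float variable
POINTER_WIDTH = 4    # per-observation pointer added in memory (per the post)


def memory_size(n_obs, row_width, pointer_width=POINTER_WIDTH):
    """Size of the in-memory image: data plus one pointer per observation."""
    return n_obs * (row_width + pointer_width)


def disk_data_size(n_obs, row_width):
    """Size of the data portion on disk: pointers are not written."""
    return n_obs * row_width


n_obs = 1_692_789
print(memory_size(n_obs, FLOAT_WIDTH))     # 13542312, as reported by -describe-
print(disk_data_size(n_obs, FLOAT_WIDTH))  # 6771156
```

The second figure is 278 bytes short of the 6,771,434 bytes that b.dta
occupies on disk.  Presumably that remainder is the file's header and
variable information, but that is an assumption and is not stated above.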

-- Bill
[email protected]