> Daniel Muller <[email protected]> is curious about how dataset size is
> calculated by -describe-:
Bill <[email protected]> wrote:
> The size reported by -describe- is obtained by
>
>     1,692,789 * ( 4 + 4 ) = 13,542,312
>         |         |   |
>         |         |   +-- plus 4
>         |         +------ width of data (1 float = 4 bytes)
>         +---------------- # of obs
>
So, in general, the size of the data in memory is

    # of obs * (sum of widths of variables + 4)

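As a quick check of that formula, here is a minimal Python sketch (not anything Stata itself runs). The function name memory_size and the lookup table are my own, hypothetical names; the widths are Stata's standard numeric storage sizes, and the 4-byte overhead is the one described in the quoted message.

```python
# Storage widths, in bytes, of Stata's numeric types.
TYPE_WIDTH = {"byte": 1, "int": 2, "long": 4, "float": 4, "double": 8}

# Per-observation overhead described in the quoted message (assumed constant).
POINTER_BYTES = 4


def memory_size(n_obs, var_types):
    """Return n_obs * (sum of variable widths + 4), in bytes."""
    width = sum(TYPE_WIDTH[t] for t in var_types)
    return n_obs * (width + POINTER_BYTES)


# The example from this thread: 1,692,789 observations of a single float.
print(memory_size(1_692_789, ["float"]))  # 13542312
```

That reproduces the 13,542,312 bytes reported by -describe- above.
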
Is the "observation pointer" the only overhead as far as data storage is
concerned?
Salah
> What is the "plus 4"? The size of the data reported by -describe- is the
> size of the memory image of the data and, in transferring the data from
> disk to memory, Stata adds 4 bytes to each and every observation.
> Those 4 bytes are for something called an "observation pointer".
> Observation pointers are one of the things that make Stata fast.
>
> When a dataset is written to disk, the observation pointers are not
> written because they can be (and need to be) recreated each time the
> data is used.
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>