In responding to a question by Sergiy Radyakin <[email protected]>, I
just wrote,
WG> [...] Stata allocates a pointer to each observation, and that pointer is 4
WG> bytes on 32-bit computers and 8 bytes on 64-bit computers. Thus, however
WG> wide your dataset is, its memory footprint, per observation, is 4 or 8
WG> bytes wider than that.
I was wrong about that. The amount added is 4 bytes regardless of whether
the computer is 32 or 64 bits. I gave the example of auto.dta. Here is
what I should have written,
For instance, if you type -describe, detail- after using auto.dta, you
will find that its width is 43 bytes. 43 is what you would get if you
summed the individual lengths of the variables. Thus, the width of the
data when stored in memory is 43 plus 4 IN ALL CASES.
It is true that there is a "pointer" to each observation in the data
when the data are in memory, but that "pointer" is actually stored in
observation-offset form, which fits into 4 bytes in all cases, even
on 64-bit computers.
Thus, when Sergiy asked, "Is pointer size or machine type reported
somewhere?", I was right in saying that it is returned in r(size_ptr) after
-memory-, but I should have added that Sergiy should not care because, in
terms of dataset size, the size of pointers does not matter; in all cases,
you add 4 to the width.
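As a rough sketch of that arithmetic, the lines below load auto.dta and
add the 4 bytes to the width that -describe- reports. This assumes that
-describe- leaves the width in r(width) and the number of observations
in r(N), and that the 4-byte rule described above applies to the Stata
you are running; treat it as an illustration, not as a statement about
every release.

```stata
* Illustrative sketch only: per-observation memory footprint of auto.dta
sysuse auto, clear
quietly describe

* width of the data as stored (sum of the variable lengths)
display "width:             " r(width)

* width in memory: add 4 bytes per observation (assumed rule from above)
display "width in memory:   " r(width) + 4

* total bytes for all observations under the same assumption
display "total data bytes:  " r(N) * (r(width) + 4)
```

For auto.dta, the first two lines displayed should correspond to the 43
and 43 plus 4 given above.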
-- Bill
[email protected]