Dear William,
thank you for the explanation. I am actually surprised that the
pointers turned out to be 32 bits wide on 64-bit machines. I thought
Stata stored data by observation and kept an index of pointers holding
the memory address of each observation. That is very convenient
for sorting, as the observations themselves do not have to be moved;
only their pointers are reordered. But then the dataset would be
limited in size to the maximum span of the pointer (4 GB for 32 bits). So
these pointers must be used for some other purpose.
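
To make the arithmetic behind that concern concrete, here is a minimal
Python sketch. It is only an illustration under my own assumptions, not a
description of Stata's internals: it assumes a 4-byte index can hold 2^32
distinct values, and it compares the addressable data size when those
values are raw byte addresses with the size when they count observations
of a given width. The function names are hypothetical.

```python
# Illustration only: NOT Stata's actual implementation.
# Assumption: a 4-byte index can take 2**32 distinct values.
INDEX_BYTES = 4
DISTINCT_VALUES = 2 ** (8 * INDEX_BYTES)


def max_bytes_with_byte_addresses() -> int:
    """Upper bound on data size if each index value is a raw byte address."""
    return DISTINCT_VALUES


def max_bytes_with_observation_offsets(width: int) -> int:
    """Upper bound on data size if each index value counts an observation.

    `width` is the width of one observation in bytes.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    return DISTINCT_VALUES * width


if __name__ == "__main__":
    print(max_bytes_with_byte_addresses())         # 2**32 bytes
    print(max_bytes_with_observation_offsets(43))  # 43 * 2**32 bytes
```

Under these assumptions, a 4-byte value that counts observations rather
than bytes would not cap the data at 4 GB; the bound scales with the
observation width.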
But anyway, thank you very much for the reply. I will use 4 bytes
per observation to compute memory requirements correctly.
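
For what it is worth, the calculation I have in mind can be sketched in
Python as below. This is a rough estimate under stated assumptions: it
uses only the rule from Bill's message (width plus 4 bytes per
observation) and ignores any other memory Stata may need. The function
name is hypothetical, and the 74 observations used for auto.dta are my
own figure, not taken from Bill's message.

```python
# Rough estimate of in-memory data size, following the rule
# "width of the data plus 4 bytes per observation".
# Assumption: no other overhead is counted.
OFFSET_BYTES = 4  # per-observation overhead, same on 32- and 64-bit machines


def data_memory_bytes(n_obs: int, width: int) -> int:
    """Return n_obs * (width + 4), the estimated bytes used by the data.

    n_obs -- number of observations
    width -- width of one observation in bytes, as reported by
             -describe, detail-
    """
    if n_obs < 0 or width < 0:
        raise ValueError("n_obs and width must be non-negative")
    return n_obs * (width + OFFSET_BYTES)


if __name__ == "__main__":
    # auto.dta: width 43 bytes; 74 observations assumed here.
    print(data_memory_bytes(74, 43))  # 74 * 47 = 3478
```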
Best,
Sergiy
On 1/29/08, William Gould, StataCorp LP <[email protected]> wrote:
> In responding to a question by Sergiy Radyakin <[email protected]>, I
> just wrote,
>
> WG> [...] Stata allocates a pointer to each observation, and that pointer is 4
> WG> bytes on 32-bit computers and 8 bytes on 64-bit computers. Thus, however
> WG> wide your dataset is, its memory footprint, per observation, is 4 or 8
> WG> bytes wider than that.
>
> I was wrong about that. The amount added is 4 bytes regardless of whether
> the computer is 32 or 64 bits. I gave the example of auto.dta. Here is
> what I should have written,
>
> For instance, if you type -describe, detail- after using auto.dta, you
> will find that its width is 43 bytes. 43 is what you would get if you
> summed the individual lengths of the variables. Thus, the width of the
> data when stored in memory is 43 plus 4 IN ALL CASES.
>
> It is true that there is a "pointer" to each observation in the data
> when the data are in memory, but that "pointer" is actually stored
> in observation-offset form, which fits into 4 bytes in all cases, even
> on 64-bit computers.
>
> Thus, when Sergiy asked, "Is pointer size or machine type reported
> somewhere?", I was right in saying that it is returned in r(size_ptr) after
> -memory-, but I should have added that Sergiy should not care because, in
> terms of dataset size, the size of pointers does not matter; in all cases,
> you add 4 to the width.
>
> -- Bill
> [email protected]
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>