Stas Kolenikov ([email protected]) asked about the -every()- option
of -postfile-:
> So what exactly does -every()- option of -postfile- does? Here's my
> concern: I am starting remotely some simulations up on a computing
> server, and sometimes the jobs are killed (either by me when I see it
> hangs up, or by the server robots when it runs out of time, or for any
> other reason). All I get, then, is the header of the file created by
> -postfile-, but none of the results that supposedly came from -post-
> commands, even if I specify -every(1)- option. (There is some
> bootstrapping involved in this particular project, so it indeed might
> take a few minutes to get the next data point to be -post-ed.) I am
> suspecting that I am getting nothing since -postclose- fails to
> properly close the file; however I have an idea stuck in my head (from
> some NC courses about 10 years ago?) that one can be somewhat sloppy
> with -postclose- since whenever the process closes, it works as
> implicit -postclose- for all open -postfile-s. Is that true, or am I
> missing something important? Apparently, if the job is brutally killed
> (KILL (9)? OMG), then none of that happens.
The -every(#)- option of -postfile- is intended to cause -post- to
write its information to disk every # times that it is called.
However, Stas has found a bug related to the -every(#)- option which
causes -post- to cache observations in memory longer than it should
before it writes them to disk. The number supplied to -every(#)- is
mistakenly multiplied by the width, in bytes, of a single observation
being posted. Thus, if Stas is posting 1 int and 3 floats, the width
of a single observation is 1*2 + 3*4 = 14 bytes. When Stas specifies
-postfile ..., every(1)-, that '1' is multiplied by the width of 14,
and therefore -post- writes information to disk only after every
14th call.
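
The arithmetic above can be sketched with a toy model. This is a
Python illustration only, not Stata's internal code; the function
names and the `buggy` switch are invented for the sketch, and the
only facts taken from Stata are the storage widths of its numeric
types.

```python
# Toy model of the behaviour described above; NOT Stata's actual code.
# Storage widths, in bytes, of Stata's numeric types.
TYPE_WIDTH = {"byte": 1, "int": 2, "long": 4, "float": 4, "double": 8}


def observation_width(types):
    """Bytes needed to hold one posted observation."""
    return sum(TYPE_WIDTH[t] for t in types)


def flush_interval(every, types, buggy):
    """Number of -post- calls between writes to disk (hypothetical helper)."""
    if buggy:
        # The bug: the every(#) value is multiplied by the observation width.
        return every * observation_width(types)
    return every


def observations_on_disk(n_posts, every, types, buggy):
    """Observations already written if the job dies after n_posts calls."""
    interval = flush_interval(every, types, buggy)
    return (n_posts // interval) * interval


posted = ["int", "float", "float", "float"]   # 1 int and 3 floats
print(observation_width(posted))                         # 14
print(flush_interval(1, posted, buggy=True))             # 14
print(observations_on_disk(13, 1, posted, buggy=True))   # 0
print(observations_on_disk(13, 1, posted, buggy=False))  # 13
```

Under this model, a job killed after 13 calls to -post- with
-every(1)- leaves no observations on disk, which matches the
header-only file Stas describes.
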
We will fix this in a future executable update.
Stas is correct that it is ok to be somewhat sloppy with -postclose-.
Every time -post- writes information to the dataset on disk, it does
so in a way that ensures a partial or corrupt dataset is not left
on disk, even if the process is killed unexpectedly between calls
to -post-. The problem Stas is having is caused by -post- writing
information to disk less frequently than the -every(#)- option
specifies.
I will contact Stas privately to find out more about exactly what he
is doing so that I can suggest a temporary workaround.
--Alan Riley
([email protected])