Re: st: RE: -replace- should not be use with temporary files (was: Comparing datasets)
From: Steven Samuels <[email protected]>
To: [email protected]
Subject: Re: st: RE: -replace- should not be use with temporary files (was: Comparing datasets)
Date: Thu, 18 Sep 2008 18:39:46 -0400
-Sergiy---
What happens if you omit the "replace" in your first example?
-Steve
On Sep 18, 2008, at 4:52 PM, Sergiy Radyakin wrote:
Sometimes it is necessary to replace a tempfile. In a situation like
the following:
//----------------------------------------------------------------------
log using mypaper.txt, text replace
tempfile all_years
use data1990, clear
do prepare_year
save "`all_years'" /* we don't write replace here. Stata must ensure
that the file does not exist before I create it, otherwise it is not a
tempname */
foreach year in 1991 1992 1993 1994 1995 1996 1997 1998 1999 {
use data`year', clear
do prepare_year
append using "`all_years'"
save "`all_years'", replace
}
do my_paper
/* the tempfile is deleted - no data is left after the do-file terminates,
except for the log */
log close
//----------------------------------------------------------------------
Note that this is not optimal: it is probably better to create a tempfile
for each year, then append them all one by one. But that requires
creating many temporary names and looping through them in the program:
//----------------------------------------------------------------------
log using mypaper.txt, text replace
foreach year in 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 {
use data`year', clear
do prepare_year
tempfile data`year'
save "`data`year''" /* we do not replace any tempfile in this
version */
}
/* we now have 10 files on the disk. note that the last one (1999) is
currently open */
foreach year in 1990 1991 1992 1993 1994 1995 1996 1997 1998 {
append using "`data`year''"
}
do my_paper
log close
//----------------------------------------------------------------------
Regarding "number of tempfiles can be anything you want", strictly
speaking you may hit a limit, but practically this is not relevant.
The limit will depend on your OS and disk system (FAT16/FAT32/NTFS).
E.g. if the tempfolder is located on a disk with FAT16 you will hit
the limit at "just 512 files" :-)
source: http://ask-leo.com/is_there_a_limit_to_what_a_single_folder_or_directory_can_hold.html
and also here:
http://en.wikipedia.org/wiki/File_Allocation_Table
This is not a very probable situation, because FAT16 has been almost
universally replaced since about the Win98 era, but there may be similar
limits in other OSes. Also, tempfolders notoriously accumulate junk, so
you can hit the 65,534-file limit of FAT32 even if you create only a dozen
tempfiles in Stata, given that your computer runs for years without a
cleanup and crashes often, leaving tons of tempfiles behind (e.g. mine
contains more than 12,000 today).
AFAIK nobody has ever reported a problem creating a tempfile in Stata
because of the limit on the number of files in a folder. But I think this
year somebody hit the limit on the number of simultaneously opened
files - 2048.
If you want to make sure you don't destroy your data, it is more
reliable to set the "read-only" attribute for those files, because there
are plenty of other opportunities to destroy your data. Stata (correct
me if I am wrong) _never_ modifies read-only files in any way (whether
that is a data file, program file or a log file). AFAIK any OS allows
you to mark a file as "read-only".
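A minimal sketch of how the read-only attribute could be set from inside a do-file, assuming the operating system's own tools (attrib on Windows, chmod elsewhere) are available on the path. The file name data1990.dta is borrowed from the examples above; the branching on c(os) is an illustration, not something proposed in this thread.

```stata
* Sketch only: mark a source data file as read-only via the OS.
* data1990.dta is taken from the examples above; substitute your own file.
local datafile "data1990.dta"

if "`c(os)'" == "Windows" {
    * attrib is a Windows command, not a Stata command
    shell attrib +R "`datafile'"
}
else {
    * Unix and MacOSX: remove write permission for all users
    shell chmod a-w "`datafile'"
}
```

If the claim above about read-only files holds, a later accidental -save, replace- aimed at that file should then fail rather than overwrite it.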
Finally, you may want to read Bill Gould's detailed explanations on
how Stata manages tempfiles here:
http://www.stata.com/statalist/archive/2007-08/msg01124.html and
compute yourself what is the max number of tempfiles that can be
created within one instance of Stata.
Best regards,
Sergiy Radyakin
/* it's nice to see that Stata people are back after Ike. How bad
was it? */
On Thu, Sep 18, 2008 at 1:22 PM, Rajesh Tharyan
<[email protected]> wrote:
Hi,
When would one want to save, replace a temp file, given that it will get
erased at the end of the run? One can just create as many as needed, is it
not? Or would that be inefficient vis-à-vis usage of system resources?
Rajesh
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Steven Samuels
Sent: 18 September 2008 18:08
To: [email protected]
Subject: Re: st: RE: -replace- should not be use with temporary files (was:
Comparing datasets)
Joseph Coveney wrote:
This is news to me. I use -replace- all of the time with temporary
files. What did StataCorp technical support say was the matter with
using -save . . . , replace- with temporary files?
Steven Samuels wrote (excerpted):
. . . Technical support told me that "replace" should not be used when
saving temporary files.
Joseph
What happened was this: I tried to save a temporary file `t2' without
first defining it. Stata did not issue an error message, and I had no
clue as to where I'd gone wrong. (My only excuse: I was tired.) Kerry
Kammire of StataCorp pointed out my error and went on to say that
there were actually two syntax errors.
"The second syntax error -save, replace- prevented Stata from issuing
an error when `t2' is undefined. The -replace- option shouldn't be needed
when using temporary files because they are freshly created each time the
procedure is run."
Thus -replace- was unnecessary and, in this case, harmful.
-Steve
On Sep 18, 2008, at 12:22 PM, Nick Cox wrote:
This came up on the list a while back.
Suppose you mistype the local macro reference. Say you mean to type
save `myfile', replace
but you have a minute brainstorm and you type `myfil'. Further suppose
that local macro `myfil' is not defined. Then Stata sees
save, replace
which to Stata is perfectly legal and intelligible. Stata will overwrite
the original data file, which is not what you intended at all. Of
course, typos here and there can have all sorts of consequences, all of
which are strictly your fault, but this one could be catastrophic if
what you had in memory was only a small part of the data or nothing to
do with the dataset you last read in.
There may be other reasons for not doing this, but that's one.
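One possible guard against the failure just described, sketched under assumptions: the wrapper name safesave is hypothetical (it is not a Stata command and was not proposed in this thread), and the auto dataset is loaded only so that there is something in memory to save. The wrapper refuses to save when the filename it receives is empty, which is what an undefined or mistyped local macro expands to.

```stata
* Sketch only: -safesave- is a hypothetical helper, not a built-in command.
capture program drop safesave
program define safesave
    args filename
    * An undefined or mistyped macro arrives here as an empty string
    if `"`filename'"' == "" {
        display as error "safesave: empty filename; nothing saved"
        exit 198
    }
    save `"`filename'"', replace
end

sysuse auto, clear        // example data so that -save- has something to write
tempfile myfile
safesave "`myfile'"       // saves to the temporary file
safesave "`myfil'"        // typo: stops with an error instead of running -save, replace-
```

The filename is passed in double quotes so that an empty macro still reaches the program as an (empty) argument. The last line is left in deliberately to show the failure; a do-file would stop there.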
Nick
[email protected]
Joseph Coveney
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/