Dear All,
As we know, Stata is backward compatible in reading its own files.
This is possible because the version of the file format is stored
within the data file. When Stata reads a file, it first checks the
version stored in the file, then processes the file according to the
corresponding routine. If the file was saved by a version of Stata
newer than the one reading it, no such routine is found and Stata
stops with error r(610), file "something" not Stata format,
e.g.:
copy http://www.stata-press.com/data/r10/auto.dta x:\delme.dta
use x:\delme.dta
(don't use -webuse- for this example).
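To see the stored version for yourself, here is a rough sketch that
peeks at the start of the file with the -file- command. It assumes the
older .dta layout, in which the very first byte holds the format
number (113 for files written by Stata 8 and 9, 114 for Stata 10), and
it reuses the x:\delme.dta path from the example above. I have not
tested it on every release, so treat it as an illustration only:

```stata
* Sketch: read the format number from the first byte of a .dta file.
* Assumption: older .dta layout, where byte 1 is the format number.
tempname fh fmt
file open `fh' using `"x:\delme.dta"', read binary
* %1bu = one unsigned byte, stored in the temporary scalar `fmt'
file read `fh' %1bu `fmt'
file close `fh'
display "dta format number: " `fmt'
```

A number higher than the one your own Stata writes would suggest the
file came from a newer release.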
This is good and desired. However, not all Stata commands seem to
follow this logic. The following do-file will crash Stata 9.2
(Windows, last update applied):
/* -------------- begin of MergeBug.do ------------------------------*/
clear
tempfile auto10
copy `"http://www.stata-press.com/data/r10/auto.dta"' `"`auto10'"'
/* sorted by foreign */
sysuse auto,clear
merge foreign using `"`auto10'"'
/* -------------- end of MergeBug.do ------------------------------*/
Neither -merge- nor -append- seems to check the version correctly:
both will crash Stata completely (_with data loss_, i.e. Stata will
close) if the file specified after -using- is in an unknown format
(e.g. Stata 10). Perhaps this is the result of a trade-off between
performance and robustness, where the checks were dropped in favour
of faster merging/appending?
This situation can easily occur when a team of researchers works on a
common project and decides to merge their data at some point, but some
of them have already upgraded their software to Stata 10, while others
are still working with Stata 9.
Since there will be no further updates for Stata 9 from StataCorp, I
suggest the following workaround: when working with an outdated Stata
(version 9, 8 or older), precede -merge- and -append- statements with
-describe-. The -describe- command seems to check the version
correctly: it will abort the do-file with an error, but will keep
Stata running:
....
quietly describe using `"`auto10'"', short
merge foreign using `"`auto10'"'
....
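If you would rather handle the failure yourself than let the do-file
stop on the raw error, the same check can be wrapped in -capture-. This
is only a sketch: it assumes the local `auto10' is defined as in
MergeBug.do above, and it relies on -describe using- failing cleanly,
as it appears to do here; -capture- by itself cannot protect against a
crash of Stata.

```stata
* Sketch: test the using file first; stop with a message if unreadable.
capture describe using `"`auto10'"', short
if _rc != 0 {
    display as error "cannot read the using file, return code " _rc
    exit _rc
}
* Reached only if -describe using- accepted the file.
merge foreign using `"`auto10'"'
```

The same guard works in front of -append-.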
All commands (including -use-) are prone to crashing Stata if the data
file is ill-formed or incomplete (e.g. because of a transmission
error). In the worst case, the file may be read in but cause a crash
later, when the values are accessed or modified (I don't have a robust
example of this yet): e.g. -use- succeeded, but -list- failed after
that. Perhaps this can be fixed in Stata 10.2? Also, IMHO it is
desirable that -describe using- reports the version of Stata needed
to read the file.
Thank you,
Sergiy Radyakin