Patrick Joly <[email protected]> asked about writing a Perl script to
read/write Stata's missing values and then later answered his own question.
Patrick is obviously facile at working with IEEE floating-point numbers,
which is how our computers store numbers such as 1.5, 3.14159, etc.
Just in case any of you ever have to work directly with the IEEE format,
I wanted to tell you about one documented feature and four undocumented
features in Stata that can help with understanding.
The documented feature is the %21x format. It displays floating-point numbers
in a "readable" way that is a bit-by-bit accurate representation of the
number the computer really has stored:
. display %21x 1.5
+1.8000000000000X+000
Numbers are displayed in hexadecimal multiplied by a power of 2:
+1.8000000000000X+000
 ---------------  ---
 base 16 number      \
                      power of 2
Ergo, the above number is (1 + 8/16) * 2^0 = 1.5 in decimal.
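For anyone who wants to check this arithmetic outside Stata, here is a minimal Python sketch that builds the same notation from a double. The name to_x21 is made up for illustration and is not part of Stata. The sketch handles only finite, nonzero, normalized numbers, and writing the exponent as three hexadecimal digits is an assumption: the examples in this post (+000 and +001) look the same in hex and in decimal.

```python
import math

def to_x21(value):
    """Format a finite, nonzero, normalized double like the %21x examples.

    Hypothetical helper for illustration; not part of Stata.
    """
    sign = "-" if value < 0 else "+"
    mant, exp = math.frexp(abs(value))   # abs(value) == mant * 2**exp, 0.5 <= mant < 1
    mant, exp = mant * 2.0, exp - 1      # rescale so that 1 <= mant < 2
    frac = int((mant - 1.0) * 2**52)     # the 52 fraction bits as an integer (exact)
    exp_sign = "-" if exp < 0 else "+"
    # Assumption: the exponent is written as three hexadecimal digits.
    return f"{sign}1.{frac:013x}X{exp_sign}{abs(exp):03x}"

print(to_x21(1.5))        # +1.8000000000000X+000
print((1 + 8/16) * 2**0)  # 1.5
```

The 13 hex digits after the point are the 52 fraction bits of the double, which is why the notation is bit-for-bit accurate.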
Pi in the %21x format looks like this:
. display %21x _pi
+1.921fb54442d18X+001
The %21x format is an accurate representation, but it is nonetheless a
translation of the IEEE format. The point of %21x is really numerical
analysis. If one is going to analyze the round-off error in some calculated
result, it is really best to look at the number in the same base the computer
uses. For instance, in %21x, one can readily see the effects of using
float precision:
. display %21x float(_pi)
+1.921fb60000000X+001
This is rather lost in the base-10 translation, where _pi in %18.0g is
3.141592653589793 and float(_pi) is 3.141592741012573:
                      Value of Pi
               in %21x                in %18.0g
--------------------------------------------------------
double   +1.921fb54442d18X+001    3.141592653589793
float    +1.921fb60000000X+001    3.141592741012573
                  -------
                         \
                          float loses 7 hex digits
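The same comparison can be sketched in Python, assuming that rounding through a 4-byte struct field is an acceptable stand-in for Stata's float() function. The helper name to_float is hypothetical. Python's float.hex() prints a related notation with a 0x prefix and p in place of X.

```python
import math
import struct

def to_float(x):
    """Round a double to IEEE single precision and return it as a double.

    Hypothetical helper; a stand-in for Stata's float(), not its implementation.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]

print(math.pi.hex())            # 0x1.921fb54442d18p+1
print(to_float(math.pi).hex())  # 0x1.921fb60000000p+1
```

The trailing zeros in the second line are the fraction bits that single precision cannot hold.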
As another example of the use of %21x: How much inaccuracy is there in
the calculation sqrt(2)^2?
. display %21x 2 _n %21x sqrt(2)^2
+1.0000000000000X+001
+1.0000000000001X+001
Answer: 1 bit.
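As a cross-check outside Stata, this small Python sketch counts the difference by comparing the raw 64-bit patterns of the two doubles. The helper name bits is made up for illustration, and the square is computed as root * root rather than with a power function so that the result depends only on IEEE multiplication.

```python
import math
import struct

def bits(x):
    """Return the 64-bit pattern of a double as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

root = math.sqrt(2)
squared = root * root

print((2.0).hex())                # 0x1.0000000000000p+1
print(squared.hex())              # 0x1.0000000000001p+1
print(bits(squared) - bits(2.0))  # 1
```

Subtracting bit patterns like this only makes sense for two finite numbers of the same sign, which is the case here.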
By the way, %21x can be used as an *INPUT* format as well as an output format
in Stata, and you can even use it in expressions:
. display (1.921X+1)/2
1.5705566
This is a great way to introduce constants in programs and be sure that you
are using the same constant across platforms.
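If the same constants have to be read by a program outside Stata, a parser is short. The following is a minimal Python sketch under stated assumptions: from_x21 is a hypothetical name, the mantissa is taken to be hexadecimal, and the exponent is taken to be a hexadecimal power of 2 (the examples in this post do not distinguish hex from decimal exponents).

```python
import math

def from_x21(text):
    """Parse text such as '+1.921fb54442d18X+001' or '1.921X+1' into a float.

    Hypothetical helper; assumes a hex mantissa and a hex exponent of 2.
    """
    mant_text, exp_text = text.strip().upper().split("X")
    negative = mant_text.startswith("-")
    mant_text = mant_text.lstrip("+-")
    whole, _, frac = mant_text.partition(".")
    # Treat the digits as one hex integer, then move the hex point back.
    mantissa = int(whole + frac, 16) / 16 ** len(frac)
    value = math.ldexp(mantissa, int(exp_text, 16))
    return -value if negative else value

print(from_x21("1.921X+1") / 2)                        # 1.570556640625
print(from_x21("+1.921fb54442d18X+001") == math.pi)    # True
```

The first result matches the 1.5705566 that Stata displays above, shown to more digits.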
The %21x notation was "invented" here at Stata. At least, I have never seen
this compact notation used in any computer science or numerical analysis book.
The four undocumented formats I want to mention are %16H, %16L, %8H, and %8L.
These are exactly the bits of the floating-point number in IEEE format.
%16H and %16L show the number in 8-byte format (double); %8H and %8L show
the number in 4-byte (float) format. H shows the number written from
left-to-right (most significant byte first) as is done by Suns and
Macintoshes; L shows the number written from right-to-left (least
significant byte first) as is done by Intel-based computers.
The %16H format is almost readable (if you know what to look for). The %16L
format nearly always confuses me because you read right-to-left little chunks
that are themselves written left-to-right, and I find the %8H and %8L formats
unreadable because of a bit-shift in the IEEE 4-byte format. It does not
matter, however, because this is what the computer wants to see:
. display %16H _pi
400921fb54442d18
. display %8H _pi
40490fdb
. display %16L _pi
182d4454fb210940
. display %8L _pi
db0f4940
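To see what to look for in the H output, here is a minimal Python sketch that splits the hex strings into the standard IEEE sign, exponent, and fraction fields. The function names are hypothetical, and the sketch covers normalized numbers only. In the 8-byte layout the fraction starts on a hex-digit boundary; in the 4-byte layout it has 23 bits and starts one bit off, which is the bit-shift mentioned above.

```python
def double_fields(hex16):
    """Split a big-endian 16-hex-digit double into (sign, exponent, fraction)."""
    bits = int(hex16, 16)
    sign = bits >> 63
    exponent = ((bits >> 52) & 0x7FF) - 1023   # remove the exponent bias
    fraction = bits & ((1 << 52) - 1)          # 52 bits = 13 hex digits
    return sign, exponent, f"{fraction:013x}"

def float_fields(hex8):
    """Split a big-endian 8-hex-digit float into (sign, exponent, fraction)."""
    bits = int(hex8, 16)
    sign = bits >> 31
    exponent = ((bits >> 23) & 0xFF) - 127     # remove the exponent bias
    fraction = bits & ((1 << 23) - 1)          # 23 bits, not a whole number of hex digits
    # Shift left one bit so the fraction lines up with the %21x digits.
    return sign, exponent, f"{fraction << 1:06x}"

print(double_fields("400921fb54442d18"))  # (0, 1, '921fb54442d18')
print(float_fields("40490fdb"))           # (0, 1, '921fb6')
```

Both results agree with the %21x displays of _pi and float(_pi): fraction digits 921fb54442d18 and 921fb6, each with exponent +1.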
Patrick Joly might have found these last four formats useful were he not
so facile with IEEE format. It is convenient that Stata includes both H and
L, so one does not have to visit different computers to see the numbers
left-to-right or right-to-left. The %{8|16}{H|L} formats cannot be used
as input formats, however.
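For a script that has to go in both directions, most languages can pack and unpack these bytes directly. Here is a minimal Python sketch using the standard struct module. The helper names are hypothetical, and the mapping of ">" to the H formats and "<" to the L formats is inferred from the outputs shown above rather than from any Stata documentation.

```python
import math
import struct

def ieee_hex(value, fmt):
    """Pack a number with the given struct format and return the bytes as hex."""
    return struct.pack(fmt, value).hex()

def from_ieee_hex(text, fmt):
    """Turn a hex string back into a number using the given struct format."""
    return struct.unpack(fmt, bytes.fromhex(text))[0]

print(ieee_hex(math.pi, ">d"))  # 400921fb54442d18  (compare %16H)
print(ieee_hex(math.pi, ">f"))  # 40490fdb          (compare %8H)
print(ieee_hex(math.pi, "<d"))  # 182d4454fb210940  (compare %16L)
print(ieee_hex(math.pi, "<f"))  # db0f4940          (compare %8L)

print(from_ieee_hex("182d4454fb210940", "<d") == math.pi)  # True
```

Reading the 8-byte forms back recovers the double exactly; the 4-byte forms recover only the float-precision value.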
-- Bill
[email protected]