st: RE: Floating-point reals, hexadecimal and decimals

From	[email protected] (William Gould, Stata)
To	[email protected]
Subject	st: RE: Floating-point reals, hexadecimal and decimals
Date	Mon, 07 Jul 2003 08:13:09 -0500

Patrick Joly <[email protected]> asked about writing a Perl script to 
read/write Stata's missing values and then later, answered his own question.

Patrick is obviously facile at working with IEEE floating-point numbers, 
which is how our computers store numbers such as 1.5, 3.14159, etc.
Just in case any of you ever have to work directly with the IEEE format, 
I wanted to tell you about one documented feature and four undocumented
features in Stata that can help with understanding.

The documented feature is the %21x format.  It displays floating-point numbers
in a "readable" way that is a bit-by-bit accurate representation of the 
number the computer really has stored:

        . display %21x 1.5
        +1.8000000000000X+000

Numbers are displayed in hexadecimal multiplied by a power of 2:

        +1.8000000000000X+000
         ---------------  ---
         base 16 number    \
                            power of 2

         Ergo, the above number is (1 + 8/16) * 2^0  =  1.5 in decimal

Pi in the %21x format looks like this:

        . display %21x _pi
        +1.921fb54442d18X+001
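
For anyone who wants to reproduce this notation outside Stata, here is a
minimal Python sketch.  The name to_21x is hypothetical, and the 3-digit
hexadecimal exponent field is an assumption (the examples in this post, +000
and +001, look the same in hex and decimal).  It has been checked only
against the two values shown above, not against zero or denormalized numbers.

```python
import math

def to_21x(value: float) -> str:
    """Render a finite double in a %21x-like notation (illustrative only)."""
    if not math.isfinite(value):
        raise ValueError("only finite values are handled in this sketch")
    sign = "-" if math.copysign(1.0, value) < 0 else "+"
    # float.hex() gives e.g. '0x1.8000000000000p+0'; drop the '0x' prefix
    digits, exponent = abs(value).hex()[2:].split("p")
    whole, _, frac = digits.partition(".")
    exp = int(exponent)
    exp_sign = "-" if exp < 0 else "+"
    # Assumption: the exponent is written as 3 hexadecimal digits
    return f"{sign}{whole}.{frac.ljust(13, '0')}X{exp_sign}{abs(exp):03x}"

print(to_21x(1.5))      # +1.8000000000000X+000
print(to_21x(math.pi))  # +1.921fb54442d18X+001
```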

The %21x format is an accurate representation, but it is nonetheless a
translation of the IEEE format.  The point of %21x is really numerical
analysis.  If one is going to analyze the round-off error in some calculated
result, it is really best to look at the number in the same base the computer
uses.  For instance, in %21x, one can readily see the effects of using 
float precision:

        . display %21x float(_pi)
        +1.921fb60000000X+001

This is rather lost in the base-10 translation, where _pi in %18.0g is
3.141592653589793 and float(_pi) is 3.141592741012573:

                                    Value of Pi
                          in %21x                 in %18.0g
        --------------------------------------------------------
        double      +1.921fb54442d18X+001      3.141592653589793 
        float       +1.921fb60000000X+001      3.141592741012573
                             -------
                                \
                                 float loses 7 hex digits
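
The same comparison can be sketched in Python, which is an assumption of
this example and not something Stata provides.  Packing a double into 4
bytes and unpacking it again rounds it to float precision, which plays the
role of float() here.  The helper name round_to_float is hypothetical.

```python
import math
import struct

def round_to_float(value: float) -> float:
    """Round a double to the nearest 4-byte IEEE float, returned as a double."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

as_float = round_to_float(math.pi)

print(math.pi.hex())   # 0x1.921fb54442d18p+1
print(as_float.hex())  # 0x1.921fb60000000p+1
print(repr(as_float))  # 3.1415927410125732
```

In hex the trailing zeros make the lost precision obvious; in decimal the
two values simply look like different 16-digit numbers.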

As another example of the use of %21x:  How much inaccuracy is there in 
the calculation sqrt(2)^2?

        . display %21x 2 _n %21x sqrt(2)^2
        +1.0000000000000X+001
        +1.0000000000001X+001

Answer:  1 bit.
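
One way to confirm the 1-bit answer outside Stata is to compare the two
doubles as 64-bit integers.  This is a Python sketch; the helper name bits
is hypothetical, and the subtraction is meaningful only because both values
are positive and finite.

```python
import math
import struct

def bits(value: float) -> int:
    """Return the 64 IEEE bits of a double as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]

root = math.sqrt(2.0)
computed = root * root          # sqrt(2)^2 in double precision

print(computed.hex())               # 0x1.0000000000001p+1
print(bits(computed) - bits(2.0))   # 1
```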

By the way, %21x can be used as an *INPUT* format as well as an output format
in Stata, and you can even use it in expressions:

        . display (1.921X+1)/2
        1.5705566

This is a great way to introduce constants in programs and be sure that you
are using the same constant across platforms.
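
Going the other way, a constant written in this notation can be read by an
outside program.  The Python sketch below is an illustration only: parse_21x
is a hypothetical name, and it assumes the exponent after the X is written
in hexadecimal.

```python
def parse_21x(text: str) -> float:
    """Convert a string such as '+1.921fb54442d18X+001' to a double."""
    mantissa, exponent = text.strip().split("X")
    # Assumption: the exponent field is hexadecimal, so parse it base 16.
    # float.fromhex() expects a decimal power of 2 after 'p'.
    return float.fromhex(f"{mantissa}p{int(exponent, 16)}")

print(parse_21x("+1.8000000000000X+000"))  # 1.5
print(parse_21x("1.921X+1") / 2)           # 1.570556640625
```

The second line is the same calculation as the display example above, which
Stata shows rounded to 1.5705566.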

The %21x notation was "invented" here at Stata.  At least, I have never seen
this compact notation used in any computer science or numerical analysis book.

The four undocumented formats I want to mention are %16H, %16L, %8H, and %8L.
These show exactly the bits of the floating-point number in IEEE format.
%16H and %16L show the number in 8-byte format (double); %8H and %8L show 
the number in 4-byte (float) format.  H shows the number written from 
left-to-right (most significant byte first, i.e., big-endian) as is done by
Suns and Macintoshes; L shows the number written from right-to-left (least
significant byte first, i.e., little-endian) as is done by Intel-based
computers.

The %16H format is almost readable (if you know what to look for); the %16L
nearly always confuses me because you read right-to-left little chunks that
are themselves written left-to-right; and I find the %8H and %8L formats
unreadable because of a bit-shift in the IEEE 4-byte format (its 1 sign bit
and 8 exponent bits leave the 23-bit mantissa starting in the middle of a
hex digit).  It does not matter, however, because this is what the computer
wants to see:

        . display %16H _pi
        400921fb54442d18

        . display %8H _pi
        40490fdb

        . display %16L _pi
        182d4454fb210940

        . display %8L _pi
        db0f4940

Patrick Joly might have found these last four formats useful were he not 
so facile with IEEE format.  It is convenient that Stata includes both H and 
L, so one does not have to visit different computers to see the numbers 
left-to-right or right-to-left.  The %{8|16}{H|L} formats cannot be used 
as input formats, however.
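
For a script that has to read or write these bytes itself, the four displays
above can be reproduced with Python's struct module.  This is a sketch under
the assumption that H corresponds to big-endian and L to little-endian byte
order; the function name ieee_hex and its arguments are hypothetical.

```python
import math
import struct

def ieee_hex(value: float, size: int = 8, order: str = "H") -> str:
    """Hex dump of value as an IEEE double (size=8) or float (size=4).

    order="H" writes the most significant byte first (big-endian);
    order="L" writes the least significant byte first (little-endian).
    """
    if size not in (4, 8) or order not in ("H", "L"):
        raise ValueError("size must be 4 or 8, order must be 'H' or 'L'")
    code = (">" if order == "H" else "<") + ("d" if size == 8 else "f")
    return struct.pack(code, value).hex()

print(ieee_hex(math.pi, 8, "H"))  # 400921fb54442d18
print(ieee_hex(math.pi, 4, "H"))  # 40490fdb
print(ieee_hex(math.pi, 8, "L"))  # 182d4454fb210940
print(ieee_hex(math.pi, 4, "L"))  # db0f4940
```

Reading is the reverse: struct.unpack with the same format code turns the
bytes back into a number.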

-- Bill
[email protected]
