The Stata 7 manual (U 16.10) briefly mentions that some problems can
arise because Stata stores numbers in single precision (float) by
default but performs calculations in double precision.
I have found another situation where double precision is required in
Stata for numbers that would seem to be safely manipulated in
single precision.
Consider a typical (hypothetical) Social Security number stored as a
string variable.
123-45-6789
One might wish to convert this to a numeric variable for easier
manipulation, such as sorting in a more appealing manner.
If one parses the SSN into its numeric components, multiplies them up
to the appropriate scale, and then adds them back together, the result
is a bit surprising: a float carries only about seven significant
digits, too few for a nine-digit SSN. This process is better done in
double precision to get the expected result.
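The underlying limit can be checked directly. The lines below are a
minimal sketch, not part of the original session. They use Stata's
float() function, which rounds its argument to float precision, to show
what happens to integers above 2^24 = 16,777,216, the point up to which
a float represents every integer exactly.

```stata
* 2^24: still exactly representable as a float
display %12.0f float(16777216)
* 2^24 + 1: not representable, so it is rounded to a neighboring float
display %12.0f float(16777217)
* the example SSN written as a single nine-digit number
display %12.0f float(123456789)
```

The second and third lines should not display the values typed in:
16777217 comes back as 16777216, and 123456789 comes back as 123456792,
the nearest value a float can hold.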
. list ssnchar /* from an external file of hypothetical SSNs */
/* parse the string variable ssnchar into its component parts, multiply
each part up to its position in the final number, and then add
the parts */
. gen double p1=real(substr(ssnchar,1,3))
. gen double p2=real(substr(ssnchar,5,2))
. gen double p3=real(substr(ssnchar,8,4))
. gen double ssndbl=p1*1000000+p2*10000+p3
. format ssndbl %9.0f
/* with double precision the results are as expected */
. list
         ssnchar    p1    p2     p3      ssndbl
  1. 123-45-6789   123    45   6789   123456789
  2. 987-65-4321   987    65   4321   987654321
  3. 078-94-5612    78    94   5612    78945612
  4. 321-65-7894   321    65   7894   321657894
  5. 978-54-6231   978    54   6231   978546231
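If the only goal is a numeric SSN, a shorter route that avoids the
arithmetic is to strip the hyphens and convert the remaining digits in
one step. This is a sketch of an alternative, not the approach used
above; ssn2 is a hypothetical variable name, and the double storage
type is still needed for the same reason.

```stata
* remove every "-" from ssnchar, then convert the nine digits to a number
gen double ssn2 = real(subinstr(ssnchar, "-", "", .))
format ssn2 %9.0f
* optional check against the parsed version;
* assert stops with an error if any observation differs
assert ssn2 == ssndbl
```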
/* if we omit the storage type, gen defaults to float, and the resulting
variable ssnf is not what might be anticipated */
. gen p1f=real(substr(ssnchar,1,3))
. gen p2f=real(substr(ssnchar,5,2))
. gen p3f=real(substr(ssnchar,8,4))
. gen ssnf=p1f*1000000+p2f*10000+p3f
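To see which observations are affected, one could compare the float
and double versions side by side. This is a minimal sketch that assumes
both ssndbl and ssnf from the commands above are still in memory; no
output is shown here because it depends on the data.

```stata
* show ssnf as a whole number rather than in the default general format
format ssnf %9.0f
* list only the observations where the float version differs from the double version
list ssnchar ssndbl ssnf if ssnf != ssndbl
* count how many observations are affected
count if ssnf != ssndbl
```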