Sylvain Friederich <[email protected]> has solved the problem
with float rounding and promoted his float variable to double using
. gen double fixed = round(price*100,1)/100
About the data and the fix, Sylvain writes,
> The variable I am considering is a share price. It will not take on very
> large values, and can have no more than two-digit decimal precision. [...]
>
> . list price in 1/6
>
> 417.8
> 418.68
> 418.9
> 419.28
> 425.35
> 426.55
The reason Sylvain wanted to promote price from float to double was
> [...]although the Editor displays them as above, clicking on those cells
> shows that Stata really holds them as:
>
> 417.7999
> 418.67999
> 418.89999
> 419.28
> 425.35001
> 426.54999
Sylvain asks,
> For the sake of completeness, would my original intuition (coarse though it
> may have been) of outsheeting and re-insheeting the dataset right away with
> the "double" option have worked?
Yes, it would have worked. Displayed with the default %9.0g, the values were
rounded in the desired way, so after -outfile-ing, -infile- would have
seen 417.8, 418.68, 418.9, ...
If Sylvain's desire to promote the variable was based solely on a desire
to have the editor show the full values as 417.8, 418.68, 418.9, ..., rather
than 417.7999, 418.67999, 418.89999, ..., then I have no qualms. The error,
however, was never large. Taking the 418.68 case, the float value stored
differed from the desired 418.68 by about 0.000007324219 (absolutely) and
0.000000017494 (relatively). Relative error is what matters in statistical
calculations, and an error of that size is too small to matter.
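
For anyone who wants to check that arithmetic, here is a minimal sketch.
It relies on float(), which rounds its argument to float precision, and on
scalars being held as doubles. The scalar names (stored, abserr, relerr)
are mine and purely illustrative.

```stata
* Sketch: distance between 418.68 and the nearest float to it.
* float() returns the value rounded to float precision.
scalar stored = float(418.68)

* Absolute error: how far the stored float is from the intended value.
scalar abserr = abs(scalar(stored) - 418.68)

* Relative error: absolute error as a fraction of the intended value.
scalar relerr = scalar(abserr) / 418.68

* Show the stored value and both errors with plenty of decimals.
display %16.10f scalar(stored)
display %16.12f scalar(abserr)
display %16.12f scalar(relerr)
```

Changing 418.68 to another price from the listing shows how the absolute
error moves around while the relative error stays at about the same order
of magnitude.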
All that said, we here at Stata have written do-files to do accounting
applications and, for those calculations, it is absolute error that matters.
Moreover, accountants, the IRS, etc., prefer sums be accurate to the penny.
When storing larger numbers, such as 418,680.02, the relative error remains
the same, approximately, but the absolute error grows, in this case to .01125,
which is too much for accountants. In financial datasets, the best way to
store dollar amounts is in pennies as integers. This abolishes round-off
error for addition and subtraction, and a -long- takes no more memory than a
-float-. If amounts need to exceed $21 million, then use doubles (and still
record pennies).
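
A rough sketch of what I mean, under some assumptions: -amount- is a
hypothetical variable of dollars and cents that was read in as a double
(or is small enough that float rounding is well under half a penny), and
the names pennies, total_pennies, and total_dollars are made up for
illustration.

```stata
* Absolute error when 418,680.02 is squeezed into a float.
display %12.5f abs(float(418680.02) - 418680.02)

* Store the amount as whole pennies. A -long- is 4 bytes, same as a -float-.
* round() with one argument rounds to the nearest integer.
generate long pennies = round(amount*100)

* If amounts can exceed about $21 million, use a double instead:
*   generate double pennies = round(amount*100)

* Sums of integers are exact as long as they stay in range of the type,
* so accumulate the total in a double.
egen double total_pennies = total(pennies)

* Convert back to dollars only for display.
generate double total_dollars = total_pennies/100
format total_dollars %15.2fc
```

The point of the design is that all the adding and subtracting happens on
integers, and the division by 100 happens once, at the end, for reporting.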
-- Bill
[email protected]