Post by bitrex
Post by Lynn McGuire
"Sometimes Floating Point Math is Perfect"
Interesting. We moved to 64 bit doubles a couple of decades ago and
have never regretted it.
"If the two constants being added had been exact then there would only
have been one rounding in the calculation and the result would have
matched the literal on the right-hand side."
What does "exact" mean in this context?
In this context it means... er... exact -- without any error.
Post by bitrex
How is writing 98432341293.375 + 0.000244140625 more "exact" than 0.2
Neither the quoted paragraph nor the blog post uses the term "more
exact". What's more, the blog post does not compare 98432341293.375 +
0.000244140625 with 0.2 + 0.3, but with 0.1 + 0.3.
98432341293.375 is binary 1011011101011000001100101010100101101.011.
This can be represented exactly (without error) in a C double and, if
the C implementation conforms to the IEEE floating-point
recommendations, it will be exactly represented. Likewise,
0.000244140625 is exactly binary .000000000001 (that is, 2^-12), and the
rules of IEEE arithmetic say that the sum must be rounded to the nearest
representable value. In fact, the sum can be represented exactly in a C
double, so "rounding to the nearest" means, in this case, giving the
exact answer.
With 0.1 + 0.2 there are three places where accuracy is lost. 0.1 cannot
be represented exactly in a binary double and neither can 0.2. Both
will be represented by the nearest possible floating-point number, but
neither is exact. Finally, the sum of those two closest-but-not-quite
numbers cannot be exactly represented either, giving a third loss of
accuracy. I leave your example, 0.2 + 0.3, for you to analyse yourself.