Discussion:
"Sometimes Floating Point Math is Perfect"
Lynn McGuire
2017-07-12 22:47:38 UTC
"Sometimes Floating Point Math is Perfect"

https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/

Interesting. We moved to 64 bit doubles a couple of decades ago and
have never regretted it.

Lynn
Thomas Jahns
2017-07-13 08:14:40 UTC
Post by Lynn McGuire
"Sometimes Floating Point Math is Perfect"
https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/
Interesting. We moved to 64 bit doubles a couple of decades ago and have never
regretted it.
Not really applicable in the comp.lang.fortran forum: Fortran leaves enough
of REAL arithmetic unspecified that compilers are free to interpret code in
ways that give results differing from the IEEE specification. One example:
the Intel compiler (when optimizing at all) can evaluate an expression at
compile time and get a different result than the same expression at run time.

C and C++ have tighter rules, but that effectively also leaves less room
for optimization.

Thomas
bitrex
2017-07-13 12:23:20 UTC
Post by Lynn McGuire
"Sometimes Floating Point Math is Perfect"
https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/
Interesting. We moved to 64 bit doubles a couple of decades ago and
have never regretted it.
Lynn
"If the two constants being added had been exact then there would only
have been one rounding in the calculation and the result would have
matched the literal on the right-hand side."

What does "exact" mean in this context? How is writing 98432341293.375 +
0.000244140625 more "exact" than 0.2 + 0.3?
Fred.Zwarts
2017-07-13 13:51:30 UTC
Post by Lynn McGuire
"Sometimes Floating Point Math is Perfect"
https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/
Interesting. We moved to 64 bit doubles a couple of decades ago and have
never regretted it.
Lynn
"If the two constants being added had been exact then there would only have
been one rounding in the calculation and the result would have matched the
literal on the right-hand side."
What does "exact" mean in this context? How is writing 98432341293.375 +
0.000244140625 more "exact" than 0.2 + 0.3?
Exact means that the decimal representation can be converted exactly to a
binary representation.
Since 10 can be divided by 5, 1/5 can be represented exactly in decimal
notation (0.2).
But since 2 cannot be divided by 5, 1/5 cannot be represented exactly in
binary notation.
(Just as 1/3 cannot be represented exactly by a decimal notation (because 3
is not a divisor of 10) but needs rounding: 0.3333333333....)
So, 0.2 cannot be represented exactly by a binary floating point number.
0.000244140625, however, can be represented exactly as a binary floating
point number.
bitrex
2017-07-13 14:16:26 UTC
Post by Fred.Zwarts
Exact means that the decimal representation can be converted exactly to
a binary representation.
Since 10 can be divided by 5, 1/5 can be represented exactly in decimal
notation (0.2).
But since 2 cannot be divided by 5, 1/5 cannot be represented exactly in
binary notation.
(Just as 1/3 cannot be represented exactly by a decimal notation
(because 3 is not a divisor of 10) but needs rounding: 0.3333333333....)
So, 0.2 cannot be represented exactly by a binary floating point number.
0.000244140625, however, can be represented exactly as a binary floating
point number.
I thought it might be something like that; unfortunately my ability to
mentally do decimal to binary conversions on the fly tops out around 900
million
s***@casperkitty.com
2017-07-13 15:37:04 UTC
Post by bitrex
I thought it might be something like that; unfortunately my ability to
mentally do decimal to binary conversions on the fly tops out around 900
million
The simplest thing to remember is that 1/15 in binary is 0.000100010001...,
and so fractions from 2/15 to 14/15 can easily be computed by writing the
numerator as four bits and repeating it. Powers of two, of course, are
accommodated by shifting. Since 1/10 is 3/(15*2), start by computing
3/15 (i.e. 0.001100110011...) and then shift right one place, yielding
0.0001100110011.... More generally, values less than 1, of the form
k/((2**n)-1), can be written by repeating k as an n-bit group, but beyond
1/7 (i.e. 0.001001001...) and 1/15 (0.000100010001...) such an approach is
often difficult by hand. 1/100, for example, requires computing 1/25, i.e.
41943/1048575, which is .00001010001111010111 00001010001111010111....
Not totally impossible, but not exactly easy.
David Brown
2017-07-13 16:23:42 UTC
Post by s***@casperkitty.com
Post by bitrex
I thought it might be something like that; unfortunately my ability to
mentally do decimal to binary conversions on the fly tops out around 900
million
The simplest thing to remember is that 1/15 in binary is 0.000100010001...,
and so fractions from 2/15 to 14/15 can easily be computed by writing the
numerator as four bits and repeating it. Powers of two, of course, are
accommodated by shifting. Since 1/10 is 3/(15*2), start by computing
3/15 (i.e. 0.001100110011...) and then shift right one place, yielding
0.0001100110011.... More generally, values less than 1, of the form
k/((2**n)-1), can be written by repeating k as an n-bit group, but beyond
1/7 (i.e. 0.001001001...) and 1/15 (0.000100010001...) such an approach is
often difficult by hand. 1/100, for example, requires computing 1/25, i.e.
41943/1048575, which is .00001010001111010111 00001010001111010111....
Not totally impossible, but not exactly easy.
Now, you'll remember that for next time, won't you Bitrex? After all,
it is the simplest thing to remember...
Ben Bacarisse
2017-07-13 14:04:36 UTC
Post by bitrex
Post by Lynn McGuire
"Sometimes Floating Point Math is Perfect"
https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/
Interesting. We moved to 64 bit doubles a couple of decades ago and
have never regretted it.
Lynn
"If the two constants being added had been exact then there would only
have been one rounding in the calculation and the result would have
matched the literal on the right-hand side."
What does "exact" mean in this context?
In this context it means... er... exact -- without any error.
Post by bitrex
How is writing 98432341293.375 + 0.000244140625 more "exact" than 0.2
+ 0.3?
Neither the quoted paragraph nor the blog post uses the term "more
exact". What's more, the blog post does not compare 98432341293.375 +
0.000244140625 with 0.2 + 0.3, but with 0.1 + 0.2.

98432341293.375 is binary 1011011101011000001100101010100101101.011.
This can be represented exactly (without error) in a C double and, if
the C implementation conforms to the IEEE floating-point
recommendations, it will be exactly represented. Likewise,
0.000244140625 is exactly .000000000001 and the rules of IEEE arithmetic
say that the sum must be rounded to the nearest (binary) digit. In
fact, the sum can be represented exactly in a C double, so "rounding to
the nearest binary digit" means, in this case, giving the exact answer.

With 0.1 + 0.2 there are three places where accuracy is lost. 0.1 can
not be represented exactly in a binary double and neither can 0.2. Both
will be represented by the nearest possible floating-point number, but
neither is exact. Finally, the sum of those two closest-but-not-quite
numbers can not be exactly represented either, giving a third loss of
accuracy.

I leave your example, 0.2 + 0.3, for you to analyse yourself.
--
Ben.
bitrex
2017-07-13 14:10:55 UTC
Post by Ben Bacarisse
In this context it means... er... exact -- without any error.
Post by bitrex
How is writing 98432341293.375 + 0.000244140625 more "exact" than 0.2
+ 0.3?
Neither the quoted paragraph nor the blog post uses the term "more
exact". What's more, the blog post does not compare 98432341293.375 +
0.000244140625 with 0.2 + 0.3, but with 0.1 + 0.2.
98432341293.375 is binary 1011011101011000001100101010100101101.011.
Ah, naturally. Thanks for clearing that up
Siri Cruise
2017-07-13 15:08:03 UTC
Post by bitrex
Post by Lynn McGuire
"Sometimes Floating Point Math is Perfect"
https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/
Interesting. We moved to 64 bit doubles a couple of decades ago and
have never regretted it.
Lynn
"If the two constants being added had been exact then there would only
have been one rounding in the calculation and the result would have
matched the literal on the right-hand side."
What does "exact" mean in this context? How is writing 98432341293.375 +
0.000244140625 more "exact" than 0.2 + 0.3?
Any real number of the form m*2^n, for integers m and n, can be represented
exactly in radix two (the near-universal radix of computer numbers) using
about log2 m + log2 n digits. These numbers are closed under addition,
subtraction, and multiplication, but not under division.

Such a number can be represented exactly as a machine real number if the
representation has at least log2 m fraction bits and log2 n exponent bits.

0.2 = 2*10^-1 = 2 * 2^-1 * 5^-1 = 1/5, and division by 5 does not stay
within this set. Same for 0.3.
bitrex
2017-07-13 16:50:33 UTC
Post by Siri Cruise
Post by bitrex
What does "exact" mean in this context? How is writing 98432341293.375 +
0.000244140625 more "exact" than 0.2 + 0.3?
Any real number of the form m*2^n, for integers m and n, can be represented
exactly in radix two (the near-universal radix of computer numbers) using
about log2 m + log2 n digits. These numbers are closed under addition,
subtraction, and multiplication, but not under division.
Such a number can be represented exactly as a machine real number if the
representation has at least log2 m fraction bits and log2 n exponent bits.
0.2 = 2*10^-1 = 2 * 2^-1 * 5^-1 = 1/5, and division by 5 does not stay
within this set. Same for 0.3.
Thanks. I was unsure because, IIRC, the article never stated explicitly
that the author was talking about the ordinary properties of binary
floating-point representations rather than something specific to IEEE-754
(which I'm not terribly familiar with). I thought that paragraph (and the
example numbers the author used) wasn't particularly well worded, but my
educational background isn't in computer science.
Chris M. Thomasson
2017-07-14 02:38:32 UTC
Post by Lynn McGuire
"Sometimes Floating Point Math is Perfect"
https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/
Interesting. We moved to 64 bit doubles a couple of decades ago and
have never regretted it.
Thank you for posting this Lynn. :)
Lynn McGuire
2017-07-14 18:39:22 UTC
Post by Chris M. Thomasson
Post by Lynn McGuire
"Sometimes Floating Point Math is Perfect"
https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/
Interesting. We moved to 64 bit doubles a couple of decades ago and
have never regretted it.
Thank you for posting this Lynn. :)
You are welcome.

Lynn
