Explanation of weird/strange floating point behaviors in C++

Revision en5, by pajenegod, 2020-05-31 16:03:51

TLDR: 32 bit g++ by default does all of its floating point arithmetic with (80 bit) long double.

I'm writing this blog because of the large number of blogs asking why people get strange floating point arithmetic behavior in C++. For example:

"WA using GNU C++17 (64) and AC using GNU C++17" https://codeforces.net/blog/entry/78094

"The curious case of the pow function" https://codeforces.net/blog/entry/21844

"Why does this happen?" https://codeforces.net/blog/entry/51884

"Why can this code work strangely?" https://codeforces.net/blog/entry/18005

and many many more.

The issue is caused by something called excess precision. In C and C++ there are different modes (referred to as methods) for how floating point arithmetic is done, see https://en.wikipedia.org/wiki/C99#IEEE_754_floating-point_support. You can detect which one is being used from the value of FLT_EVAL_METHOD, defined in cfloat. In mode 2 (which is what 32 bit g++ uses by default) all floating point arithmetic is done using long double. In mode 0 (which is what 64 bit g++ uses by default) each operation is done in its own type, so there is no excess precision.

Here is a simple example of how to detect excess precision (partly taken from https://stackoverflow.com/a/20870774)

Test for detecting excess precision

If b is rounded (as one would "expect", since it is a double), then the result is zero. Otherwise it is something like 8e-17 because of excess precision. I tried running this in custom invocation. MSVC (C++17), Clang and g++17 (64 bit) all use mode 0 and give 0, while the 32 bit g++11, g++14 and g++17, as expected, all use mode 2 and give 8e-17.

The culprit behind all of this misery is the old x87 instruction set, which only supports (80 bit) long double arithmetic. The modern solution is to use the SSE instruction set (version 2 or later) on top of this, since it supports both float and double arithmetic. On GCC you can turn this on with the flags -mfpmath=sse -msse2. This will not change the value of FLT_EVAL_METHOD, but it will effectively turn off excess precision, see 81993714.

It is also possible to effectively turn on excess precision with -mfpmath=387, see 81993724.

Tags: c/c++, floating-point, weird, strange, obscure, unusual behaviour, excess precision
