UPD: I've reduced the code size.
I've recently found that the following code generates wrong output.
#include <bitset>
#include <iostream>
const int N = 105;
std::bitset<N> ok[N][N];
int n = 5;
int main() {
    ok[2][2].set(2);
    for (int i = n; i; i--)
        for (int j = i; j <= n; j++) {
            ok[i][j] = ok[i][j] | ok[i + 1][j] | ok[i][j - 1];
        }
    std::cout << ok[2][5][2] << '\n'; // expected 1, but prints 0 with -O3
    return 0;
}
Compiled with -O3 -mtune=skylake -march=skylake, the code outputs 0.
However, if you simulate the code by hand, you will find that the correct answer should be 1.
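To see why the answer should be 1, here is a minimal sketch of the same recurrence that tracks only bit 2 of each bitset as a plain bool (the bit2 array is just an illustrative stand-in, not code from the post); with no std::bitset involved, it prints 1:

#include <iostream>

const int N = 105;
bool bit2[N][N];   // bit2[i][j] mirrors bit 2 of ok[i][j] in the original program
int n = 5;

int main() {
    bit2[2][2] = true;   // corresponds to ok[2][2].set(2)
    for (int i = n; i; i--)
        for (int j = i; j <= n; j++)
            bit2[i][j] = bit2[i][j] | bit2[i + 1][j] | bit2[i][j - 1];
    std::cout << bit2[2][5] << '\n';   // prints 1: bit 2 propagates along row i = 2 from j = 2 up to j = 5
    return 0;
}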
Note that the compiler seems to generate an incorrect SSE instruction.
Again, I believe this code is UB-free (every index stays in bounds: i + 1 <= 6 and j - 1 >= 0, well within N = 105) and does not rely on anything implementation-defined.
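For context on why SSE is involved at all: under the usual libstdc++ layout, a std::bitset<105> occupies two 64-bit words (an assumption about the implementation, not something the standard guarantees), so the OR in the inner loop boils down to a word-level loop like the sketch below, which a compiler may fold into a single 128-bit SSE OR. The ok_words array and or_into function are illustrative names only.

#include <cstdint>
#include <iostream>

const int N = 105;
const int WORDS = 2;                     // assumed storage for std::bitset<105>: two 64-bit words
std::uint64_t ok_words[N][N][WORDS];     // word-level stand-in for ok[i][j]
int n = 5;

// Word-level equivalent of ok[i][j] = ok[i][j] | ok[i + 1][j] | ok[i][j - 1];
// the short fixed-trip-count inner loop is a natural target for SLP/SSE vectorization.
void or_into(int i, int j) {
    for (int w = 0; w < WORDS; w++)
        ok_words[i][j][w] |= ok_words[i + 1][j][w] | ok_words[i][j - 1][w];
}

int main() {
    ok_words[2][2][0] |= std::uint64_t(1) << 2;   // mirrors ok[2][2].set(2)
    for (int i = n; i; i--)
        for (int j = i; j <= n; j++)
            or_into(i, j);
    std::cout << ((ok_words[2][5][0] >> 2) & 1) << '\n';   // expected 1, matching ok[2][5][2]
    return 0;
}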
Further reduced:
I've submitted a bug. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116768
It seems that GCC trunk has been updated and the bug is fixed. Thank you!
I bet AI couldn't do that
It works correctly with O2. I am always scared to use O3, and now I have an additional reason to keep being scared. While searching, I found something related to O3 and AVX that has probably gone unaddressed for a long time (>10 years): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001 . This may not be the same issue, though, as I tried to adjust alignas and that didn't help.
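Adjusting alignas presumably means something like the following over-alignment of the global array (a sketch of one plausible attempt, not the exact code from the comment); it reportedly did not help:

#include <bitset>
#include <iostream>

const int N = 105;
// Over-align the array to 64 bytes so that vector loads/stores cannot be misaligned.
// This is one plausible reading of "adjust alignas"; reportedly it did not help here.
alignas(64) std::bitset<N> ok[N][N];
int n = 5;

int main() {
    ok[2][2].set(2);
    for (int i = n; i; i--)
        for (int j = i; j <= n; j++)
            ok[i][j] = ok[i][j] | ok[i + 1][j] | ok[i][j - 1];
    std::cout << ok[2][5][2] << '\n';   // reportedly still wrong (0) with -O3 before the fix
    return 0;
}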