I am messing around in godbolt.org (compiler), and I tried to compile this code with -O2
unsigned int func(unsigned int x) {
return x / 7;
}
Surprisingly, this is what I get
func:
mov eax, edi # edi is argument (x). Move x to eax (register)
imul rax, rax, 613566757 # eax *= 613566757 (unsigned 64-bit multiplication, rax is 64-bit version of eax)
shr rax, 32 # eax >>= 32 (gets upper 32 bit of product)
sub edi, eax # x -= eax
shr edi # x >>= 1
add eax, edi # eax += x
shr eax, 2 # x >>= 2
ret # return x
What is imul rax, rax, 613566757
doing? Where does the constant come from?
Also, this translated code works as intended:
#include<bits/stdc++.h>
using namespace std;
typedef long long ll;
int main() {
int i = 137;
long long edi = i;
long long eax = i;
eax *= 613566757LL;
eax >>= 32;
edi -= eax;
edi >>= 1;
eax += edi;
eax >>= 2;
cout << i << " " << i / 7 << " " << eax << endl;
}
Thank you very much.
(Originally I had int instead of unsigned int, but it's more complicated)
https://stackoverflow.com/questions/41183935/why-does-gcc-use-multiplication-by-a-strange-number-in-implementing-integer-divi
Thank you for the link, should've tried harder googling haha.
A direct link to how it works: https://homepage.divms.uiowa.edu/~jones/bcd/divide.html
So, when compiled with -O2, division is as fast as multiplication and isn't as expensive?
As you can see as in the divide by 7 example, it's optimization is not just one multiplication, but one multiplication and a couple of additions and shifts. So it's a little more expansive than multiplication, although not much.
However, this optimization only works when the divisor is known at compile time, otherwise the compiler cannot figure out the factors, shifts and additions!
Also, as with all compiler optimizations, you can't expect the compiler to always perform this optimization.