Блог пользователя A_Little_BadBoy

Автор A_Little_BadBoy, история, 15 часов назад, По-английски

Recently I've been struggling with 2057E1 - Another Exercise on Graphs (Easy Version). The former solution (307244066) runs in 2400ms, yet the latter one (307245861) finishes the task in only 800ms. I'm really confused by this since the implementation of them are just the same. Could anyone help? Thanks!

  • Проголосовать: нравится
  • +4
  • Проголосовать: не нравится

»
12 часов назад, # |
  Проголосовать: нравится +20 Проголосовать: не нравится

Both the implementation have the same time complexity.
However, they have drastically different constant factors.
Note: The assembly is generated on godbolt using x86-64 gcc(trunk) with "-std=c++23 -O2".

This is the loop to blame.

Code 1 loop (slower)
Code 2 loop (faster)

The faster one performs simpler calculations inside the nested loops.
Here is the assembly of the innermost loop for both versions.

Assembly (slower) ~= 48 cpu clock cycles
Assembly (faster) ~= 20 cpu clock cycles

Note that the inner loop is run $$$m \times n^2$$$ or $$$6.4 \times 10^7$$$ times!
So, the slower loop takes about $$$10^9$$$ more clock cycles.

  • »
    »
    12 часов назад, # ^ |
      Проголосовать: нравится +9 Проголосовать: не нравится

    Wow thank u for your impressive suggestion! It helps me a lot ^.^