A_Little_BadBoy's blog

By A_Little_BadBoy, history, 15 hours ago, In English

Recently I've been struggling with 2057E1 - Another Exercise on Graphs (Easy Version). The former solution (307244066) runs in 2400ms, yet the latter one (307245861) finishes the task in only 800ms. I'm really confused by this since the implementation of them are just the same. Could anyone help? Thanks!

  • Vote: I like it
  • +4
  • Vote: I do not like it

»
12 hours ago, # |
  Vote: I like it +20 Vote: I do not like it

Both the implementation have the same time complexity.
However, they have drastically different constant factors.
Note: The assembly is generated on godbolt using x86-64 gcc(trunk) with "-std=c++23 -O2".

This is the loop to blame.

Code 1 loop (slower)
Code 2 loop (faster)

The faster one performs simpler calculations inside the nested loops.
Here is the assembly of the innermost loop for both versions.

Assembly (slower) ~= 48 cpu clock cycles
Assembly (faster) ~= 20 cpu clock cycles

Note that the inner loop is run $$$m \times n^2$$$ or $$$6.4 \times 10^7$$$ times!
So, the slower loop takes about $$$10^9$$$ more clock cycles.

  • »
    »
    12 hours ago, # ^ |
      Vote: I like it +9 Vote: I do not like it

    Wow thank u for your impressive suggestion! It helps me a lot ^.^