Блог пользователя Xahandd

Автор Xahandd, история, 7 лет назад, По-английски

Recently i was trying to solve a problem ( Little Pony and Harmony Chest) and i encountered a weird thing.

At first i tried to solve it and i got TLE and then after some debugging(that came to no conclusion) i just tried to tinker with the code to see what happens and so i swapped two of my loop and MAGICALY the code got AC.

Has anyone encountered something like this before? and can anyone explain to me why?

these are the codes (you can compare them to see what is their difference):

AC

TLE

  • Проголосовать: нравится
  • 0
  • Проголосовать: не нравится

»
7 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Auto comment: topic has been updated by Xahandd (previous revision, new revision, compare).

»
7 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Auto comment: topic has been updated by Xahandd (previous revision, new revision, compare).

»
7 лет назад, # |
  Проголосовать: нравится +50 Проголосовать: не нравится

Accessing memory in the correct order can make your code more cache-friendly and thus much faster. If you access many memory cells that are physically close to each other, the whole block can be kept in the cache.

As an example, try the following code:

const int N = 4096;
int A[N][N];

int main() {
    // initialization
    for (int a=0; a<N; ++a) for (int b=0; b<N; ++b) A[a][b] = rand();

    // iteration in the correct order
    int xor1 = 0;
    for (int a=0; a<N; ++a) for (int b=0; b<N; ++b) xor1 ^= A[a][b];
    cout << xor1 << endl;

    // iteration in the wrong order
    int xor2 = 0;
    for (int b=0; b<N; ++b) for (int a=0; a<N; ++a) xor2 ^= A[a][b];
    cout << xor2 << endl;
}

If you iterate in row major order, you are just reading the array from memory "from left to right". If you iterate in column major order, you are jumping around in memory and cache cannot help you.

On my machine, the second iteration takes cca. 40 times longer than the first one.

»
7 лет назад, # |
  Проголосовать: нравится -7 Проголосовать: не нравится

You used C++14 in the TLE solution, and C++11 in AC solution. There are some weird things in runtimes in C++14 vs C++11, pointed out here.