Libraion's blog

By Libraion, history, 4 years ago, In English

Recently, a friend of mine do this problem 1137C.

His solution involves using dfs for Tarjan algorithm. As the number of vertices is pretty big, about $$$50 \times 10^5$$$, the submission without inline get RTE 123001685. Depend on the exit code, I doubt that it get stack overflow error. And after putting inline in front of the dfs function for tarjan, it gets AC 123002929.

Not only that, here are 2 submissions (my friend took from someone) that is completely the same but with RTE and AC status. RTE: 123002279, AC: 123002352. The only difference is he added 3 non-sense lines into the RTE code and it got AC.

I have some questions in mind:

  • Does Stack overflow error the main thing that caused RTE?

  • Inline is supposed to be automatically added by the compiler when possible, isn't it? (As I heard somebody says so)

  • Why putting those 3 non-sense lines work the same as adding inline?

  • Vote: I like it
  • +75
  • Vote: I do not like it

»
4 years ago, # |
  Vote: I like it +32 Vote: I do not like it

Inline is supposed to be automatically added by the compiler when possible, isn't it?

Yes, it will, but only if it compiles with -O3 flag. Codeforces uses -O2 flag and your friend doesn't change it with #pragma GCC optimize ("O3"), so compiler may not inline it.

Why putting those 3 non-sense lines work the same as adding inline?

Compiler is a fun thing, it can just simply not listen to you, and do what is better in its own mind. If I recall correctly, compiler chooses whether it should inline a function or not based on func code size, so adding additional code might make it think that it should inline a function. If that's not the case, then idk, "fuck" is a special keyword, that get your solution AC

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +23 Vote: I do not like it

    Added #pragma GCC optimization ("O3"): 123037331, but still RTE. Sorry if I missed something.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +16 Vote: I do not like it

      Turns out optimization level is not a guarantee of function being inlined, but -O3 contains -finline-functions which makes compiler consider all functions for inlining even if they are not declared inline. And that's really weird, because solution only works if you inline tarjan function ,even when you use pragma, maybe codeforces somehow ignores it, but I highly doubt it. And it gets even weirder because inlining only push function in other solution results in RTE, so I actually have no idea why these 3 lines help. Maybe you actually need to insult GCC compiler to make it work.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +22 Vote: I do not like it

        Thanks for your reply. At least we would know to try to put inline before function with many recursions when getting RTE.

»
4 years ago, # |
  Vote: I like it -25 Vote: I do not like it

Why bad word! (╥﹏╥)

»
4 years ago, # |
Rev. 2   Vote: I like it +14 Vote: I do not like it

It seems like it is just stack overflow. One quick way to check is to printf("%d\n", &u) within each void tarjanSCC(int u) to see how fast the stack is growing downwards.

I tested this in custom invocation: the AC code with the while loop uses 48 bytes per recursive call, the original RTE uses 80 bytes, and I also tried #pragma GCC optimize("Os") which uses 64 bytes. But only the 48 byte version fits since $$$50 * 10^5 * 48$$$ bytes is 240 megabytes and codeforce's stack limit is 256 mb.

I don't really understand what is using up the extra stack space though. My guess is that there is a heuristic for when stuff like u / d and u % d should be saved to the stack so they don't have to be recalculated after the recursive call returns? You can check godbolt to find out for sure.

»
4 years ago, # |
  Vote: I like it +25 Vote: I do not like it

Adding those 3 lines is not the same as inlining tarjanSCC. They seem to be different things. Adding a while-loop forces push function to not be inline. See 123078450.

I am not sure about the reasons. But these codes are interesting examples. There is a default-inlined push function and a normal tarjanSCC. To reduce stack space usage, you can inline tarjanSCC or force push to not be inline.