Intersting facts on scanf/cin,cout and segment tree implementation on C++20(64) and C++17(32)

Правка en3, от Fyind, 2023-09-01 00:58:51

Problem: 1864F. Exotic Queries

The time limit is 4s. The input size is $$$n,q \le 10^6$$$. The standard solution uses a segment tree and the input/output size is very large. The sync of cin,cout with stdio is turned off in all implementations if cin,cout is used.

Cin,cout vs Fast IO, Scanf on C++20 (64)

  • 220904724 Array implementation with cin,cout on C++20 (64) -> TLE
  • 220952057 Array implementation with Fast IO on C++20 (64) -> 1762 ms
  • 220950571 Array implementation with Scanf on C++20 (64) -> 3181 ms
  • 220950753 Pointer implementation with Scanf on C++20 (64) -> 3868 ms
  • 220947347 Pointer implementation with cin,cout on C++20 (64) -> 1872ms
  • 220954316 Pointer implementation with Fast IO on C++20 (64) -> 2277ms

C++17 (32) vs C++20 (64)

  • 220904724 Array implementation with cin,cout on C++20 (64) -> TLE
  • 220904724 Array implementation with cin,cout on C++17 (32) -> 1621ms
  • 220947347 Pointer implementation with cin,cout on C++20 (64) -> 1872ms
  • 220953695 Pointer implementation with cin,cout on C++17 (32) -> 1934ms

Pointer implementation vs Array implementation

  • 220904724 Array implementation with cin,cout on C++20 (64) -> TLE
  • 220947347 Pointer implementation with cin,cout on C++20 (64) -> 1872ms

Some guessed conclusions

These facts makes no sense to me. Based on the facts, I guess

  • It's better to use Pointer implementation if you use cin,cout on C++20 (64)
  • If you prefer array implementation, use Fast IO or scanf on C++20 (64)
  • The 32/64 bit compiler has some influence on cin,cout performance

Do you guys have some idea why such cases happen? Or did I implement something improper so that it reduced the efficiency?

UPD: Reason of TLE

Thanks areke for pointing out the reason, and related comment and blog.

The thing is, when using cin in a 64-bit compiler, do cin in a separate loop.

Specifically, do

vector<int> vec(n+1);
for (int i = 1;i <= n; ++i) {
    cin >> vec[i];
}
for (int i = 1;i <= n; ++i) {
    pos[vec[i]].push_back(i);
}

instead of

for (int i = 1;i <= n; ++i) {
    int x; cin >> x;
    pos[x].push_back(i);
}

This modification speeds up the code from TLE to 1216ms !!! 221383761

Теги issue, time limit exceeded, cin/cout, segment tree

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en3 Английский Fyind 2023-09-01 00:58:51 688
en2 Английский Fyind 2023-08-29 13:40:09 5
en1 Английский Fyind 2023-08-29 13:38:24 2586 Initial revision (published)