physics0523's blog

By physics0523, history, 2 years ago, In English

If we want to use a 2D dynamic array, some C++ users write the following 2D vector:

vector<vector<int>> a(n);

// add y to a[x]
a[x].push_back(y);

// walkthrough a[x]
for(auto &nx : a[x]){
  cout << nx << "\n";
}

But we can do the equivalent thing with a handwritten linked list (when we know the total size in advance) by using a 1D vector<pair<int,int>>:

// b[i] = {value, index of next node}; {-1,-1} marks an unused slot
pair<int,int> empty={-1,-1};
vector<pair<int,int>> b(n+total_size,empty);
int add_point=n; // next free slot for chained nodes

// add y to b[x]
if(b[x]!=empty){
  // move the current head into a fresh slot and link it behind x
  b[add_point]=b[x];
  b[x].second=add_point;
  add_point++;
}
b[x].first=y;

// walkthrough b[x]
int current=x;
while(current!=-1 && b[current]!=empty){
  cout << b[current].first << "\n";
  current=b[current].second;
}
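
As a concrete usage sketch (not from the original post; the 0-indexed edge-list input and the variable names are my own assumptions), the same structure can serve as the adjacency list of a directed graph with n vertices and m edges:

#include <bits/stdc++.h>
using namespace std;

int main(){
  int n, m;
  cin >> n >> m; // n vertices, m directed edges

  const pair<int,int> empty_slot = {-1, -1};
  // slots [0, n) are the list heads, slots [n, n + m) hold chained nodes
  vector<pair<int,int>> b(n + m, empty_slot);
  int add_point = n;

  for(int i = 0; i < m; i++){
    int u, v;
    cin >> u >> v; // edge u -> v, vertices 0-indexed
    if(b[u] != empty_slot){
      b[add_point] = b[u];   // move the current head into a fresh slot
      b[u].second = add_point;
      add_point++;
    }
    b[u].first = v;
  }

  for(int u = 0; u < n; u++){
    cout << u << ":";
    for(int cur = u; cur != -1 && b[cur] != empty_slot; cur = b[cur].second){
      cout << " " << b[cur].first;
    }
    cout << "\n";
  }
}

As with the snippet above, the neighbours come out in the reverse of insertion order, which rarely matters for graph problems.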

So, what's the benefit of the latter implementation? The merit is that generating the latter is very fast, since it makes a single allocation up front instead of growing many small vectors.

vector size = $$$10^5$$$, Total elements = $$$10^6$$$ (this means we need vector<vector<int>>(1e5) and the sum of the sizes of the $$$10^5$$$ vectors is $$$10^6$$$)

                        generate      walkthrough
vector<vector<int>>     49.04 ms      3.12 ms
vector<pair<int,int>>   12.78 ms      19.52 ms

vector size = $$$10^6$$$, Total elements = $$$10^6$$$

                        generate      walkthrough
vector<vector<int>>     138.56 ms     10.92 ms
vector<pair<int,int>>   17.4 ms       7.8 ms

vector size = $$$10^6$$$, Total elements = $$$10^7$$$

                        generate      walkthrough
vector<vector<int>>     1406.84 ms    32.86 ms
vector<pair<int,int>>   120.48 ms     437.28 ms

(on the CF judge (C++20, 64-bit), averaged over 10 runs; experiment code)

So, when the vector size is large and only a small number of walkthroughs is needed, the vector<pair<int,int>> implementation has an advantage.
I used this implementation for 303C - Minimum Modular (because the package is old, the current TL is 1s): 185222734


»
2 years ago, # |

I've seen the latter implementation being used by a lot of Chinese competitors in graph problems (so I tend to call it Chinese adjacency list). It's used in the fastest solutions to a lot of problems on various judges (that involve graph traversals — since DFS is very bad for cache anyway, the vector approach loses its advantage).
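
For reference, here is a minimal sketch of that style as it is commonly written (the "chained forward star" pattern; the array names head_/nxt_/to_ and the MAXN/MAXM limits are illustrative assumptions, not taken from any particular submission):

#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100005, MAXM = 200005;
int head_[MAXN], nxt_[MAXM], to_[MAXM], edge_cnt = 0;

// prepend edge u -> v to u's list
void add_edge(int u, int v){
    to_[edge_cnt] = v;
    nxt_[edge_cnt] = head_[u];
    head_[u] = edge_cnt++;
}

int main(){
    memset(head_, -1, sizeof(head_)); // -1 marks "no outgoing edge yet"
    add_edge(0, 1);
    add_edge(0, 2);
    // iterate over the out-neighbours of vertex 0 (reverse insertion order)
    for(int e = head_[0]; e != -1; e = nxt_[e]){
        cout << to_[e] << "\n";
    }
}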

One thing that is much faster than vector<vector<int>> is vector<basic_string<int>>, and this is because of small string optimization (SSO). However, due to how basic_string is implemented, if you write complex stuff like pair<int, int> instead of int, it will not compile or will give UB. To fix that, you can just define a custom struct without any constructors/initial values (so stuff like template <class T, class U> struct Pair { T a; U b; }; works for "simple types" T and U that are either plain old datatypes or of this form recursively).
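
A minimal sketch of what this describes, assuming a GCC/libstdc++ toolchain (like the Codeforces C++20 compilers) where instantiating std::basic_string over a trivial non-character type still compiles; the standard does not guarantee this, and newer toolchains deprecate it:

#include <bits/stdc++.h>
using namespace std;

// plain aggregate with no constructors, usable inside basic_string
// where std::pair<int,int> would not be
template <class T, class U> struct Pair { T a; U b; };

int main(){
    int n = 5;
    vector<basic_string<int>> adj(n);             // adjacency lists of plain ints
    vector<basic_string<Pair<int,int>>> wadj(n);  // weighted edges as {to, weight}

    adj[0] += 1;                    // += appends a single element, like push_back
    adj[0] += 2;
    wadj[0] += Pair<int,int>{1, 10};

    for(int v : adj[0]) cout << v << "\n";
    for(auto [v, w] : wadj[0]) cout << v << " " << w << "\n";
}

This compiles because the operations used here (appending, indexing, iteration) only need the element type to be copyable and assignable; anything that would require comparing elements through char_traits would not compile for Pair.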

  • »
    »
    2 years ago, # ^ |

    Is it because fetching data from the adjacency list is usually (if not mostly) more cache-friendly for traversing through the graph?

  • »
    »
    2 years ago, # ^ |

    since DFS is very bad for cache anyway, the vector approach loses its advantage

    I found https://codeforces.net/blog/entry/67001 pretty insightful on how to minimize cache misses

    • »
      »
      »
      2 years ago, # ^ |

True, I have used that optimization myself once, but this is the first time I have seen it mentioned on Codeforces. Thanks for the reference! However, it is not so applicable to this problem, since it needs you to do a DFS anyway (so as to prepare for other potentially costly DFSes).

  • »
    »
    2 years ago, # ^ |

Bro, this is scary. I swapped vector<vector> for vector<basic_string> in my template and got double the speed on today's E.

    without basic string (296ms): https://codeforces.net/contest/1774/submission/185685343

    with basic string (186ms): https://codeforces.net/contest/1774/submission/185736069

    This is insane

    • »
      »
      »
      2 years ago, # ^ |

      The difference is even more prominent if you make the IO faster. 185739871 with basic_string (109ms) and 185739827 with vector (202ms). This is mostly because for graphs where most vertices have a small degree, multiple allocations are bad.

»
2 years ago, # |

12 years ago, this was my typical graph implementation.