Why is the C++ vector so damn efficient?

Hi. I want to preface this by saying I am a newbie, and I totally understand if I am missing something obvious here, but the vector in C++ seems way too efficient. What do I mean by that? Even solutions that should be O(N^2) tend to run within 1 second.

Here is an example submission where this happens:

https://codeforces.net/contest/1883/submission/241050682

Consider this snippet from it:

for (int i = 0; i < n; i++)
{
    auto it = lower_bound(a.begin(), a.end(), b[i]);
    it--;
    if (it >= a.begin())
    {
        int val = *it;
        if (val < b[i]) cnt++;
        a.erase(it);
    }
}

This snippet from the submission above finds a lower bound in vector a and then erases an element. In the worst case, erase is O(N), so the loop overall should be O(N^2).

See this GeeksforGeeks article, https://www.geeksforgeeks.org/vector-erase-and-clear-in-cpp/, which clearly says that erase is O(N) in the worst case.

Can someone help me understand how the heck it's so dang efficient?

Comments (2)


The-Winner

12 months ago

Indeed, it is $$$O(N^2)$$$ in the worst case, but even then the complexity is not everything. The compiler might be able to optimize a lot (maybe by using SIMD or somehow parallelizing some of the operations), and the cache is very well behaved for each removal, so that might cut down the time as well. There could be more factors that influence this. I suggest you try submitting again with a different compiler (or even the same one) and see if the time differs. Remember: Time complexity $$$\ne$$$ Time


nor


+37

Firstly, note that decrementing the begin iterator is undefined behaviour, and it may or may not lead to intended behaviour, so you might want to fix that in your code.
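One way to avoid decrementing the begin iterator is to check for it before stepping back. The following is only a minimal sketch, not the submission's actual code: the function name count_matches is hypothetical, and it assumes a is already sorted in non-decreasing order, as in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper wrapping the loop from the question.
// Assumption: `a` is sorted in non-decreasing order.
int count_matches(std::vector<int> a, const std::vector<int>& b) {
    assert(std::is_sorted(a.begin(), a.end()));
    int cnt = 0;
    for (int x : b) {
        // First element of a that is >= x.
        auto it = std::lower_bound(a.begin(), a.end(), x);
        // No element smaller than x: skip instead of decrementing begin().
        if (it == a.begin()) continue;
        --it;         // largest element strictly smaller than x
        ++cnt;
        a.erase(it);  // still O(n) in the worst case
    }
    return cnt;
}
```

Because the element before the lower bound is always strictly smaller than x, the val < b[i] check from the original loop is not needed here.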

When you look at the assembly for vector.erase, you can find a memmove call (the rest of the work is $$$O(1)$$$). memmove is heavily optimized for moving large amounts of data (well, for any amount of data, but cache helps here too). This is a case where the time complexity is $$$O(n)$$$ but the constant is very small, so things just pass. This benchmark shows that in the "average" case, when you empty a vector from the middle (moving around half the elements each iteration), you take about $$$2 \times 10^9$$$ times a no-op. If a no-op in the benchmark above takes 1 cycle (a big if, but a reasonable one), then on a 2GHz processor this would pass in 1s.
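To make the memmove point concrete, here is a rough sketch of what erasing one element from a vector of int amounts to. The name erase_at is hypothetical, and real standard library implementations differ in detail; this only illustrates the single bulk shift of the tail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for erasing the element at index pos
// from a vector<int>. Not the actual library implementation.
void erase_at(std::vector<int>& v, std::size_t pos) {
    assert(pos < v.size());
    std::size_t tail = v.size() - pos - 1;  // elements after pos
    // Shift the whole tail left by one slot in a single bulk copy.
    std::memmove(v.data() + pos, v.data() + pos + 1, tail * sizeof(int));
    v.pop_back();  // drop the now-duplicated last slot
}
```

The work is proportional to the number of elements after pos, but it is one contiguous copy, which is the kind of operation memmove is tuned for.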

Coming to your implementation: you sort $$$b$$$ in non-increasing order and $$$a$$$ in non-decreasing order. On average, it is likely that you are removing elements to the right before elements to the left (in the original array). This is intuitively even better than the average case we talked about above, because memmove has to do a smaller fraction of the work (in the best case, it just needs to do $$$O(n)$$$ work in total). You can probably show that it will not take more than $$$n^2/4 + O(n)$$$ move operations, which is around $$$2.5 \times 10^9$$$ for $$$n = 10^5$$$, and the number of operations (mathematically counted) also coincidentally matches the average case that the benchmark covers.
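As a sanity check on the $$$n^2/4$$$ figure, the number of element moves for the "empty from the middle" pattern can be counted without touching a real vector. This is only a sketch under that assumption (always erasing index size/2); the function name is hypothetical, and it does not prove the bound for the submission itself.

```cpp
#include <cassert>
#include <cstdint>

// Counts how many elements get shifted in total when a vector of
// n elements is emptied by always erasing the middle element.
// Hypothetical helper; only the sizes are simulated.
std::int64_t moves_when_erasing_middle(std::int64_t n) {
    assert(n >= 0);
    std::int64_t moves = 0;
    for (std::int64_t size = n; size > 0; --size) {
        std::int64_t pos = size / 2;  // index being erased
        moves += size - pos - 1;      // elements after pos shift left by one
    }
    return moves;
}
```

For $$$n = 10^5$$$ this gives 2499950000 moves, that is, $$$n^2/4$$$ minus a term linear in $$$n$$$.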

saskee1999's blog