Pieck's blog

By Pieck, history, 3 years ago, In English

CLOSED

Can someone please explain me why this piece of code is taking around 25 secs to run, though the complexity is O(N^2log(N)) where N<=5000.

int n = 5000;
for(int i=0; i<n; i++) {
    set<int> s;
    for(int j=n; j>=1; j--) s.insert(j);
}
  • Vote: I like it
  • +1
  • Vote: I do not like it

| Write comment?
»
3 years ago, # |
Rev. 3   Vote: I like it +10 Vote: I do not like it

N^2 log(N) is big. More than 10^8 operations.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +4 Vote: I do not like it

    It's only 3e8, in some cases I'd submit that. Seems like the constant is too bad here though.

    • »
      »
      »
      3 years ago, # ^ |
      Rev. 3   Vote: I like it +2 Vote: I do not like it

      It's slow because:

      1. set and unordered_set are inherently slow, probably because they need to pass pointers/references around and have bad cache locality
      2. It's allocating 100MB dynamically. That's a lot of memory, so obviously it takes a while to allocate and deallocate.

      defnotmee's reply below this has a better benchmark than mine. Read that and upvote it!

      Original post

      Useful things from my original benchmark:

      std Library Container Including deallocations Pure runtime
      std::set 21.76s 17.8s
      std::unordered_set 13s 9.76s
      std::priority_queue 7.5s 7.5s

      So an Expert tip: Use sorting (which actually tends to be faster than a priority_queue) or a priority_queue instead of a [unordered] [multi] set if you don't need it, because that could be a significant slowdown to your program.

      • »
        »
        »
        »
        3 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I was looking for something exactly like this. Thanks for the help.

      • »
        »
        »
        »
        3 years ago, # ^ |
          Vote: I like it +7 Vote: I do not like it

        There are some problems with your benchmark, and specially with "Moving X to outside the outer loop" if you want to test for destructions. With your original code (and not using -O2 because it sometimes cuts part of the code when compiling), that was the output:

        Total time:       13.5812 s
        No deallocations: 12.0908 s
        

        Now this happens when i take set out of the loop:

        Total time:       7.74518 s
        No deallocations: 7.74462 s
        

        This makes no sense, and the reason why that is happening is because the code is basically inserting once and checking whether the element is in the set in the other iterations, which it always is since they are all duplicates, and therefore nothing will be inserted. In fact, here is the benchmark of doing exactly that:

        Code
        Total time:       6.42716 s
        No deallocations: 6.4223 s
        

        Close enough.

        Here is i think a better test for not having to destruct any sets and actually doing what OP wanted (and also not making a bigger set because we arent testing for that):

        Code
        Total time:       13.1702 s
        No deallocations: 13.1694 s
        

        Even if there is a bit of overhead of using the array of sets, i think its pretty safe to say that destructions were a bigger deal than it.

        Here are benchmarks comparing those other data structures with the original test, being taken out of the loop and making an array of them:

        priority_queue
        unordered_set

        All of those were run with this command: g++ help.cpp -std=c++17 -Wall -Wextra -Wfloat-equal -o help; .\help

        Conclusion:

        • Destructing actually consumes some time, but not that much
        • Destructing is particularly negligible with priority queue (i.e vector)
        • Benchmarks are fun to make
        • Go priority queue!
        • »
          »
          »
          »
          »
          3 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Thank you for this. I guess I have a lot to learn with benchmarking! Would you mind if I edited my comment to say that your benchmark is better?

          • »
            »
            »
            »
            »
            »
            3 years ago, # ^ |
              Vote: I like it +3 Vote: I do not like it

            Of course i wouldnt mind. Btw have a nice day :)

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Ok my bad, I calculated the operations to be around 3e7 but it comes out to be 3e8 and since i and j are changing and we are already in 1e8 range so it pushed the time to 25 seconds. And I noticed this and thought it was something weird but if it would have happened in 1e7 range then it wouldn't even be noticed. Understood.

»
3 years ago, # |
  Vote: I like it +1 Vote: I do not like it

Complexity analysis can help you predict scaling with problem size, but won't predict absolute times.

»
3 years ago, # |
Rev. 7   Vote: I like it +1 Vote: I do not like it

You have declared your set<int> s inside the for loop doing which you are instructing to deallocate the previously stored set inside that varibale s (which takes O(N) ). To prove this try declaring one unordered_set (which takes O(1) time) once outside the loop and once inside the loop with the same N or you can try by declaring the set outside the loop directly.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    So, the set is needed to be implemented inside the loop as before starting the inner loop I need a set with all elements from 1 to n. I have omitted other details of the solution so it might have been confusing. I know my solution pertaining to that problem is not efficient but I want to know why exactly this happens. Another way could be to declare set outside but after the completion of inner loop I'll need to clear the set for my implementation, which won't improve the results as I'm still clearing the set.

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    This is compilation time not run time

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by Pieck (previous revision, new revision, compare).