Aarush's blog

By Aarush, history, 2 months ago, In English

101911E - Painting the Fence

When I first got MLE, I suspected it might be because of #define int long long, but even after changing that I still got MLE. I couldn't figure out what was wrong as I was sure that the space was O(n).

When I used std::set instead of std::deque it used only 100MB memory.

std::deque code
std::set code

Can anybody explain why std::deque has such high memory usage?

  • Vote: I like it
  • +23
  • Vote: I do not like it

»
2 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by ArushN27 (previous revision, new revision, compare).

»
2 months ago, # |
  Vote: I like it +5 Vote: I do not like it

deque isn't stored contiguously in memory

  • »
    »
    2 months ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    This doesn't affect memory consumption though, only cache utilization.

»
2 months ago, # |
  Vote: I like it +19 Vote: I do not like it

Because its retarded. Deque needs like 2 kilos on its own to prepare for a lot of the shenanigans it can maintain (i.e. random access in constant time). All its subderivates also have the same thing (i.e. queue and stack). So try your best not to use them if you have a lot of them instantiated in parallel.

  • »
    »
    2 months ago, # ^ |
      Vote: I like it +9 Vote: I do not like it

    As a fun fact, you can force stacks to use std::vector by initializing them as stack<int,vector<int>>, but I don't know how this affects performance.

  • »
    »
    2 months ago, # ^ |
      Vote: I like it +37 Vote: I do not like it

    Because its retarded. Deque needs like 2 kilos on its own to prepare for a lot of the shenanigans it can maintain (i.e. random access in constant time)

    Random access in constant time is not the reason for why deque is memory hungry...

    Deque internally allocates data in constant sized blocks. Depending on which implementation of the standard library you use, the block size can vary a lot.

    According to https://devblogs.microsoft.com/oldnewthing/20230810-00/?p=108587, libstdc++ (gcc) uses 512 bytes per block, libc++ (clang) uses 4096 bytes per block, and msvc uses 16 bytes. (I don't know why msvc went with just 16 bytes. It completely kills the performance of its deque.)

    Unlike a vector, deque simply allocates a new block everytime it fills up. This makes it so any pointer/reference to an element in a deque is never invalidated on push_back (this is the main feature of deque!). This feature can be very handy if you are working with pointers. Note also that if you want to store a lot of elements, a deque can be significantly more memory efficient compared to a vector since a deque doesn't have to over allocate by as much.

»
2 months ago, # |
  Vote: I like it +28 Vote: I do not like it

I'm sure a quick Google search would get you to quite a few instances of people crying about deque's memory usage, and some SE answers trying to explain why, so I won't attempt to do that.

As a generally helpful replacement for deque, std::list exists. We tend to largely ignore Linked lists in CP, but the STL implementation is fairly good at basic stuff like popping or inserting at both sides. This fortunately covers almost all uses of deque in most problems, and with less funky behaviour in my experience. It doesn't support O(1) random access though, and it doesn't look like you need it either. Your current code ACs with <100MB using lists.

Less useful: For this specific submission, you can also AC by submitting in C++17, or deleting all deques of size 1 from the map.

Also, your Node has memory leaks. You don't delete l and r. Doesn't fix the issue here though.

  • »
    »
    2 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    It's from kactl, I didn't bother.

  • »
    »
    2 months ago, # ^ |
      Vote: I like it +13 Vote: I do not like it

    Implementing a deque yourself using a circular buffer probably wouldn't be that hard either.

    Rough idea: maintain a buffer of length $$$2^k$$$ as well as a pointer to the first element and the length of the deque. Accessing the $$$i$$$-th element is then done as buf[(i + first) % 1 << k]; pop_back is length--, pop_front is first++; first %= 1 << k;. To push_back and push_front, you can normally write the data to the correct location, but if the length of the deque would exceed $$$2^k$$$, migrate to a buffer of length $$$2^{k + 1}$$$ before (like vectors do).

»
2 months ago, # |
Rev. 2   Vote: I like it +45 Vote: I do not like it

Reminds me of ...

Defining $$$10^6$$$ std::deque in 512MB ML would definitely lead to MLE.

A famous sentence in Chinese programming community.

  • »
    »
    2 months ago, # ^ |
      Vote: I like it +81 Vote: I do not like it

    can you share some other famous sentences in Chinese programming community?

    • »
      »
      »
      2 months ago, # ^ |
        Vote: I like it +2 Vote: I do not like it

      About the queue-optimized Bellman-Ford algorithm (known as "SPFA" in Chinese programming community):It's dead.

  • »
    »
    2 months ago, # ^ |
      Vote: I like it +14 Vote: I do not like it

    Please share more Chinese wisdom