Блог пользователя BERNARD

Автор BERNARD, 4 года назад, По-английски

Sometimes you might need to use multiple segment trees over some set of elements, for example, you have to answer range queries on a single array from a lot of small arrays or to make a segment tree on a tree to answer path queries.

I knew that in cases like these, it might be more efficient to use a single segment tree to represent everything, for example, instead of building a segment tree on each one of the small arrays, you represent each array as a continuous range in one big segment tree.

I've used this whenever I needed an unknown/big number of segment trees, thinking that this is more efficient and uses less memory, but sometimes it wasn't the case as doing this has made my code slower, I once ended up doing a binary search on every query to find the actual borders of the queries, which I could've avoided by making a separate segment tree on each array.

I've thought about the advantages of using this technique, but I really couldn't find any.

I've found that, in both ways, I'm using the same amount of memory ($$$4\sum n = \sum 4n$$$), and both had the same speed.

So what I want to ask is, when does this technique give me real practical advantages to not use multiple segment trees in those cases? (other than HLD)

  • Проголосовать: нравится
  • +60
  • Проголосовать: не нравится

»
4 года назад, # |
  Проголосовать: нравится +9 Проголосовать: не нравится

I don't think there are any. In fact, using multiple segment trees could be better in terms of performance. Since they are more likely to be cache-friendly.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +7 Проголосовать: не нравится

    I've actually thought the contrary, but now after a bit more thinking... yes, it is indeed more cache-friendly.

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Ok so let's say you have a big segtree of size $$$N$$$, and instead of that you could have had smaller segtrees of size $$$n$$$ each. (Here I have assumed for simplicity that all smaller segtrees are same in size but it doesn't matter)

Anyway... We have $$$n \le N$$$, so obviously, $$$log(n) \le log(N)$$$ (cost of a single update/query). Using multiple segtrees is the asymptotically better option.

Regarding cache-friendly behaviour, I am not very sure about this but it seems like smaller segtrees are more likely to be better but can't explain why.