Do we actually need lazy propagation on segment trees?

Revision en1, by bicsi, 2019-12-30 10:20:59

There seems to be a lot of encouragement for new people to learn segment trees, and in particular the lazy propagation technique, and it seems to me that most of the time it is not actually needed.

As a quick refresher (although I feel most of you would already know), lazy propagation is a technique in segment trees that lets you do updates on whole ranges of elements, by first only updating the smallest factoring of the update range in the tree. For example, in order to update range $$$[2, 5]$$$, what you do is you update only ranges $$$[2, 2], [3, 4], [5, 5]$$$ in the segment tree, and next time node $$$[3, 4]$$$ is accessed you "propagate" the updates downward into ranges $$$[3, 3]$$$ and $$$[4, 4]$$$. This allows you to effectively aggregate (combine) multiple updates on a given range, to save extra work.

Usually when people talk about lazy propagation, some tasks like "add on range" — "minimum on range" or "add on range" — "sum of range" naturally come to mind. However, I feel like these are poor examples of applications, as most of these can be solved without lazy propagation.

OMG, how is that possible?

An alternative way one could approach these kind of problems is to keep an extra array $$$lazy$$$ with the usual segment tree. $$$lazy(node)$$$ is an aggregate of all the update operations done on $$$node$$$ until present time. In the above examples, $$$lazy(node)$$$ would be the sum of all the updates done on the node. Then the recurrence becomes $$$data(node) = lazy(node) + min(data(node * 2), data(node * 2 + 1))$$$ in the "minimum on range" case and $$$data(node) = data(node * 2) + data(node * 2 + 1) + lazy(node) * length(node)$$$ in the "sum on range" case (here $$$length(node)$$$ denotes the length of the range corresponding to $$$node$$$ in the tree.

The way to query is by having an extra parameter passed down in the traversal, which aggregates the operations inside $$$lazy$$$ while going down in the tree. The technique is similar to something called "upwards-downwards dynamic programming on trees" in Romania.

An example implementation is here.

So, you still store lazy array, but you just don't propagate. Why should we care?

Honestly, firstly I would say that it's more simple and natural. A lot of data structure problems I've encountered need some sort of "lazy" value that holds an aggregate of operations that affect the whole structure (take a simple data structure problem, where you have to model adding a constant value to all elements of a collection, and pop the min element, which can be done by using a priority queue, along with an extra value that lazily aggregates all add operations).

Second of all, I think it's easier to debug a solution that does not propagate, as I feel that I can reason about the correctness of the segment tree values more easily with this approach. In contests like ACM ICPC, it is key to debug on paper as much as possible when applicable, and with this approach one could simulate the range updates and build expectations upon the data structure at different points in time easier.

Second of all, it is way faster. I will probably make a full comparison if people are interested, but from my experience I found that lazy propagation yields a particularly slow solution (hidden constant is bigger), probably because it does a lot more mutations on the underlying array, and this approach reduces that constant very considerably.

Ok, then, why ever propagate?

Well, it seems that not all problems can be solved via this method. For example, a "set to constant value on range" and "sum on range" type problem cannot be easily solved by this. What is particular about these scenarios is that the update operations are order-dependent (in other words, they are not commutative). However, in a lot of cases the update operations are commutative, and I don't see why any of these kind of problems would use lazy propagation instead of this technique.

I'm curious to hear your opinions on this.

Tags #segment tree, #lazy propagation, #dynamic programing, #data structure

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en2 English bicsi 2019-12-30 10:25:21 38
en1 English bicsi 2019-12-30 10:20:59 4106 Initial revision (published)