Segment Tree: from theory to practice

#	User	Rating
1	jiangly	3845
2	tourist	3798
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3589
6	Ormlis	3532
7	Benq	3468
8	Radewoosh	3463
9	ecnerwala	3451
10	Um_nik	3450

#	User	Contrib.
1	cry	165
2	Qingyu	159
2	-is-this-fft-	159
4	atcoder_official	157
5	Dominater069	156
6	adamant	154
7	djm03178	151
8	luogu_official	149
9	awoo	148
10	TheScrasse	145

Theoretical basis

If you have been solving Div1-Div2 problems on Codeforces for a long time, you have probably heard about segment tree. It may sound scary and confusing to some people, but in fact it is a very useful data structure. To put it simply, a segment tree — is a binary tree whose nodes store aggregated data about some segment of an array. Due to this tree structure we can quickly respond to queries about arbitrary array segments, and not only respond, but also update elements.

Imagine that we have an array and we need to repeatedly learn something about the array's sub segments, for example, the sum of elements, minimum, maximum, number of some values, etc. A segment tree allows us to do such queries very quickly — usually in O(log n) per query instead of a trivial O(n) enumeration. At the same time, it is quite flexible, supports changing individual elements, and with additional tricks (lazy propagations) can even mass update entire ranges in the same logarithmic time. In general, a segment tree can optimize a program algorithm quite well when conventional arrays and sorts are no longer sufficient.

Example of a segment tree constructed for the array arr = {1, 3, 5, 7, 9, 11}. Each node stores the sum on the corresponding segment: the root contains 36 (the sum on the whole segment [0;5]), its children are the sums on [0;2] and [3;5] (9 and 27), and so on.

How does it work?

We recursively split the array in half until we get to the individual elements. These elements are — leaves of the tree. Then we start “gluing” the information from bottom to top: each internal node is calculated from the two children. For example, if we need sums, then the value of the node = sum of the left son + sum of the right son. If we need minima — we take the minimum of the two children, etc. As a result, the root stores the aggregate (sum/min/max, etc.) over the entire array, and the path from the root to the leaf corresponds to dividing the segment into sub segments.

When is it necessary?

A segment tree can be used when: 1. You need to efficiently answer queries about subsets of array elements (sum on a segment, maximum on a segment, GCD on a segment, etc.) and individual elements or entire ranges may change between queries, i.e. the data is dynamic. In this case, simple pre-calculation of all answers will not help you 2. There are many queries and updates (tens and hundreds of thousands) and doing them in linear time is too slow

Segment tree provides (log n) per query or update, which is usually within the limits for n and number of queries of about 10^5. At the same time, segment tree memory of ~4*n elements for an array of size n — is also quite ok.

Examples of problems solved by a segment tree

Range sums, finding the minimum/maximum on a segment, the number of certain elements on a segment, checking the ordering of a segment, even finding the k-th element. For example, with the help of a segment tree you can easily find out how many zeros there are in a given range or find the index of the k-th zero in an array. By the way, this is exactly the task we are about to consider in practice.

Practical part

Let's take the problem 2075F - Beautiful Sequence Returns from the recent contest Educational Codeforces Round 176 (Rated for Div. 2) as an example, solving it using a segment tree and drawing an analogy in comparison with a head-on solution.

Let's consider the solution to this problem from wishgoodluck — 311252841. The idea of this algorithm is based on an “event-driven approach”. We precompute two arrays (let's call them l and r) that capture important edges in the original sequence. These arrays help to determine for which segments there is a “change” (an increase or decrease in the contribution to the final answer). Then an event vector is formed — each element of this vector specifies that some value should be added or removed at a certain segment. Finally, the final length of the desired sequence is obtained as the maximum total change.

The key challenge is to quickly and efficiently handle the multitude of events that affect the index ranges. This is exactly what a segment tree is used for.

In this solution, the segment tree is used for two operations:

Range Update. Each event from vector v specifies that either +1 or -1 should be added on the segment [l,r]. A segment tree with lazy updates allows for O(log n) per event to update the entire segment at once, without going through each element separately.
Range Query. After each update we need to quickly find out what is the maximum value on the whole range. The root of the tree (Tree[1].v) always stores the maximum among all elements. This allows us to know instantly (in O(1) after O(log n) updates) the current optimal value, which is then used to compute the answer.

Thus, the segment tree here acts as an “aggregator” of all events, accumulating their influence on the given segments and allowing us to quickly find the maximum.

Imagine a problem where we need to update all elements of a segment at each step for each event. If the range has length m, then the naive update takes O(m) operations. Given m∼n and O(n) events, we get O(n^2) operations. This solution will quickly become time unacceptable even for medium-sized input data.

In contrast, the segmental tree allows whole ranges to be updated in O(log n) due to lazy propagation. Thus, even if there are many events, the overall complexity becomes O(n*log n).

Pitfalls and tips

Segment tree — thing is powerful, but you can make some typical mistakes when implementing it:

Incorrect merging of results. Forget that you need to summarize children, or confuse left/right and the whole logic will be destroyed. Carefully define what the node stores and how it is derived from the children's information.
Segment Boundaries. It's easy to make mistakes with the indices left, right, and middle. Assume that the segment [left..right] is inclusive. Then middle = (left+right)/2 — left part [left..middle], right part [middle+1..right]. If you forget +1 where you need it, you will get overlapping or missing indexes. Tip: debug on a small array, for example, n=4 and write out the segments of each node on a piece of paper, check that they cover the array without gaps and overlaps.
Recursion vs iteration. We considered a recursive solution for ease of understanding, but when n ~ 10^5 the depth of recursion is ~17 is ok, but if n ~ 10^7, the depth is ~24, it's still ok. When n is very large, and especially when the number of operations is large, iterative implementations are sometimes preferred to avoid overhead and risk of stack overflow
Lazy updates. In our example we just used them, updating a whole range at once. Note that implementing lazy tags is more complicated — it's easy to mess with propagating changes to children. If the task requires a segmenter with lazy, test on simple examples hard. Typical mistakes: forgot to run a lazy tag before further descent or before counting in a node — you will get incorrect data. Tip: write a separate push(v) function to propagate the tag from node v to children, and call it where needed, e.g. before moving to subtrees.
Choosing between Fenwick (BIT) and segment tree. They are often interchangeable for simple problems on sums, ordinal statistics. Fenwick is easier to code and requires memory, but the segment tree is more versatile, supporting complex merge operations. If the problem allows Fenwick and you don't want to spend time implementing the tree — go ahead with Fenwick, but segment tree — must know for Div1-Div2 level.

Finally, I'll give you a little hint — if the problem simply requires multiple sum queries on a segment without updates, you don't need a segment tree, it will be enough to calculate prefix sums for O(n) in advance and then answer each query for O(1).

Thanks to those who made it to the end, have a great programming :)

Suggest in the comments what else I should write about in my next post and what topics are still worth exploring.

Comments (7)

Write comment?

Sunb1m

6 hours ago, # |

Auto comment: topic has been translated by Sunb1m (original revision, translated revision, compare)

→ Reply

ryu-chan

Thanks for it being very helpful for me

nr0728

← Rev. 2 →

Thanks for your blog, it's really helpful for who wants to learn Segment Tree.

However, I believe there's a typo in the title, it's Segment Tree instead of Segmented tree.

6 hours ago, # ^ |

agreed

Thank you! Sorry for my english, I don't have enough knowledge of it to write big texts fluently, so I have to use deepl.com translator. Even now when writing this reply I use it :)

No worries at all! Your message is perfectly understandable, and using a translator is a great way to communicate. Your English is already very good, and I appreciate your effort!

Auto comment: topic has been updated by Sunb1m (previous revision, new revision, compare).