Monogon's blog

By Monogon, 14 hours ago, In English

Recently I've been wondering, why do we always use a branching factor of $$$2$$$ in segment trees, and are there situations where it would actually be better to use a different branching factor?

First, by branching factor I mean the number of children of each node. As we all know, in a segment tree each node represents an interval $$$[l, r]$$$ and splits into two children, representing the intervals $$$[l, m]$$$ and $$$[m+1, r]$$$, respectively, where $$$m=\lfloor(l+r)/2\rfloor$$$.

But what if we instead split each interval into $$$k$$$ intervals, so that each node of the segment tree has $$$k$$$ children? In an extreme example, you can set $$$k=\sqrt n$$$ and this is essentially the Sqrt tree.

In this blog I'd like to explore the analysis of choosing an optimal branching factor, effectively questioning whether we've been doing segment trees wrong all this time.

First Analysis

For simplicity, let's just say that we must support range queries and point updates (no range updates). If our branching factor is $$$k$$$, then the height of the tree is $$$\log_k(n) = \frac{\log n}{\log k}$$$. For notation, when I write $$$\log$$$ without a base, I mean the natural logarithm (it'll be useful for taking derivatives).

Point Updates

Consider a point update. We need to process $$$\frac{\log n}{\log k}$$$ nodes on the path from the root to the leaf, and to update each node we must combine the values of its $$$k$$$ children. Therefore, the cost of a point update is $$$O(\frac{k \log n}{\log k}).$$$
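To make this cost concrete, here is a minimal illustrative sketch (my own code, not from the blog) of a $$$k$$$-ary sum segment tree stored implicitly in an array, where node $$$i$$$ has children $$$ki+1, \dots, ki+k$$$. The point update recombines all $$$k$$$ children of every ancestor, which is exactly the $$$O(k)$$$ work per level:

```python
def build(data, k):
    # pad the number of leaves up to a power of k; a complete k-ary tree
    # with `size` leaves has (size - 1) // (k - 1) internal nodes
    size = 1
    while size < len(data):
        size *= k
    internal = (size - 1) // (k - 1)
    tree = [0] * (internal + size)
    tree[internal:internal + len(data)] = data
    for i in range(internal - 1, -1, -1):
        tree[i] = sum(tree[k * i + 1 : k * i + 1 + k])
    return tree, internal

def point_update(tree, internal, k, pos, new_val):
    i = internal + pos
    tree[i] = new_val
    while i > 0:
        i = (i - 1) // k                                 # parent of node i
        tree[i] = sum(tree[k * i + 1 : k * i + 1 + k])   # combine all k children
    # log_k(n) levels, O(k) work on each: O(k log n / log k) total
```

For $$$k=2$$$ this degenerates into the usual binary segment tree layout, just with a different index formula than the common 1-based one.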

Range Queries

Now, consider a range query. We can prove that a range can be split into at most $$$O(\frac{k \log n}{\log k})$$$ subranges, each represented by one of the segment tree nodes. In fact, in the optimal split there cannot be $$$2k-1$$$ consecutive nodes in the same layer: such a run would contain all $$$k$$$ children of some parent, and we could replace them with that parent node. So each layer contributes $$$O(k)$$$ nodes, across $$$\frac{\log n}{\log k}$$$ layers.
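We can sanity-check this bound empirically. The sketch below (illustrative code with my own near-equal splitting convention, not from the blog) counts how many canonical nodes the standard recursive decomposition produces for a query:

```python
def count_pieces(l, r, ql, qr, k):
    """Count the k-ary segment tree nodes whose range lies fully inside
    [ql, qr] in the recursive decomposition of the query over node [l, r]."""
    if ql <= l and r <= qr:
        return 1                      # this node is one whole piece
    if l == r:
        return 0                      # (unreachable for overlapping calls)
    length = r - l + 1
    pieces, start = 0, l
    for i in range(k):                # split [l, r] into k near-equal parts
        end = l + (length * (i + 1)) // k - 1
        if start <= end and not (end < ql or qr < start):
            pieces += count_pieces(start, end, ql, qr, k)
        start = end + 1
    return pieces
```

For example, with $$$n = 27$$$ and $$$k = 3$$$, every possible query stays within the per-layer budget of $$$2(k-1)$$$ covered nodes times the number of layers.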

Optimizing

Since point updates and range queries have the same complexity, there is no need to consider how many operations there are of each type, and we can simply find the value of $$$k$$$ that minimizes $$$\frac{k \log n}{\log k}$$$. We can drop the $$$\log n$$$ factor, which does not depend on $$$k$$$, and optimize only $$$f(k) = \frac{k}{\log k}$$$. With some calculus, we can take the derivative and set it to zero:

$$$f'(k)=\frac{1}{\log k}-\frac{1}{\log^2 k}=0$$$

Solving this yields

$$$\log k = 1,\quad k=e$$$

Nice, so the optimal branching factor is $$$e$$$. Since we can't split into a fractional number of children, let's round that down to the nearest integer... oh wait, that's just $$$2$$$, which is what we're already using. I guess you could round up to $$$3$$$ instead, but the constant factor is probably not worth it, given that computers really like powers of $$$2$$$.

A Stubborn Second Analysis

I'm pretty disappointed to find that the first analysis normalizes the status quo, so I'm going to be stubborn and make an additional assumption to spice up the analysis.

Remember when I said that in the point update operation, we need to update each node's value by combining its $$$k$$$ children? Well, what if, like, we just didn't do that? Let me give an example of what I mean. For many operations, it's easy to apply a point update to each of the leaf's ancestors directly, without recombining all of their children. For example, if the operation is addition and we change a leaf's value, then every ancestor needs to be adjusted by the same amount. Or if the operation is $$$\min$$$ and the point update decreases the value of a leaf, then each ancestor can simply be decreased if needed.
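Here is a sketch of such a "combine-free" update for the addition example, using an implicit $$$k$$$-ary array layout where node $$$i$$$'s children are $$$ki+1, \dots, ki+k$$$ (illustrative code, not from the blog):

```python
def build(data, k):
    # complete k-ary tree over `size` leaves (padded with zeros);
    # the leaves occupy the last `size` slots of the array
    size = 1
    while size < len(data):
        size *= k
    internal = (size - 1) // (k - 1)
    tree = [0] * (internal + size)
    tree[internal:internal + len(data)] = data
    for i in range(internal - 1, -1, -1):
        tree[i] = sum(tree[k * i + 1 : k * i + 1 + k])
    return tree, internal

def point_add(tree, internal, k, pos, delta):
    # Add `delta` to the leaf and to every ancestor directly. No children
    # are re-read, so the cost is O(log n / log k) instead of O(k log n / log k).
    i = internal + pos
    while i > 0:
        tree[i] += delta
        i = (i - 1) // k
    tree[0] += delta
```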

In this case, we would have a complexity of $$$O(\frac{\log n}{\log k})$$$ for each point update and $$$O(\frac{k \log n}{\log k})$$$ for each range query. In the case where there are many more point updates than range queries, we may want to choose $$$k$$$ to properly balance the complexities of the two operations.

So let's do just that. Say there are $$$m$$$ point updates and $$$q$$$ range queries. Our total complexity that we wish to minimize is:

$$$\frac{m \log n + qk\log n}{\log k}$$$

Let's remove the factor of $$$\log n$$$ since it doesn't depend on $$$k$$$:

$$$\frac{m+qk}{\log k}$$$

Again, to find the optimal value of $$$k$$$, let's take the derivative with respect to $$$k$$$ and set it to zero.

$$$\frac{q}{\log k}-\frac{(m+qk)}{k\log^2 k}=0$$$

After simplifying, we get:

$$$k(\log k-1)=m/q$$$

If $$$m/q$$$ is sufficiently large, then we can see that $$$(m/q)^{1/2}\le k\le (m/q)$$$ (the left-hand side of the equation is too small at $$$k=(m/q)^{1/2}$$$ and too large at $$$k=m/q$$$), and therefore $$$(\log k-1)=\Theta(\log (m/q))$$$. Then, we have

$$$k=\Theta\left(\frac{m/q}{\log (m/q)}\right).$$$
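As a sanity check, we can solve $$$k(\log k - 1) = m/q$$$ numerically and compare the result against $$$\frac{m/q}{\log(m/q)}$$$. A quick illustrative script (my own, not from the blog):

```python
import math

def optimal_k(ratio):
    # Solve k * (ln k - 1) = ratio by binary search; the left side is
    # increasing for k > e, and it already exceeds ratio at k = max(ratio, e^2).
    lo, hi = math.e, max(ratio, math.e ** 2)
    for _ in range(100):
        mid = (lo + hi) / 2
        if mid * (math.log(mid) - 1) < ratio:
            lo = mid
        else:
            hi = mid
    return lo

for ratio in (1e2, 1e4, 1e6):
    k = optimal_k(ratio)
    print(f"m/q = {ratio:.0e}: k = {k:.0f}, (m/q)/log(m/q) = {ratio / math.log(ratio):.0f}")
```

The solved $$$k$$$ stays within a small constant factor of $$$\frac{m/q}{\log(m/q)}$$$, matching the asymptotic claim.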

Finally, plugging this $$$k$$$ back in, the total complexity of all segment tree operations (the first term from updates, the second from queries) is

$$$\frac{m\log n}{\log (m/q)}+\frac{m\log n}{\log^2 (m/q)}=O\left(\frac{m\log n}{\log (m/q)}\right)$$$

Examples

Let's see some example values of $$$m$$$ and $$$q$$$, to see if this complexity actually gives us any benefit, and to get a sense of what the value of $$$k$$$ should be.

First, if $$$m=O(q)$$$, then we get a complexity of $$$O(m\log n)$$$ which is exactly what we already get with a normal segment tree. Boring!

Next, what if $$$m=O(q\log n)$$$? In this case, we get a complexity of $$$O(\frac{m \log n}{\log \log n})$$$, which is actually better than a normal segment tree! Here the optimal $$$k$$$ is $$$\Theta(\frac{\log n}{\log \log n})$$$. If $$$n$$$ were $$$10^6$$$, then $$$k$$$ would be around $$$5$$$, which I guess is a bit spicier than the $$$3$$$ from our first analysis. If we wanted to use a power of $$$2$$$ for constant factor reasons, then a branching factor of $$$4$$$ suddenly seems to be a worthy contender to the usual $$$2$$$.
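Plugging in the numbers (natural logs, as used throughout this blog):

```python
import math

n = 10**6
logn = math.log(n)             # ln(10^6) ~= 13.8
k = logn / math.log(logn)      # optimal branching factor when m = Theta(q log n)
print(f"log n = {logn:.2f}, optimal k = {k:.2f}")
```

This lands at roughly $$$k \approx 5$$$, consistent with the estimate above (the $$$\Theta$$$ hides a constant, so take the exact value with a grain of salt).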

Conclusion

In conclusion, this might have just been a lot of useless math with a disappointing result. We found that the branching factor should just be a small single digit number like we've already been doing all this time. For the very niche case when there are way more point updates than range queries, it might be worth considering a larger branching factor like $$$4$$$, but it probably still boils down to the practical implementation details and constant factors.

Also, I doubt I'm the first person to do this kind of analysis, so there may be more interesting theoretical results I'm not aware of that would apply to this problem.

Thanks to chromate00 for proof-reading!


»
11 hours ago, # |

We can actually extend this idea to not just having multiple branches, but storing multiple values at each node (basically just a B-tree). The data structure becomes faster that way not just because of the smaller height, but for cache reasons. Usually, a cache line contains 64 bytes, so the memory is divided into 64-byte blocks. If we want to access a single value, the entire 64-byte block is loaded into the cache, so why not reuse some of those bytes instead of wasting most of the cache line? This blog article explains that idea pretty well in my opinion. It would be interesting to see a side-by-side comparison between the branching values that this blog claims ($$$k \in \{3, 4, 5\}$$$) and the values that were used in the other article ($$$k \in \{8, 16, 32, 64\}$$$).

»
3 hours ago, # |

L rizz

»
82 minutes ago, # |

What happens if different nodes can have a different amount of children? Should we have some nodes with $$$2$$$ children and other nodes with $$$3$$$?