Hello, codeforces!
Long time no see, right? So maybe it's a good idea to try to return to my blogs slowly. This time the blog will be about a trick, which usually isn't necessary to solve a task, but can be useful to make implementation much more comfortable.
Let's look at this problem. It is about some DP on a tree in which we have to use convex hull trick to improve the complexity. The task requires merging two convex hulls with "smaller to bigger" trick. I recommend you to read the statement before reading the rest of the blog (and the editorial if you don't know how to solve it). $$$ $$$
So, how to merge two convex hulls? If we keep them in something like the set from C++, then we can just take every element from the smaller one and try to add it to the bigger one. Unfortunately, adding one element in this way requires some ifology and care. It's hard to implement it, especially on individual contests where you don't have a reference document and cannot use the internet (like IOI-style contests).
We know that adding one element is hard, right? So, what is an easier way to get the new convex hull? Definitely, the easiest way is to collect all the points, sort them, and calculate the convex hull with the standard algorithm.
But we still have to care about the complexity. So, here comes mnbvmar with his great approach. I'm very grateful for sharing it with me. Instead of one convex hull in each node, let's split points from that node into $$$\log(n)$$$ groups and maintain the convex hull of each of them! How to split them? I'll tell about it in a moment. Why will it work? In this task, each node corresponds to a single point on the plane (or a linear function if you think about lines and not about points, but I'm going to stay with points). We need to answer multiple queries where we're given a vector and a subtree, and we need to minimize the dot product of the vector and some point corresponding to one of the nodes in this subtree. Normally, it would be done using a binary/ternary search on a convex hull, so we can simply do it on each of the hulls separately and return the minimum of the results.
For example, if we have to keep a convex hull of, let's say, $$$11$$$ points, let's split them in any way into groups of $$$1$$$, $$$2$$$ and $$$8$$$ points (yep, it's the binary representation of $$$11$$$). Now, let's say that we want to merge convex hulls, where one of them keeps $$$11$$$ ($$$1+2+8$$$) points and the second one keeps $$$36$$$ ($$$4+32$$$) points. The sizes of all groups are different, so we can straightforwardly make a new structure which consists of $$$5$$$ convex hulls of sizes $$$1$$$, $$$2$$$, $$$4$$$, $$$8$$$ and $$$32$$$.
But, what if we want to merge convex hulls of $$$35$$$ ($$$1+2+32$$$) and $$$38$$$ ($$$2+4+32$$$) points? If we allowed having two groups of equal sizes, we could quickly end with so many groups of this same size. As we still care about the complexity and want to have very few convex hulls in each node (node of the tree, recalling to the problem statement), we should change the sizes in some way. We have two groups of size $$$2$$$, so maybe let's try to combine them into one group of size $$$4$$$? By combining, I mean using the standard convex hull algorithm on points from these two hulls to make one hull. Now we have groups of sizes $$$1$$$, $$$4$$$, $$$4$$$, $$$32$$$, $$$32$$$. Let's now combine groups of size $$$4$$$. Poof, $$$1$$$, $$$8$$$, $$$32$$$, $$$32$$$. Now it's time to merge the groups of size $$$32$$$. The sizes are $$$1$$$, $$$8$$$, and $$$64$$$ now, which already looks nice. We had to do some recalculations, but maybe the situation isn't so bad. Of course, the complexity of calculating new convex hull is linear (or has an additional logarithm if you want to sort the points). So each time when we touch a point (we recalculate a convex hull containing it) it adds $$$O(1)$$$ (or $$$O(\log(n)$$$) if you're sorting points) to complexity. How many times can we touch a point? Whenever we do it, the size of its convex hull doubles, so we can do it only $$$O(\log(n))$$$ times!
You might be confused with these sizes because some points become useless as soon as they stop belonging to a convex hull. So yes, it's true that these sizes might not be exact powers of two, but they could be less. It isn't important. We can remember their original sizes and assume that these useless points are still there.
The complexity of recalculations is $$$O(n \cdot \log(n))$$$ (or $$$O(n \cdot \log^2(n))$$$) in total. On the other side, we have to do a binary/ternary search $$$O(\log(n))$$$ times in each node, instead of doing only one, so the final complexity will be $$$O(n \cdot \log^2(n))$$$ in overall, which makes this solution as fast as the model solution.
Nice, isn't it? Vectors instead of sets (using C++ slang), easier code, fewer places for off-by-one errors. "But Radewoosh, this problem is still solvable in a usual way, the trick is almost useless". Yep, but the convex hulls are only an example. Let's imagine the following task (which was on some Open Cup a long time ago if I'm not mistaken, but I'm unable to find it). We have to keep a multiset of strings, which is initially empty and perform a sequence of operations. In each of them, a new string comes, let's call it $$$s$$$. We have to calculate the total number of occurrences of the strings from the set as substrings of $$$s$$$. If some string from the set has multiple occurrences (it matches with substrings of $$$s$$$ in many positions), we have to count each of them separately. We should then print the calculated number and add $$$s$$$ to the set. We should aim for $$$O(m \cdot \log(m))$$$, where $$$m$$$ is the size of the input. A standard problem with strings.
Now the hard part: we should process queries online.
Without the online assumption, we could build an Aho-Corasick on all the words from the input and use some segment/Fenwick tree to make the new strings "active". The task is kind of well-known, so I won't describe how to exactly solve it. But we need to process queries online, and it's hard to update the Aho-Corasick tree and add new strings to it. The only easy thing is calculating a new one, with a new set of strings. Any ideas? Yes! Let's keep $$$\log(number\ of\ strings)$$$ tries instead of one and recalculate them whenever you need to do it. Also, we don't need any segment trees. Everything that we want is a sum of some values in the subtrees, so we can use a simple DP to calculate them.
That's all. I hope that you found this blog useful and that the trick will save some of you somewhen. I also hope that the blogs will again appear more often. If you have any questions or you find more applications for this trick, write about it in the comments. See you next time! :D
The trick is really cool.
But it was not in Open Cup (but maybe it was), it was in one of the first educational rounds, in the time when they were educational.
710F - String Set Queries
Also during this year, this trick was even presented on one of the courses about Data science in my university. So it has some applications outside of icpc contests.
It was on OpenCup or some other ream competition on ejudge, I definitely remember seeing it there, but anyway thanks for providing link to submitting this problem on CF!
I indeed saw it on some Asian contest before proposing it to educational round. Can't remember which one exactly unfortunately. And the earliest appearance of this problem on codeforces I can find is here
When you realise that errichto has just surpassed you in contributions :p
tied*
He messaged me to write Blogewoosh, so I just did it. :P
Also, I’m not sure if we’re really competing. I really admire things which Kamil does. He does a lot for the community, but we’re definitely aiming for different competitors, my blogs contain more advanced topics, so it’s hard to compare things that we write.
And it seem like Mike supports him
Part 4 here. Actually, this idea is very old, I think I heard it even before that.
Decomposable searching problems I. Static-to-dynamic transformation. I learned it here.
Actually, this trick is used in Binomial Heaps. So, it is at least as old as 1978 (when binomial heaps were invented).
For me it is one of the tricks I used outside of competitive the most. To give an example, I found many APIs that implement a certain data structure which supports insert of one element in
O(I)
and chooses to expose the merge operation only inO((N1 + N2) * I)
, as opposed toO(min(N1, N2) * I)
(and usually there as some technicalities as to why you should not domin(N1, N2)
inserts yourself).On top of my head, I remember having to merge
K
pandas dataframes in a Data Science project (and only found out about concat later). The other time I wanted to build a bamboo tree ofN / 2
and have it so that a leaf is added to each one of these nodes, using jngen's tree link, which is alsoO(N1 + N2)
(and actually I still haven't figured out a way to do this easier).Great lecture from Burunduk1 (Russian, sorry):
https://youtu.be/f6L8ZucH-0M?t=294
You can implement it in a cute way using bit tricks (similar to binary indexed tree).
The key idea is to keep all elements in one vector, and if the number of elements has binary representation $$$2^a + 2^b + 2^c + \dots$$$ with $$$a > b > c > \dots$$$, the first $$$2^a$$$ elements of the vector are processed into one static collection, the next $$$2^b$$$ go in another, and so on.
When inserting elements or when combining two sets of elements we can know with basic bit manipulations which collections should be rebuilt and which ranges of the element vector they should contain.
Really nice trick, thanks for sharing. However, I have to point out, you don't need it in this problem. You don't even need convex hull trick.
Yep, you read correctly. No convex hull trick whatsoever. A simple dfs + dp is enough, thanks to a brilliant cache-handling trick discovered by Golovanov399. Since you can use dfs traversal to arrange the nodes such that every subtree is a contiguous subsegment, a bruteforce $$$O(n^2)$$$ passes due to cache miss minimization.
You can find a deeper discussion on this trick in this blog. It talks about a slightly different problem which uses HLD, but the idea is the same.
AC'ed code: 56062448
I'm not against $$$O(n\cdot\log^2(n))$$$ problems. But I believe setters/testers should be careful with them. Normally, they should pass in $$$1$$$ or $$$1.5$$$ seconds, since in a $$$10^5$$$ limit problem, they should do around $$$2 \cdot 10^8$$$ operations. An example on the top of my head is a binary search over a Fenwick tree. However, it is controversial when the expected solution is $$$O(n\cdot\log^2(n))$$$ and it involves heavy constants. In this specific problem, the convex hulls are stored in a set/multiset, and we are doing $$$n\cdot\log(n)$$$ inserting operations, which are not lightweight at all. The solution's runtime grows drastically, since we start approaching $$$10^9$$$ — $$$10^{10}$$$ operations. In consequence, this processor-friendly $$$O(n^2)$$$ is just as as fast.