A. Meaning Mean
Author: athin
Developer: ArvinCiu
Editorial: Kinon
Tutorial
B. Maximize Mex
Author: joelgun14
Developer: joelgun14, CyberSleeper
Editorial: Pyqe
Tutorial
C. Adjust The Presentation
Author: ArvinCiu
Developer: icebear30, gansixeneh
Editorial: Pyqe
Tutorial
D. Boss, Thirsty
Author: ArvinCiu
Developer: CyberSleeper
Editorial: Pyqe
Tutorial
E. Digital Village
Author: CyberSleeper
Developer: ArvinCiu
Editorial: Pyqe
Tutorial
Brute force in $$$\mathcal{O}(nm^2)$$$ can pass problem D.
Submission
Hacked :)
Damn
:)
A segment tree for C2 seems like overkill. You can keep a vector of sets of indices in b for each element of a (including m+i, or just any large constant, if you allow non-strictly increasing arrays instead). Then, for each update you just need to check whether the increasing condition still holds for the adjacent pairs and update the counter. If the counter is full, output YA.
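A minimal stdlib-only sketch of that bookkeeping (class and method names are mine; a real solution would keep sorted sets instead of calling `min()` over plain sets, but the counter logic is the same):

```python
class FirstOccurrenceTracker:
    """Tracks whether the first occurrences of values 1..n in b are increasing."""

    def __init__(self, n, m, b):
        self.n, self.m, self.b = n, m, list(b)
        # pos[v]: indices of v in b, plus a sentinel m+v so absent values
        # behave as if they appeared past the end, ordered by their value
        self.pos = [set() for _ in range(n + 1)]
        for v in range(1, n + 1):
            self.pos[v].add(m + v)
        for i, v in enumerate(b):
            self.pos[v].add(i)
        self.ok = [False] * n            # ok[v]: does the pair (v, v+1) hold?
        self.good = 0
        for v in range(1, n):
            if min(self.pos[v]) < min(self.pos[v + 1]):
                self.ok[v] = True
                self.good += 1

    def _refresh(self, v):
        if not 1 <= v <= self.n - 1:
            return
        now = min(self.pos[v]) < min(self.pos[v + 1])
        if now != self.ok[v]:
            self.good += 1 if now else -1
            self.ok[v] = now

    def update(self, i, x):              # set b[i] = x
        old = self.b[i]
        self.pos[old].discard(i)
        self.b[i] = x
        self.pos[x].add(i)
        for v in (old - 1, old, x - 1, x):   # only these pairs can change
            self._refresh(v)

    def all_increasing(self):            # True -> answer "YA"
        return self.good == self.n - 1
```

Only the pairs adjacent to the old and new values can flip, so each update touches O(1) pairs.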
I passed C2 with a priority queue
FINALLY.
Ha-ha, not for E yet:)
My solution to E3:
Call a node "special" if it is one of the $$$p$$$ given nodes. Call a node "chosen" if it is one of the $$$k$$$ nodes in the chosen subset.
Build the Kruskal Reconstruction Tree (KRT) of the graph, with $$$2n - 1$$$ nodes. The nodes of the original graph become the leaves of the KRT and keep their original labels.
For some special node $$$x$$$, let $$$y$$$ be the lowest ancestor of $$$x$$$ in the KRT with at least one chosen node in the subtree of $$$y$$$. The maximum edge weight needed to get to a chosen node from special node $$$x$$$ in the original graph, is the weight of the edge that corresponds to $$$y$$$ in the KRT.
I then transformed this problem a bit. Let $$$f(i)$$$ be the edge weight that corresponds to node $$$i$$$ in the KRT (if $$$i \leq n$$$, then $$$f(i) = 0$$$). Let $$$g(i)$$$ be the number of special leaves in the subtree of $$$i$$$ in the KRT. Let the value of a node $$$i$$$ be $$$g(i) * (f(i) - f(par[i]))$$$. Choose $$$k$$$ leaves in the KRT, to minimize the sum of values of nodes which have at least one chosen leaf in their subtree.
This works because the contribution of some special node $$$x$$$ will be $$$(f(y) - f(par[y])) + (f(par[y]) - f(par[par[y]])) + \dots = f(y)$$$, where $$$y$$$ is the lowest ancestor of $$$x$$$ with a chosen leaf in its subtree.
Construct the subset of leaves one by one. Keep choosing the leaf which will have the least contribution.
To simulate this quickly, let $$$h(i)$$$ be the highest ancestor of leaf $$$i$$$ such that leaf $$$i$$$ is the leaf that minimizes the sum of values on the path from $$$i$$$ to $$$h(i)$$$. Initially, you must take the leaf $$$i$$$ for which $$$h(i)$$$ is the root of the tree. Sort the rest of the leaves in increasing order of the sum of values on the path from $$$i$$$ to $$$h(i)$$$, and add them to the subset in that order.
Let $$$dp[i][j]$$$ be the minimum sum of values of nodes when choosing $$$j$$$ leaves in the subtree of $$$i$$$.
If $$$j = 0$$$, then $$$dp[i][j] = 0$$$. Otherwise, $$$dp[i][j] = val[i] + min(dp[left][x] + dp[right][j - x])$$$, where $$$0 \leq x \leq j$$$, and $$$left$$$ and $$$right$$$ are the children of $$$i$$$ in the KRT. To optimize this DP for E3, notice that the values of $$$dp[i]$$$ are convex. $$$dp[i][j] - dp[i][j - 1] \leq dp[i][j + 1] - dp[i][j]$$$ holds.
The transformation $$$c[j] = min(a[x] + b[j - x])$$$, when arrays $$$a$$$ and $$$b$$$ are convex, can be done by merging the slopes of arrays $$$a$$$ and $$$b$$$. (See this errorgorn blog, the (max, +) convolution part.)
You can maintain the slopes of the dp in a sorted multiset, and merge the multisets of the left and right children into $$$dp[node]$$$, and then insert $$$val[node]$$$ into $$$dp[node]$$$.
The moment I read the problem I instantly saw that it would be DP or greedy on KRT lol
Is KRT such a well-known topic? Honestly, I'd never heard of it before reading this explanation. From what I've read, it's used to compute the biggest edge on a path between two nodes of a tree in O(1), which is super cool, but until now I've never had to do it.
I learned it when I was training for Balkan OI. I don't think it's such a known topic, but it's very cool.
An alternate, much simpler solution: split over the largest edge, solve the two subproblems, and then merge.
This is an O(n^2) DP (each pair of nodes contributes exactly once).
To optimize, do the same as in the DP solution: by convexity, store the slopes in a multiset and merge small to large.
Isn't this the same as my solution? I just preprocess the part of splitting over largest edge
Well, you don't need any KRT. Also, do you have a proof for your greedy solution? Why is the optimal subset for $$$i + 1$$$ necessarily a superset of the subset for $$$i$$$?
Assume the optimal subset for a value of $$$k$$$ is not a prefix of the sorted list of leaves. Let $$$x$$$ be the first leaf in the sorted list not chosen.
If $$$h(x)$$$ has no chosen leaf in its subtree, replacing $$$x$$$ with any chosen leaf after it in the sorted order is obviously optimal.
Otherwise, let $$$y$$$ be some chosen leaf that comes after $$$x$$$ in the sorted order, such that $$$lca(x, y)$$$ is lowest possible. Let $$$z$$$ be $$$lca(x, y)$$$. Notice that $$$h(x)$$$ must be an ancestor of $$$h(y)$$$, as we said that $$$x$$$ is the first leaf not chosen. $$$z$$$ must also be a strict ancestor of $$$h(y)$$$. This means that the sum of values from $$$x$$$ to $$$z$$$ must be at least as small as the sum of values from $$$y$$$ to $$$z$$$, so choosing $$$x$$$ instead of $$$y$$$ is optimal.
The last paragraph is quite unclear to me, maybe because I am quite unfamiliar with the topic (convexity and slopes in this problem's context), but how does it work here exactly?
The comment I replied to had already linked errorgorn's blog about it.
Nevertheless, suppose you want to compute array c where c[k] = min(a[i] + b[j]) such that i + j = k, and you know both a and b are convex, then you can prove that if you consider the adjacent differences of a, and the adjacent differences of b, and merge them together, you get the necessary adjacent differences of c.
From here, you can store the DP states as multisets of adjacent differences, and then merge the multisets using small-to-large to maintain $$$O(n \log^2 n)$$$ complexity. The largest element needs to be kept track of separately, FYI. https://codeforces.net/contest/2021/submission/284859127
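The slope-merging fact described above can be checked directly: for convex arrays, the (min, +) convolution equals the prefix sums of the merged difference lists. A self-contained sketch (function names are mine):

```python
def minplus_brute(a, b):
    # c[k] = min over i + j = k of a[i] + b[j]
    n, m = len(a), len(b)
    return [min(a[i] + b[k - i]
                for i in range(max(0, k - m + 1), min(n - 1, k) + 1))
            for k in range(n + m - 1)]

def minplus_convex(a, b):
    # for convex a and b: merge their sorted difference (slope) lists,
    # then take prefix sums starting from a[0] + b[0]
    da = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    db = [b[i + 1] - b[i] for i in range(len(b) - 1)]
    slopes = sorted(da + db)     # both lists are already sorted; a merge suffices
    c = [a[0] + b[0]]
    for s in slopes:
        c.append(c[-1] + s)
    return c
```

In the tree DP, the multisets hold exactly these slopes, and small-to-large merging keeps the total cost near-linear in the number of slopes.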
wish to reach this level someday
Bro, can you please tell me what approach you used in E2, or what that technique is called?
During the contest I could only come up with the O(n^2) DP. It seems I didn't realize the KRT is a binary tree, and I didn't come up with storing slopes. BTW, the greedy is such a wonderful solution! It shows that we can also consider every node's contribution instead of pairs of leaves. It gives me huge inspiration.
After this round my rating increased to 1399, and I am very grateful to the rating system.
You will probably get +1 on the next rating rollback
why +1 ?
Well, cheaters exist. Their solutions get skipped, and after some time the ratings are recalculated (the rating rollback), so he will surely get at least +1 rating and reach specialist.
E is very nice. I learned a new trick about how to use the minimum spanning tree here.
Btw, is the editorial for E only for E3?!
For E1 you can easily do floyd-warshall to compute all the distances in O(N^3). Then you can find the solution for all Ks choosing where it's most optimal to place the next server, knowing that the K-1 servers chosen before stay the same.
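A small self-contained sketch of that E1 approach (naming and input shape are my own; I take latency to be the bottleneck distance, i.e. the minimum over paths of the maximum edge weight, and the greedy step is the one whose optimality is debated further down the thread):

```python
def digital_village_e1(n, edges, special):
    # edges: (u, v, w), 0-based endpoints; special: houses needing internet
    INF = float('inf')
    d = [[INF] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):                  # Floyd-Warshall on the (min, max) semiring
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], max(d[i][k], d[k][j]))
    cur = {s: INF for s in special}     # current best latency per special house
    chosen, ans = set(), []
    for _ in range(n):                  # greedily add one server at a time
        best_v, best_total = None, None
        for v in range(n):
            if v in chosen:
                continue
            total = sum(min(cur[s], d[s][v]) for s in special)
            if best_total is None or total < best_total:
                best_v, best_total = v, total
        chosen.add(best_v)
        for s in special:
            cur[s] = min(cur[s], d[s][best_v])
        ans.append(best_total)
    return ans                          # ans[k-1]: total latency with k servers
```

The O(n^3) Floyd-Warshall dominates; the greedy phase is O(n^2 p).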
For E2 you can do a DP, merging solutions for subgraphs (connected components) by using a DSU and processing each edge in order of weight. Merging two subgraphs A and B, it's optimal for the houses that need wifi to stay connected to the servers in their own component, so you simply sum the total latencies of both subgraphs. If one of the subgraphs has no servers (A) and the other does (B), then all the houses in A that need wifi need to reach the servers in B, passing through the edge currently being processed, which is the heaviest; so you sum the total latency of B with the weight of the current edge multiplied by the number of houses in A that need wifi.
Here are my solutions: E1 : https://codeforces.net/contest/2021/submission/284593507 E2 : https://codeforces.net/contest/2021/submission/284974748 (I learned from tourist's submission and added some comments)
How do you prove that the K-1 servers stay the same? I mean, you don't need to prove it to submit and get AC, but you need to convince yourself that it's worth putting in the effort.
I'm not sure if this is correct, but it can be argued like this. What we essentially need is that every house requiring wifi gets a connection from the K servers chosen optimally. Suppose we have the optimal answer for some K-1 servers: for every node needing wifi, its distance to those servers, stored in dist. We can use this answer to get the next one: every time we try to include a node as a server, each entry changes as min(dist[i], new_server[i]), where new_server[i] is the distance from the new server to point i, and since the minimum is decreasing, the earlier choices stay the same.
How do you prove the conclusion of E1 in theory, rather than just by your submission? This conclusion is important.
How is the K - 1 fact true? I think I'm able to generate a counterexample.
I have the same question.
I was gonna say D is chaotic until I saw this implementation (ecnerwala's). Insane
athin Kinon "It turns out that the strategy above is also the optimal strategy for the original version of the problem. So we can simulate that process to get the answer" proof please?
It was said in the editorial:
"If we do this, $$$A_n$$$ contributes $$$\frac{1}{2}$$$ times, $$$A_{n−1}$$$ contributes $$$\frac{1}{4}$$$ times, $$$A_{n−2}$$$ contributes $$$\frac{1}{8}$$$ times, and so on. This is the optimal way to get the maximum final result."
It is clear that we want the array sorted, and it is optimal because the bigger numbers always contribute more.
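That strategy (sort ascending, then always fold in from the smallest) can be checked against a brute force over all merge orders. A sketch with my own naming, assuming the operation replaces two elements with the floor of their mean:

```python
from itertools import combinations

def max_final_mean(a):
    # sort ascending and fold from the smallest, so the largest element
    # is halved once, the next largest twice, and so on
    a = sorted(a)
    cur = a[0]
    for v in a[1:]:
        cur = (cur + v) // 2
    return cur

def brute(a):
    # try every order of merging two elements into floor((x + y) / 2)
    if len(a) == 1:
        return a[0]
    best = None
    for i, j in combinations(range(len(a)), 2):
        rest = [a[k] for k in range(len(a)) if k not in (i, j)]
        cand = brute(rest + [(a[i] + a[j]) // 2])
        best = cand if best is None else max(best, cand)
    return best
```

The brute force is exponential, so it only serves as a sanity check on tiny arrays.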
Explain to me, please, the algorithm for E1. Most of the solutions use the Floyd-Warshall algorithm to calculate distances between vertices. Then for each k they greedily find the next vertex for a server, and this vertex should reduce the current latency the most. So if we have a set of server vertices at step k, the solution for k+1 contains all these vertices plus one more that best meets the criterion I've described above. While it's clear to me that we should install servers in vertices where we need servers (and not consider other vertices as candidates), it's more complicated to prove the optimality of the greedy choice at each step k. Can you explain why the greedy solution is globally optimal?
For example, for k=1 I have to put a server in, say, vertex 3. Why is it not possible for k=2 to have the set (1, 2) as the optimal solution, so that the latency of (1, 2) is less than that of (3, 1), (3, 2), or any other (3, *)?
It felt intuitive for me but only after implementing it. As you said "this vertex should reduce current latency the most". The target of what we are greedily minimizing is the sum of minimum current latencies considering servers chosen so far. So let's try the following:
For 1 server, to find the minimum total latency, we can greedily choose the house such that the sum of latencies after choosing it is minimal. This seems obviously true.
Assume that for some k, we have chosen the k servers that reduce the total latency (the sum of current latencies over the internet houses) the most, and this is optimal. Then for k+1 servers, let's choose from the remaining houses the one that reduces the total latency the most. Had we picked some other house, the total latency would have been greater or equal.
So it should be true for all k by induction. Feel free to correct if this is wrong.
The first thing to do when proving a greedy algorithm by induction is to make sure that the decision made on the first step doesn't block the path to the optimal solution.
In your reasoning you don't check this condition, and the example I gave in the previous comment shows that. You greedily install a server in vertex 3 because it reduces the current latency the most. But it's not clear why this step doesn't block the way to the solution at step k=2, where the set of vertices (1, 2) could be better, with less latency than any (3, *).
To finally clarify what I mean, consider a greedy approach to the knapsack problem. You have a knapsack with capacity 6, indivisible items (you can't split an item) with weights 2, 2, 2, 5 and costs 4, 4, 4, 5 respectively. The greedy algorithm takes the item with weight 5 because it has the highest cost, giving a non-optimal solution with cost 5, while DP finds the optimal solution by putting the first three items in the knapsack for a total cost of 12. Now look at what happens if we try to prove by induction that this greediness is optimal. The target is clear: maximize the total cost of the items in the knapsack. First, we choose the item with the highest cost (similar to how you choose where to install the first server) because it increases the cost the most (exactly as in your proof). We put it in the bag, and it turns out we can never reach the correct answer, because none of the remaining items fit in the knapsack anymore. Our first step blocks the optimal solution, and we should check for this before continuing the proof by induction.
I hope I was clear. Also, if you see any mistakes, be sure to let me know. And thanks for your answer.
I think the difference is that we don't have any cost constraint on what to choose, unlike in the knapsack case, so we can keep choosing. Another way of thinking: instead of showing the choice doesn't block the path to the optimal solution, what if we show that what we found is already optimal? That's what I went with. Also, we can definitely have different vertices being optimal, but only with equality, never with strict inequality.
But we do have a constraint: k, the maximum number of servers, and you should think of this constraint as the knapsack's capacity. So if you install the first server and keep it until the end, you take up space in your "knapsack".
And, of course, (3, 1) and (1, 2) could have equal latency and both be optimal, but it might be the case that the latency of (1, 2) is strictly less than that of (3, 1). So it should be proved that taking 3 doesn't block the way to (1, 2) in this theoretical scenario.
And the same goes for your other approach, where we consider what we found as already optimal. Taking the item of cost 5 is optimal at the current step, but it is not optimal for the whole problem. And that is all because of blocking the optimal solution.
Okay, I had another idea to try. What if we can show that the rows of the distance matrix after Floyd-Warshall form a matroid, by showing that the row vectors are linearly independent? We know the matrix is symmetric with all diagonal elements 0. I'm not entirely sure this line of investigation is good, but if the rows could be shown linearly independent, the matroid definition would take care of greedy being optimal.
I have the same question as you
Please make an editorial for each part of problem E.
Both solutions for C2 had a level of math observation way beyond me; I solved C1 almost exactly as mentioned in the editorial, yet failed to even come close to the required idea for C2. Anyway, the segtree seemed the more approachable of the two, so here's a relatively clean segtree submission for reference: https://codeforces.net/contest/2021/submission/285002624
Super slow editorial :(
I waited for days to see the official editorial for E3 and what I see now is
coming soon
. That was enough time for me to read the accepted code by myself and understand the idea of KRT. Editorial for E1 & E2:
Firstly, it is not hard to see that the graph can be reduced to its minimum spanning tree, since only the maximum value on the path matters.
Now, suppose the tree $$$T$$$ is of the form $$$A-e-B$$$, where $$$A$$$ and $$$B$$$ are two subtrees and $$$e$$$ is the edge with the largest weight in the graph. Define $$$DP_A[i]$$$ ($$$1\le i \le n$$$) as the sum of latencies in $$$A$$$ if we place $$$i$$$ servers there, and similarly for $$$DP_B$$$ and $$$DP_{T}$$$.
To calculate $$$DP_{T}[i]$$$ (the total latency given that $$$i$$$ servers can be installed in $$$T = A-e-B$$$), notice that there are exactly three scenarios:
$$$DP_A[i] + w_e \cdot size(B)$$$ (all servers in $$$A$$$; every house in $$$B$$$ crosses $$$e$$$, the bottleneck),
$$$DP_B[i] + w_e \cdot size(A)$$$ (all servers in $$$B$$$), and
$$$\min_{1 \le j \le i-1}\left(DP_A[j] + DP_B[i-j]\right)$$$ (servers on both sides);
$$$DP_T[i]$$$ is the minimum of the three, where $$$size(\cdot)$$$ means the number of houses requiring the internet and $$$w_e$$$ is the weight of $$$e$$$.
Now we can do the calculation recursively. (In practice you will do it bottom-up.)
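The whole recursion (Kruskal order plus the three-scenario merge, with each dp array capped at the number of internet houses in its component) can be sketched as follows; the input shape and names are my own:

```python
import math

def digital_village_e2(n, edges, wifi):
    # edges: (w, u, v), 0-based, graph assumed connected;
    # wifi: houses needing internet; latency = bottleneck distance
    INF = math.inf
    need = [False] * n
    for h in wifi:
        need[h] = True
    parent = list(range(n))

    def find(x):                        # DSU with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    cnt = [1 if need[v] else 0 for v in range(n)]
    # dp[root][j]: min total latency with j servers inside the component;
    # capped at cnt[root], since more servers than houses never help
    dp = [[INF, 0] if need[v] else [0] for v in range(n)]
    for w, u, v in sorted(edges):       # Kruskal order: w is the current bottleneck
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        A, B, ca, cb = dp[ru], dp[rv], cnt[ru], cnt[rv]
        la, lb = len(A) - 1, len(B) - 1
        res = [INF] * (la + lb + 1)
        res[0] = INF if ca + cb else 0
        for i in range(1, len(res)):
            best = INF
            if i <= la:                 # all servers in A: B's houses cross e
                best = min(best, A[i] + w * cb)
            if i <= lb:                 # all servers in B
                best = min(best, B[i] + w * ca)
            for j in range(max(1, i - lb), min(i - 1, la) + 1):
                best = min(best, A[j] + B[i - j])   # servers on both sides
            res[i] = best
        parent[ru] = rv
        cnt[rv], dp[rv] = ca + cb, res
    full = dp[find(0)]
    return [full[min(k, len(full) - 1)] for k in range(1, n + 1)]
```

Because each merge costs O(la * lb) and sizes are capped, the total work is O(n^2), the tree-knapsack bound discussed elsewhere in the thread.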
Editorial for E3:
TLDR: The DP array is actually convex, which is intuitive since the marginal benefit of installing a new server is decreasing. Hence, when calculating the transitions, the Minkowski sum can be applied to accelerate the process.
I can't thank you enough for the editorial!
I don't think finding the corresponding MST is even necessary for this problem; you can omit that part and use the basic greedy idea of Kruskal to get the solution for E2.
Thank you for the editorial, but how can we solve E2? I have a problem with calculating $$$DP_T$$$: I can only do it in O(n^2) (a loop over i and over j), and we have to calculate it for each subtree.
Sorry, I didn't notice that $$$O(n^2k)$$$ couldn't pass E2. Actually, with the same idea you can achieve a better complexity if you just stop the loop earlier. That is to say, if there are $$$x$$$ houses requiring the internet in the subtree, you actually only need to place at most $$$x$$$ servers there. Now the complexity is $$$O(n^2)$$$, which should be able to pass E2.
I followed your instructions and set i <= x and max(1, i-size(B)) <= j <= min(i, size(A)) (since i-j must be at most size(B)), but it failed to achieve O(n^2) complexity.
Did you do this in your solution?
Sorry, I went directly for E3. Here is a solution I just wrote for you: https://codeforces.net/contest/2021/submission/286859908
Thank you, the DSU version works faster. Maybe I have a big constant, because the complexity is the same.
I tried your technique and it passed E2, but I still don't understand the math behind it. Why would the time complexity be only O(n^2)?
It's a standard trick used when optimizing tree DP where size is part of the state. It's $$$O(n^2)$$$ because there is a bijection between the operations and the pairs of nodes (of which there are $$$\binom{n}{2}$$$). This holds because each pair of nodes is counted exactly once, at their LCA.
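The bijection can be sanity-checked numerically: no matter in which order you merge components, paying a*b per merge of sizes a and b always totals exactly $$$\binom{n}{2}$$$. A tiny sketch (naming is mine):

```python
import random

def total_merge_work(n, rng):
    # repeatedly merge two random components of sizes a and b at cost a*b;
    # each unit of cost corresponds to one pair of nodes whose LCA is this merge
    sizes = [1] * n
    ops = 0
    while len(sizes) > 1:
        rng.shuffle(sizes)
        a, b = sizes.pop(), sizes.pop()
        ops += a * b
        sizes.append(a + b)
    return ops
```

Since the merge order doesn't matter, the O(n^2) bound holds for any tree shape.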
I understand. Thanks a lot.
I was having trouble understanding this until I came across this USACO guide blog.
The key is that if we merge two disconnected subtrees of size $$$a$$$ and $$$b$$$ respectively in $$$O(ab)$$$ time until we form a complete tree, then the overall complexity is $$$O(N^2)$$$. This is valid here because $$$DP_A$$$ will have $$$size(A)$$$ states, $$$DP_B$$$ will have $$$size(B)$$$ states, and when we form $$$DP_T$$$ with $$$size(T) = size(A)+size(B)$$$ states, it will require only $$$size(A)\cdot size(B)$$$ operations if we ensure that $$$1\leq j\leq size(A)$$$ and $$$1\leq i-j\leq size(B)$$$, which translates to $$$2\leq i\leq size(T)$$$ and $$$max(1, i-size(B))\leq j\leq min(i-1, size(A))$$$ for ease of writing loops, as mentioned by ColobocCodeforces in an above comment.
Thanks a lot for your solution btw, it's very clear and concise.
Is there proof as to why the time complexity for the first solution is $$$O(N^2)$$$? It seems to me like there are $$$N^2$$$ states ($$$DP_{A}[i]$$$ for $$$A \leq N$$$ and $$$i \leq N$$$) and for each state the transition seems to be $$$O(N)$$$ (3rd scenario refers to $$$N$$$ other states) which leads me to think that the time complexity is $$$O(N^3)$$$.
Yes, you are right. To achieve $$$O(n^2)$$$ a standard trick is needed; please check out https://codeforces.net/blog/entry/134873?#comment-1209606
Problem C2 can be solved using a heap to store the indices of each element of a.
For each update, we update the heap and remove the invalid values at its top (if the index is i, check whether b[i] == value).
We also use a counter or set to keep track of the adjacent pairs satisfying the increasing condition.
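A sketch of that heap idea with lazy deletion, using stdlib `heapq` and the same m+v sentinel trick as the set-based approach (names are mine; a full solution would maintain the adjacent-pair counter incrementally instead of rechecking):

```python
import heapq

def make_heaps(n, m, b):
    # heaps[v] holds candidate positions of value v in b; the sentinel m+v
    # stands for "never appears" and keeps absent values mutually ordered
    heaps = [None] + [[m + v] for v in range(1, n + 1)]
    for i, x in enumerate(b):
        heapq.heappush(heaps[x], i)
    return heaps

def first_pos(heaps, b, m, v):
    h = heaps[v]
    # lazy deletion: drop heads that no longer hold value v
    while h[0] < m and b[h[0]] != v:
        heapq.heappop(h)
    return h[0]

def is_ya(heaps, b, n, m):
    firsts = [first_pos(heaps, b, m, v) for v in range(1, n + 1)]
    return all(x < y for x, y in zip(firsts, firsts[1:]))
```

On an update b[i] = x, just push i onto heaps[x]; the stale entry in the old value's heap is discarded lazily the next time its head is inspected.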
Can anyone provide a proof for Problem A?
For C2, that is a very cool way to set up a segment tree. It is definitely more clean than all the sortedlist solutions.
Why is E3 still "coming soon"? Write the editorial, lazy author CyberSleeper and developer ArvinCiu!
Can anyone please provide an implementation of C2 using segment trees?
https://codeforces.net/blog/entry/134873?#comment-1206823
This one is pretty cool.
Also explore this idea https://codeforces.net/blog/entry/134873?#comment-1206675. It's more beneficial to teach yourself to see simple solutions.
Upd: here is the implementation for the simple approach without segment tree https://codeforces.net/blog/entry/134759?#comment-1206731.
Can someone help me with what's wrong with this code, 285212221, for problem B?
3 days and still no editorial for E...
Also, it would be unfriendly to lower-rated contestants if there is a solution only for E3.
It seems the segment tree solution for C2 has been removed. I believe the relabelling step would remain the same there, and the segment tree would essentially confirm whether the first occurrence of each element is in ascending order.
But I need some help/intuition on how to formulate this as a segment tree.
The relabelling step changes any permutation into 1..n.
Then what we want in the array b is that the first appearance of any i is after the first appearance of i-1 and before the first appearance of i+1. Equivalently, if we had an array indexed 1..n where each entry is the first appearance of that element in b, the arrangement is possible exactly when that array is increasing.
This array is maintained in the leaves of the segment tree. The nodes of the segment tree contain the smallest value and largest value in the range, and a bool showing whether the range is increasing. There is a nice way to calculate that bool: if the left child and right child are both increasing and the largest value on the left is < the smallest value on the right, then the entire range of the parent node is increasing. Here's my solution: https://codeforces.net/contest/2021/submission/285002624
To track the first position of occurrence of each element in b, we can use a set and look at its first element. Each element is also assigned a position m+i (or some large constant + i), so that if it doesn't appear, this is equivalent to it appearing beyond the end, after all other elements. Using m+i also means that if multiple elements are absent, their positions are increasing with respect to one another.
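For reference, a compact sketch of such a segment tree over the first-occurrence array, where each node stores (min, max, is-increasing) and the combine rule is the one described above (naming is mine):

```python
class IncSegTree:
    """Segment tree over first-occurrence positions; each node keeps
    (min, max, strictly-increasing flag) for its range."""

    def __init__(self, a):
        self.n = len(a)
        size = 1
        while size < self.n:
            size *= 2
        self.size = size
        INF = float('inf')
        # (INF, -INF, True) is the neutral element for the combine rule
        self.t = [(INF, -INF, True)] * (2 * size)
        for i, x in enumerate(a):
            self.t[size + i] = (x, x, True)
        for i in range(size - 1, 0, -1):
            self.t[i] = self._pull(self.t[2 * i], self.t[2 * i + 1])

    @staticmethod
    def _pull(L, R):
        # parent range increases iff both halves increase and max(L) < min(R)
        return (min(L[0], R[0]), max(L[1], R[1]),
                L[2] and R[2] and L[1] < R[0])

    def update(self, i, x):
        i += self.size
        self.t[i] = (x, x, True)
        i //= 2
        while i:
            self.t[i] = self._pull(self.t[2 * i], self.t[2 * i + 1])
            i //= 2

    def increasing(self):
        return self.t[1][2]
```

Each update is O(log n), and the answer ("YA" iff the whole array is increasing) is read off the root.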
Is this some advanced kind of segment tree implementation? Because I only know the very basics of it.
Not really; it's normal to have some additional information in segment tree nodes to help calculate the properties we really want (e.g., an increasing subarray in this case), though I hadn't come across this use case before. Every time I see a new application of the segment tree, I just add it to my mental list of things a segment tree is capable of.
cool
That made sense. Thank You for the explanation and the code.
Problem A
How is the optimal strategy determined? When I attempted the problem, what came to mind was that I should minimize the fractional loss due to the floor, which is why I used the approach above. How would someone reject such an approach during a contest if it comes to mind?
Auto comment: topic has been updated by ArvinCiu (previous revision, new revision, compare).
Coming soon?
Coming s $$$\infty$$$ n
About Problem E3, I have a solution:
We can solve E2 using divide and conquer: find the MST of the graph, repeatedly take the edge with the maximum weight, and separately solve the two parts this edge divides. Let vec[i][j] be the answer in the block containing i when choosing j points; then the merge is just a min-plus convolution.
Then we try to solve E3. Given the large constraints, we guess there may be some nice property: $$$f(i)$$$ is convex.
So we can maintain its difference array, and the merge operation becomes a sorted merge. Using a multiset and small-to-large merging, we can solve E3 in $$$O(n \log^2 n)$$$.
https://codeforces.net/contest/2021/submission/285821599
As for the proof, I could not complete it; I checked the convexity by brute force.
Sorry, but can you explain (or give me a link) how we can compute the DP faster than O(n^2)?
The solution to E2 is an O(n^2) DP similar to a tree knapsack.
And we can notice that the transition in this DP is a min-plus convolution (meaning $$$g(k) = \min_{1 \le o \le k} (f(o) + h(k - o))$$$), and given its convexity, we can apply the Minkowski sum to this process.
hello young people
Who has the solutions to the problems of the last contest? Please tell me, friends.
10 days have passed, still no editorial for E. At least share editorials for the easier versions of E :(
these are mf
ok
Auto comment: topic has been updated by ArvinCiu (previous revision, new revision, compare).
Approach for C2: I read the tutorial but didn't understand much. Is a segment tree necessary for it? Did I overcomplicate it by using indices?
What if at some point there is more than one possible longest path? It can be proven that we can choose any of these paths and the optimal solutions for the next values of $$$k$$$ will still be optimal. The proof for this greedy strategy involves the convexity of the total length as $$$k$$$ goes from 1 to $$$n$$$. However, we won't explain it in detail here.
Why won't you explain? I think it's a big part of the problem.
I think I can explain, but I am not good at English.
First we construct the KRT, then we label all the special leaves according to the DFS order. It is easy to see that special leaves belonging to the same server-installed leaf have consecutive labels. So when there are $$$k$$$ server-installed leaves, all special leaves are split into $$$k$$$ disjoint segments. As the KRT is a binary tree, you can easily find the properties of these segments.
So when going from $$$k$$$ to $$$k + 1$$$, one segment is chosen and split into two. Then we can reduce the problem to "when going from $$$k = 1$$$ to $$$2$$$, the optimal solution keeps the first server-installed leaf", which is easy to prove.