This editorial corresponds to 2017 Chinese Multi-University Training, BeihangU Contest (stage 1), which was held on Jun 25th, 2017.
There are 12 problems in total, which you can solve as a team or as an individual in a 5-hour contest. By the time you join as a virtual participant, at least 770 teams will be competing with you virtually.
The English version of the editorial has been completed; it is a bit different from the Chinese version (mostly because I don't want bad editorials to ruin the contest, lol). However, to hide spoilers, the editorials are locked and will be revealed once the following conditions are met:
- Editorials for the easiest 4 problems will be revealed after the replay (all unlocked);
- The editorial for each of the hardest 5 problems will be released once the corresponding problem has been solved by at least 5 users or teams on Codeforces::Gym (all unlocked);
- The editorial for each of the remaining problems will be published once the relevant problem has been solved by at least 10 users or teams in virtual participation (including the replay) on Codeforces::Gym (all unlocked).
Or you can find solutions in comments?
Idea: skywalkert
The answer is $$$\left \lfloor \log_{10} (2^m - 1) \right \rfloor$$$. Apparently, $$$10^k = 2^m$$$ has no solution in positive integers $$$k$$$, $$$m$$$ (the left side is divisible by $$$5$$$ while the right side is not), which results in $$$\left \lfloor \log_{10} (2^m - 1) \right \rfloor = \left \lfloor \log_{10} 2^m \right \rfloor = \left \lfloor m \log_{10} 2 \right \rfloor$$$.
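A minimal sketch in C++ (any per-case output framing from the statement is omitted, and $$$m$$$ is assumed small enough that double precision never lands $$$m \log_{10} 2$$$ on the wrong side of an integer):

```cpp
#include <cmath>
#include <cstdio>

int main() {
    long long m;
    // The answer is floor(m * log10(2)).
    while (scanf("%lld", &m) == 1)
        printf("%lld\n", (long long)(m * log10(2.0)));
    return 0;
}
```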
Idea: sd0061
Each letter's contribution to the answer can be treated as a number in base $$$26$$$, so the problem is equivalent to multiplying these contributions by distinct weights ranging from $$$0$$$ to $$$25$$$ so as to maximize the total. Obviously, the largest contribution should be matched with $$$25$$$, the second-largest with $$$24$$$, and so on. This greedy assignment after sorting is enough, except for the single case where it would produce a leading zero, which is not permitted.
Time complexity is $$$\mathcal{O}\left(\sum_{i = 1}^{n}{|s_i|} + \max(|s_1|, |s_2|, \ldots, |s_n|) \log C\right)$$$, where $$$C = 26$$$.
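A sketch of this greedy under a few assumptions not restated here: the answer is taken modulo $$$10^9 + 7$$$, and only strings longer than one character forbid a leading zero. Adjust both to the actual statement.

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long MOD = 1000000007; // assumed output modulus

int main() {
    int n;
    cin >> n;
    vector<string> s(n);
    size_t len = 1;
    for (auto &t : s) { cin >> t; len = max(len, t.size()); }

    int L = len + 12; // spare digits so carries cannot overflow
    // digit[c][j]: coefficient of 26^j in letter c's total contribution
    vector<vector<long long>> digit(26, vector<long long>(L, 0));
    vector<bool> lead(26, false); // letters that must not become 0
    for (auto &t : s) {
        if (t.size() > 1) lead[t[0] - 'a'] = true;
        for (size_t j = 0; j < t.size(); ++j)
            digit[t[j] - 'a'][t.size() - 1 - j]++;
    }
    for (int c = 0; c < 26; ++c) // normalize to proper base-26 digits
        for (int j = 0; j + 1 < L; ++j) {
            digit[c][j + 1] += digit[c][j] / 26;
            digit[c][j] %= 26;
        }

    vector<int> ord(26); // letters sorted by contribution, largest first
    iota(ord.begin(), ord.end(), 0);
    sort(ord.begin(), ord.end(), [&](int x, int y) {
        for (int j = L - 1; j >= 0; --j)
            if (digit[x][j] != digit[y][j]) return digit[x][j] > digit[y][j];
        return x < y;
    });

    int zero = -1; // the smallest non-leading letter takes the digit 0
    for (int i = 25; i >= 0; --i)
        if (!lead[ord[i]]) { zero = ord[i]; break; }
    vector<long long> weight(26, 0);
    int w = 25; // the rest take 25, 24, ... in sorted order
    for (int c : ord) if (c != zero) weight[c] = w--;

    long long ans = 0;
    for (int c = 0; c < 26; ++c) {
        long long value = 0;
        for (int j = L - 1; j >= 0; --j)
            value = (value * 26 + digit[c][j]) % MOD;
        ans = (ans + weight[c] * value) % MOD;
    }
    cout << ans << "\n";
    return 0;
}
```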
Idea: sd0061
If we consider, for each color separately, the number of paths covered by that color at least once, the answer is the sum of these numbers over all colors. Conversely, we can count the paths that are not covered by this color, which is easy to compute. The idea of virtual trees applies here directly, although we never need to build them explicitly. What we need to calculate is the total number of paths within the blocks that remain after removing all nodes of a specified color from the tree. For each node of this color, we maintain the number of remaining nodes whose nearest ancestor of this color is this node. Together with the number of nodes that have no ancestor of this color, we can get the answer.
The process can be implemented directly by DFS (Depth-First Search) once, so the time complexity is $$$\mathcal{O}(n)$$$.
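For reference, a sketch of that single DFS. Assumptions: colors lie in $$$[1, n]$$$, a path is an unordered pair of distinct vertices, and the input gives $$$n$$$, the colors, then the $$$n - 1$$$ edges; for deep trees the recursion should be replaced by an explicit stack.

```cpp
#include <bits/stdc++.h>
using namespace std;

int n;
vector<int> col, sz;
vector<vector<int>> g;
vector<long long> sum_, uncov;

long long C2(long long x) { return x * (x - 1) / 2; }

void dfs(int u, int fa) {
    int c = col[u];
    long long saved = sum_[c];
    sz[u] = 1;
    for (int v : g[u]) if (v != fa) {
        long long pre = sum_[c];
        dfs(v, u);
        // nodes in subtree(v) whose nearest color-c node above them is u
        long long block = sz[v] - (sum_[c] - pre);
        uncov[c] += C2(block); // paths inside this block avoid color c
        sum_[c] = pre + sz[v]; // the whole subtree(v) is now charged to u
        sz[u] += sz[v];
    }
    sum_[c] = saved + sz[u]; // u and everything below it are accounted for
}

int main() {
    scanf("%d", &n);
    col.assign(n + 1, 0); sz.assign(n + 1, 0);
    g.assign(n + 1, {});
    sum_.assign(n + 1, 0); uncov.assign(n + 1, 0);
    for (int i = 1; i <= n; ++i) scanf("%d", &col[i]);
    for (int i = 1; i < n; ++i) {
        int u, v;
        scanf("%d%d", &u, &v);
        g[u].push_back(v);
        g[v].push_back(u);
    }
    dfs(1, 0);
    long long ans = 0;
    vector<bool> present(n + 1, false);
    for (int i = 1; i <= n; ++i) present[col[i]] = true;
    for (int c = 1; c <= n; ++c) if (present[c]) {
        long long top = n - sum_[c]; // nodes with no color-c node above them
        ans += C2(n) - uncov[c] - C2(top);
    }
    printf("%lld\n", ans);
    return 0;
}
```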
Idea: skywalkert
Apparently, each pile could be operated on no more than $$$\sum_{i = 1}^{m}{e_i}$$$ (denoted by $$$w$$$) times. Besides, let's define $$$f(x)$$$ as the number of ways to change a pile into one stone in exactly $$$x$$$ operations; it is then obvious that $$$f(x)$$$ also equals the number of ways to change a pile into two or more stones in $$$(x - 1)$$$ operations.
To count the situations ending at the pile labeled $$$i$$$, let's enumerate the number of times $$$x$$$ this pile has been changed; the number of corresponding ways is then $$$f(x + 1)^{i - 1} f(x)^{k - i + 1}$$$. Hence, we can reduce the problem to the calculation of $$$f(x)$$$, within time complexity $$$\mathcal{O}(w k)$$$.
Let's consider situations counted in $$$f(x)$$$ and define that, in one possible way, $$$e_i$$$ is decreased by $$$d(i, j)$$$ in the $$$j$$$-th operation, and then we have
- $$$d(i, j)$$$ is a non-negative integer for each $$$(i, j)$$$; and
- $$$\sum_{j = 1}^{x}{d(i, j)} = e_i$$$ for each $$$i$$$; and
- $$$\sum_{i = 1}^{m}{d(i, j)} > 0$$$ for each $$$j$$$.
Furthermore, we can conclude that $$$f(x)$$$ equals the number of different solutions $$$d(i, j)$$$ meeting all the above restrictions.
Let $$$g(x)$$$ be the number of corresponding ways meeting only the first two restrictions. This decomposes into an independent stars-and-bars problem for each $$$e_i$$$, which gives $$$g(x) = \prod_{i = 1}^{m}{{e_i + x - 1 \choose x - 1}}$$$.
We can also observe that if the last restriction is violated for some $$$j$$$, then all the related $$$d(i, j)$$$ must equal zero. Applying the inclusion-exclusion principle, we have $$$f(x) = \sum_{y = 0}^{x}{(-1)^{x - y} {x \choose y} g(y)}$$$, which can be rewritten as $$$\frac{f(x)}{x!} = \sum_{y = 0}^{x}{\frac{(-1)^{x - y}}{(x - y)!} \frac{g(y)}{y!}}$$$, a formula in the form of a convolution. Together with the fact that $$$985661441 = 235 \times 2^{22} + 1$$$ is prime, we can apply NTT to speed up the convolution.
The total time complexity is $$$\mathcal{O}(w (m + \log n + k))$$$.
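As a concrete aid, here is a minimal sketch of the convolution step. Since $$$985661441 \equiv 5 \pmod{12}$$$, $$$3$$$ is a quadratic non-residue modulo this prime, hence $$$3^{(p - 1) / 2^k}$$$ is a primitive $$$2^k$$$-th root of unity for $$$k \leq 22$$$, which is all an NTT of length up to $$$2^{22}$$$ needs. The names (`recover_f`, the precomputed factorial tables `fact` and `inv_fact`) are hypothetical helpers, not from the original solution.

```cpp
#include <bits/stdc++.h>
using namespace std;
typedef unsigned long long u64;

const u64 MOD = 985661441; // 235 * 2^22 + 1, prime

u64 power(u64 a, u64 e) {
    u64 r = 1;
    for (a %= MOD; e; e >>= 1, a = a * a % MOD)
        if (e & 1) r = r * a % MOD;
    return r;
}

// In-place iterative NTT; `invert` selects the inverse transform.
void ntt(vector<u64> &a, bool invert) {
    int n = a.size(); // power of two, at most 2^22
    for (int i = 1, j = 0; i < n; ++i) { // bit-reversal permutation
        int bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) swap(a[i], a[j]);
    }
    for (int len = 2; len <= n; len <<= 1) {
        u64 wl = power(3, (MOD - 1) / len); // primitive len-th root of unity
        if (invert) wl = power(wl, MOD - 2);
        for (int i = 0; i < n; i += len) {
            u64 w = 1;
            for (int j = 0; j < len / 2; ++j, w = w * wl % MOD) {
                u64 u = a[i + j], v = a[i + j + len / 2] * w % MOD;
                a[i + j] = (u + v) % MOD;
                a[i + j + len / 2] = (u + MOD - v) % MOD;
            }
        }
    }
    if (invert) {
        u64 inv_n = power(n, MOD - 2);
        for (u64 &x : a) x = x * inv_n % MOD;
    }
}

// f(x) = x! * sum_y [(-1)^(x-y)/(x-y)!] * [g(y)/y!]: one convolution of
// the two bracketed sequences.
vector<u64> recover_f(const vector<u64> &g, const vector<u64> &fact,
                      const vector<u64> &inv_fact) {
    int w = g.size() - 1, n = 1;
    while (n < 2 * w + 1) n <<= 1;
    vector<u64> A(n, 0), B(n, 0);
    for (int t = 0; t <= w; ++t)
        A[t] = (t & 1) ? MOD - inv_fact[t] : inv_fact[t]; // (-1)^t / t!
    for (int y = 0; y <= w; ++y)
        B[y] = g[y] * inv_fact[y] % MOD;                  // g(y) / y!
    ntt(A, false); ntt(B, false);
    for (int i = 0; i < n; ++i) A[i] = A[i] * B[i] % MOD;
    ntt(A, true);
    vector<u64> f(w + 1);
    for (int x = 0; x <= w; ++x) f[x] = A[x] * fact[x] % MOD;
    return f;
}
```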
102253E - Expectation of Division
Idea: skywalkert
Let $$$f(n)$$$ be the expected number of operations to reduce $$$n$$$ to $$$1$$$. Specifically, $$$f(1) = 0$$$. Given the process described in the statement, we have

$$$f(n) = 1 + \frac{1}{\sigma(n)} \sum_{d | n}{f(d)},$$$

where $$$n > 1$$$, and $$$\sigma(n)$$$ is the number of positive factors of $$$n$$$.
Obviously, even if we try to simplify the formula further, we can hardly compute $$$f(n)$$$ before determining $$$f(d)$$$ for all $$$d | n$$$, $$$d < n$$$. Based on this observation, our first step is to reduce the amount of calculation required for all $$$f(n)$$$. For two positive integers $$$x$$$ and $$$y$$$, if their prime factorizations $$$x = \prod_{i = 1}^{m}{{p_i}^{e_i}}$$$, $$$y = \prod_{i=1}^{m'}{{p'_i}^{e'_i}}$$$ meet the condition that $$$\lbrace e_i | i = 1, 2, \ldots, m \rbrace = \lbrace e'_i | i = 1, 2, \ldots, m' \rbrace$$$, then we can conclude $$$f(x) = f(y)$$$ by induction. Based on this conclusion, when computing $$$f(n)$$$, we can simply ignore its prime factors and only focus on the unordered multiset formed by the corresponding exponents in its prime factorization.
Our next step is to see that the number of possible multisets is fairly small under the restriction $$$n \leq 10^{24}$$$. If a multiset of exponents $$$E$$$ is possible, then the minimum possible $$$n$$$ corresponding to it must be no larger than $$$10^{24}$$$. Let the minimum possible $$$n$$$ for the multiset $$$E$$$ be $$$\mathrm{rep}(E)$$$ and the prime factorization of $$$\mathrm{rep}(E)$$$ be $$$\prod_{i = 1}^{m}{{p_i}^{e_i}}$$$; we can conclude (by a standard interchange argument) that $$$e_u \geq e_v$$$ for all $$$1 \leq u, v \leq m$$$, $$$p_u < p_v$$$. Furthermore, as $$$m \leq \log_2 n$$$ and these prime factors must be the $$$m$$$ smallest prime numbers, all possible $$$\mathrm{rep}(E)$$$ can be detected by backtracking in time complexity $$$\mathcal{O}(|S|)$$$, where $$$S$$$ is the set of all possible multisets. When $$$n \leq 10^{24}$$$, $$$|S| = 172513$$$, which is not too large, so let's use hashing to memorize all corresponding $$$f(n)$$$, and optimize the calculation for each multiset.
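A sketch of that backtracking, using GCC's `__int128` for the $$$10^{24}$$$ bound; the printed count should match the $$$|S|$$$ quoted above (up to whether the empty multiset for $$$n = 1$$$ is counted):

```cpp
#include <bits/stdc++.h>
using namespace std;
typedef __int128 i128;

// The 19 smallest primes are more than enough: using all of them
// already pushes rep(E) past 10^24.
const int PRIMES[19] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29,
                        31, 37, 41, 43, 47, 53, 59, 61, 67};
i128 LIMIT; // 10^24, set in main()
long long found = 0;

// cur = rep(E) built so far; maxExp caps the next exponent so that
// exponents never increase along increasing primes.
void dfs(int idx, int maxExp, i128 cur) {
    ++found; // every state reached is one valid multiset E
    if (idx >= 19) return; // unreachable for 10^24, kept as a guard
    i128 next = cur;
    for (int e = 1; e <= maxExp && next <= LIMIT / PRIMES[idx]; ++e) {
        next *= PRIMES[idx];
        dfs(idx + 1, e, next);
    }
}

int main() {
    LIMIT = 1;
    for (int i = 0; i < 24; ++i) LIMIT *= 10;
    dfs(0, 80, 1); // 2^80 > 10^24, so 80 caps the exponent of 2
    printf("%lld\n", found); // should match the |S| quoted above
    return 0;
}
```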
For each multiset $$$E$$$, we cannot enumerate all factors of $$$\mathrm{rep}(E)$$$, because when $$$n \leq 10^{24}$$$, $$$\sigma(n) \leq 1290240$$$ and $$$\sum_{E \in S}{\sigma(\mathrm{rep}(E))} = 14765435692 \approx 1.5 \times 10^{10}$$$, which seems too gigantic to pass the test.
Another way is to apply the inclusion-exclusion principle. Let $$$g(n) = \sum_{d | n, d < n}{f(d)}$$$ and $$$h(n) = g(n) + f(n)$$$, and then we have $$$f(n) = \frac{\sigma(n) + g(n)}{\sigma(n) - 1}$$$, and

$$$g(n) = \sum_{\emptyset \neq T \subseteq J}{(-1)^{|T| + 1} h\left(n \Big/ \prod_{p \in T}{p}\right)},$$$

where $$$n = \prod_{i = 1}^{\omega(n)}{{p_i}^{e_i}}$$$, $$$\omega(n)$$$ is the number of distinct prime factors of $$$n$$$, and $$$J = \lbrace p_i | i = 1, 2, \ldots, \omega(n) \rbrace$$$ (every proper divisor of $$$n$$$ divides $$$n / p$$$ for at least one $$$p \in J$$$, which is what the inclusion-exclusion ranges over). However, just applying the above formula cannot easily pass the test, because when $$$n \leq 10^{24}$$$, $$$\omega(n) \leq 18$$$ and $$$\sum_{E \in S}{2^{\omega(\mathrm{rep}(E))}} = 103800251 \approx 10^8$$$, which is still too massive to pass. By the way, most calculations are additions and subtractions on floating-point numbers, so maybe you can squeeze your program to pass the time limit (and actually some team did it).
Note that $$$h(n) = \sum_{d | n}{f(d)}$$$ is essentially a partial sum over $$$\omega(n)$$$-dimensional integer vectors (the exponent vectors of the divisors), and it can be improved by using extra space. Assuming $$$n = \prod_{i = 1}^{\omega(n)}{{p_i}^{e_i}}$$$ and $$$p_i < p_{i + 1}$$$ for $$$1 \leq i < \omega(n)$$$, let's define that

$$$h(n, k) = \sum_{d | n,\ n / d \text{ has no prime factor exceeding } p_k}{f(d)}$$$

for $$$0 \leq k \leq \omega(n)$$$, so that $$$h(n, 0) = f(n)$$$ and $$$h(n, \omega(n)) = h(n)$$$. Then we can get $$$h(n, k)$$$ from $$$h(n, k - 1)$$$ and $$$h(n / p_k, k')$$$ as $$$h(n, k) = h(n, k - 1) + h(n / p_k, k')$$$, where $$$k'$$$ is either $$$k$$$ or $$$(k - 1)$$$, depending on whether $$$n / p_k$$$ is still a multiple of $$$p_k$$$.
Finally, we have to replace $$$n$$$ by the corresponding multiset $$$E$$$; here we had better choose the minimum possible $$$n$$$ as $$$\mathrm{rep}(E)$$$ and only calculate $$$h$$$, $$$g$$$ for all $$$\mathrm{rep}(E)$$$, because its exponents are sorted in non-increasing order along the increasing primes, which keeps the number of related states minimized.
After all these simplifications, we find a solution that runs in time and space complexity $$$\mathcal{O}\left(\sum_{E \in S}{\omega(\mathrm{rep}(E))}\right) = \mathcal{O}(|S| \omega(n))$$$. When $$$n \leq 10^{24}$$$, $$$\sum_{E \in S}{\omega(\mathrm{rep}(E))} = 1173627 \approx 10^6$$$, which is small enough.
In practice, the required big-integer arithmetic is not complicated to code, since values up to $$$10^{24}$$$ fit in $$$128$$$ bits and each operation takes constant time, so the total time complexity is $$$\mathcal{O}(|S| \omega(n) + T \log n)$$$, where $$$T$$$ is the number of test cases.
Idea: chitanda
A permutation can be decomposed into one or more disjoint cycles, which are found by repeatedly tracing the application of the permutation on some elements.
For each element $$$i$$$ in an $$$l$$$-cycle of permutation $$$a$$$, applying the given relation around the whole cycle yields

$$$f(i) = b^{(l)}(f(i)),$$$

where $$$b^{(l)}$$$ denotes $$$b$$$ applied $$$l$$$ times, which implies $$$f(i)$$$ must be some element in a cycle of permutation $$$b$$$, whose length is a factor of $$$l$$$.
Besides, once $$$f(i)$$$ is fixed, the other $$$(l - 1)$$$ values on this cycle are determined through $$$f(i) = b_{f(a_i)}$$$.
Consequently, the answer is $$$\prod_{i = 1}^{k} \sum_{j | l_i} {j \cdot c_j}$$$, where $$$k$$$ is the number of cycles obtained from permutation $$$a$$$, $$$l_i$$$ represents the length of the $$$i$$$-th cycle, and $$$c_j$$$ indicates the number of cycles in permutation $$$b$$$ whose lengths equal $$$j$$$.
Due to $$$\sum_{i = 1}^{k} \sum_{j | l_i}{1} \leq \sum_{i = 1}^{k}{2 \sqrt{l_i}} \leq 2 \sqrt{k \sum_{i = 1}^{k}{l_i}} \leq 2 n$$$, the time complexity is $$$\mathcal{O}(n + m)$$$.
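A sketch of the whole counting; the modulus $$$10^9 + 7$$$ and 1-indexed permutations are assumptions to be adjusted to the statement:

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long MOD = 1000000007; // assumed modulus

int main() {
    int n, m;
    scanf("%d%d", &n, &m);
    vector<int> a(n), b(m);
    for (int &x : a) { scanf("%d", &x); --x; } // assuming 1-indexed input
    for (int &x : b) { scanf("%d", &x); --x; }

    // cnt[j] = number of cycles of length j in permutation b
    vector<long long> cnt(max(n, m) + 1, 0);
    vector<bool> vis(m, false);
    for (int i = 0; i < m; ++i) if (!vis[i]) {
        int len = 0;
        for (int j = i; !vis[j]; j = b[j]) { vis[j] = true; ++len; }
        ++cnt[len];
    }

    // multiply sum_{j | l} j * cnt[j] over every cycle of permutation a
    long long ans = 1;
    vector<bool> visa(n, false);
    for (int i = 0; i < n; ++i) if (!visa[i]) {
        int len = 0;
        for (int j = i; !visa[j]; j = a[j]) { visa[j] = true; ++len; }
        long long ways = 0;
        for (int j = 1; (long long)j * j <= len; ++j) if (len % j == 0) {
            ways = (ways + j * cnt[j]) % MOD;
            if (j != len / j)
                ways = (ways + (long long)(len / j) * cnt[len / j]) % MOD;
        }
        ans = ans * ways % MOD;
    }
    printf("%lld\n", ans);
    return 0;
}
```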
Idea: constroy
The gears with their adjacency relations form a forest. We can consider coaxial gears, which have the same angular velocity, as a block and then consider the effect of gear meshing.
If the $$$x$$$-th gear and the $$$y$$$-th one mesh, let their radii be $$$r_x$$$, $$$r_y$$$ and their angular velocities be $$$\omega_x$$$, $$$\omega_y$$$ respectively; then we have $$$\ln \omega_y = \ln \omega_x + \ln r_x - \ln r_y$$$. Hence, for a particular gear in a connected component, we can determine the difference of (log) angular velocities between this gear and any other gear in the component, and then maintain the maximum relative difference using a segment tree on the traversal sequence obtained from DFS (Depth-First Search).
More specifically, let's fix a gear in each component as the reference gear and maintain the differences for all gears in this component. Let the fixed gear be the root of this component, and then we build a segment tree to maintain the maximum value on the rooted tree structure of this component.
When a gear is replaced, the angular velocity of its own block may change if it is the shallowest node of the block, and the angular velocities of other blocks may change if those blocks are completely contained in the subtree of the modified node. In either case, the nodes to update form one contiguous interval of the traversal sequence, and the update adds a common offset to their recorded values.
For each query, we only need to get the difference between the activated gear and the reference gear. Together with the maximum relative difference in the component, the real maximum angular velocity can be determined.
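The primitive described above is just a segment tree with range add and range max over the DFS order; a minimal sketch (storing $$$\log_2$$$ values, per the practical note below):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Range add + range max segment tree over positions 0..n-1.
struct SegTree {
    int n;
    vector<double> mx, lz;
    SegTree(int n) : n(n), mx(4 * n, 0.0), lz(4 * n, 0.0) {}
    void apply(int x, double v) { mx[x] += v; lz[x] += v; }
    void push(int x) { apply(2 * x, lz[x]); apply(2 * x + 1, lz[x]); lz[x] = 0; }
    void add(int x, int l, int r, int ql, int qr, double v) {
        if (qr < l || r < ql) return;
        if (ql <= l && r <= qr) { apply(x, v); return; }
        push(x);
        int m = (l + r) / 2;
        add(2 * x, l, m, ql, qr, v);
        add(2 * x + 1, m + 1, r, ql, qr, v);
        mx[x] = max(mx[2 * x], mx[2 * x + 1]);
    }
    double query(int x, int l, int r, int ql, int qr) {
        if (qr < l || r < ql) return -1e100;
        if (ql <= l && r <= qr) return mx[x];
        push(x);
        int m = (l + r) / 2;
        return max(query(2 * x, l, m, ql, qr),
                   query(2 * x + 1, m + 1, r, ql, qr));
    }
};

int main() {
    SegTree st(8);                     // DFS order of one component
    st.add(1, 0, 7, 2, 5, 1.5);        // a replacement shifts one interval
    printf("%.3f\n", st.query(1, 0, 7, 0, 7)); // maximum stored value: 1.500
    return 0;
}
```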
In practice, you can maintain $$$\log_2 \omega$$$ to avoid possible precision issues and only convert it into $$$\ln \omega$$$ for the output.
Time complexity is $$$\mathcal{O}(n + m + q \log n)$$$.
Idea: constroy
Let's sort these hints in non-increasing order and remove duplicates. Denote the sorted array as $$${b'}_{1}$$$, $$${b'}_{2}$$$, $$$\ldots$$$, $$${b'}_{m'}$$$ satisfying $$${b'}_{i} < {b'}_{i - 1}$$$ for each $$$i \geq 2$$$, and define that $$${b'}_{0} = n$$$.
One solution to this problem is to apply a linear selection algorithm, which can find the $$$k$$$-th smallest element in an array of length $$$n$$$ in time complexity $$$\mathcal{O}(n)$$$ and also split the array into three parts that include the smallest $$$(k - 1)$$$ elements, the $$$k$$$-th element and other elements respectively.
Due to $$$b_i + b_j \leq b_k$$$ if $$$b_i \neq b_j$$$, $$$b_i < b_k$$$, $$$b_j < b_k$$$, we have $$$2 {b'}_{i} < {b'}_{i - 2}$$$ for each $$$i \geq 3$$$. Since each odd-position value is then less than half of the one two positions earlier, the sum is bounded by a geometric series: $$$\sum_{k}{{b'}_{2 k + 1}} < 2 {b'}_{1}$$$, and similarly for the elements at even positions.
If the algorithm is completely linear, the time complexity is $$$\mathcal{O}\left(\sum_{i = 0}^{m' - 1}{{b'}_{i}}\right) = \mathcal{O}(n)$$$. Since the ratings are generated (pseudo-)randomly, we can instead utilize an almost-linear approach, such as the nth_element function in C++. By the way, there exist solutions with expected time complexity $$$\mathcal{O}(\sqrt{n m'})$$$.
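A sketch of this strategy with toy data standing in for the generated ratings (the actual generation rule and hint indexing should follow the statement):

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    // Toy data standing in for the pseudo-randomly generated ratings.
    mt19937 rng(20170625);
    int n = 1000000;
    vector<unsigned> a(n);
    for (auto &x : a) x = rng();

    // Hints (0-indexed, already sorted in non-increasing order).
    vector<int> hints = {999999, 1000, 1000, 3, 0};
    int right = n; // the active prefix is a[0, right)
    for (int k : hints) {
        if (k < right) // an exact repeat of the previous hint skips the call
            nth_element(a.begin(), a.begin() + k, a.begin() + right);
        printf("%u\n", a[k]); // the (k + 1)-th smallest rating
        right = k; // indices >= k are never needed again
    }
    return 0;
}
```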
Idea: sd0061
As the graph is a cactus, one edge of each cycle must be removed in order to make a spanning tree. Hence, the problem can be restated as follows: we have $$$M$$$ ($$$M \leq \frac{m}{3}$$$) arrays of integers and pick one number from each array to form a sum; among all possible ways, we want the $$$K$$$ largest sums, and we report the total of these values.
It is a classic problem that can be solved by a sequence-merging algorithm: maintain the set of $$$K$$$ largest values obtainable from the first $$$x$$$ arrays, then merge it with the $$$(x + 1)$$$-th array and extract the $$$K$$$ largest new values using a heap (priority queue). More specifically, assume we are merging two non-increasing arrays $$$A$$$, $$$B$$$ and want the $$$K$$$ largest values of $$$(A_i + B_j)$$$. Since $$$B_j \geq B_{j + 1}$$$, the value $$$(A_i + B_j)$$$ must be picked before $$$(A_i + B_{j + 1})$$$. That inspires us to keep a counter $$$c_i$$$ for each $$$A_i$$$, meaning the next candidate involving $$$A_i$$$ is $$$(A_i + B_{c_i})$$$. A data structure then maintains the set of all current candidates $$$(A_i + B_{c_i})$$$; querying and erasing the largest value at most $$$K$$$ times yields all the merged values we need.
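Here is a sketch of one merge step in the simple form described in this paragraph, keeping a counter into $$$B$$$ for every element of $$$A$$$ (the refinement in the next paragraph swaps the roles to shrink the heap); `mergeTopK` is a hypothetical helper name:

```cpp
#include <bits/stdc++.h>
using namespace std;

// K largest values of A[i] + B[j], for non-increasing arrays A and B.
vector<long long> mergeTopK(const vector<long long> &A,
                            const vector<long long> &B, int K) {
    priority_queue<pair<long long, int>> pq; // (A[i] + B[c[i]], i)
    vector<int> c(A.size(), 0); // c[i]: next index of B to pair with A[i]
    for (int i = 0; i < (int)A.size(); ++i)
        pq.push({A[i] + B[0], i});
    vector<long long> res;
    while ((int)res.size() < K && !pq.empty()) {
        auto [v, i] = pq.top();
        pq.pop();
        res.push_back(v);
        if (++c[i] < (int)B.size()) // (A[i]+B[j+1]) enters after (A[i]+B[j])
            pq.push({A[i] + B[c[i]], i});
    }
    return res; // non-increasing, at most K values
}

int main() {
    vector<long long> A = {9, 5, 2}, B = {8, 3};
    for (long long v : mergeTopK(A, B, 4))
        printf("%lld ", v); // prints: 17 13 12 10
    return 0;
}
```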
It seems the above method runs in time complexity $$$\mathcal{O}(MK \log K)$$$, but actually, it can run faster. Let the sizes of these arrays be $$$m_1$$$, $$$m_2$$$, $$$\ldots$$$, $$$m_M$$$ ($$$m_i \geq 3$$$ for each $$$i$$$) respectively. If we instead maintain $$$(A_{c_j} + B_j)$$$ in the data structure, where $$$B$$$ is the next array to be merged, the complexity becomes $$$\mathcal{O}\left(\sum_{i = 1}^{M}{K \log{m_i}}\right) = \mathcal{O}\left(K \log{\prod_{i = 1}^{M}{m_i}}\right) = \mathcal{O}\left(M K \log{\frac{\sum_{i = 1}^{M}{m_i}}{M}}\right) = \mathcal{O}\left(M K \log{\frac{m}{M}}\right)$$$. As $$$M \leq \frac{m}{3}$$$, we can conclude the worst-case complexity is $$$\mathcal{O}(m K)$$$, attained when $$$M = \frac{m}{3}$$$.
By the way, there exist solutions in time complexity $$$\mathcal{O}(K \log K)$$$.
102253J - Journey with Knapsack
Idea: skywalkert
The main idea of our standard solution is the ordinary generating function. Let's define the number of ways to choose food of total volume $$$k$$$ as $$$f(k)$$$, and its generating function as $$$F(z) = \sum_{k \geq 0}{f(k) z^k}$$$. After calculating the polynomial $$$(F(z) \bmod z^{2 n + 1})$$$ with coefficients modulo $$$(10^9 + 7)$$$, we can enumerate the chosen piece of equipment and calculate the answer.
Based on the rule of product, we have

$$$F(z) = \prod_{i = 1}^{n}{\left(1 + z^{i} + z^{2 i} + \ldots + z^{a_i i}\right)} = \prod_{i = 1}^{n}{\frac{1 - z^{(a_i + 1) i}}{1 - z^i}}.$$$
Due to $$$0 \leq a_1 < a_2 < \ldots < a_n$$$, we know that $$$a_i \geq i - 1$$$ and thus $$$(a_i + 1) i \geq i^2$$$, which implies there are only $$$\mathcal{O}(\sqrt{n})$$$ factors $$$\left(1 - z^{(a_i + 1) i}\right)$$$ that are not equivalent to $$$1$$$ modulo $$$z^{2 n + 1}$$$. Hence, we can calculate the numerator of $$$(F(z) \bmod z^{2 n + 1})$$$ in time complexity $$$\mathcal{O}(n \sqrt{n})$$$.
The rest is similar to the generating function of the partition function, which is defined as

$$$P(z) = \sum_{k \geq 0}{p(k) z^k} = \prod_{i \geq 1}{\frac{1}{1 - z^i}},$$$

where $$$p(k)$$$ represents the number of partitions of a non-negative integer $$$k$$$. The pentagonal number theorem states that

$$$\prod_{i \geq 1}{\left(1 - z^i\right)} = 1 + \sum_{k \geq 1}{(-1)^k \left(z^{k (3 k - 1) / 2} + z^{k (3 k + 1) / 2}\right)},$$$
which can help us calculate the polynomial $$$(P(z) \bmod z^m)$$$ in time complexity $$$\mathcal{O}(m \sqrt{m})$$$. Besides, $$$1 - z^k \equiv 1 \equiv \frac{1}{1 - z^k} \pmod{z^m}$$$ for any integers $$$k \geq m \geq 1$$$, so we have

$$$\prod_{i = 1}^{n}{\frac{1}{1 - z^i}} \equiv P(z) \prod_{i = n + 1}^{2 n}{\left(1 - z^i\right)} \equiv P(z) \left(1 - \sum_{i = n + 1}^{2 n}{z^i}\right) \pmod{z^{2 n + 1}},$$$
and we can get the denominator of $$$(F(z) \bmod z^{2 n + 1})$$$ from $$$(P(z) \bmod z^{2 n + 1})$$$ easily.
The total time complexity can be $$$\mathcal{O}(n \sqrt{n})$$$ if we calculate the denominator as a polynomial first, and then multiply it with each term included in the numerator one by one. By the way, if you are familiar with polynomial computation, you can solve the problem in time complexity $$$\mathcal{O}(n \log n)$$$.
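For the $$$P(z)$$$ part, here is a sketch of the pentagonal-number recurrence, computing $$$p(0), \ldots, p(m - 1)$$$ in $$$\mathcal{O}(m \sqrt{m})$$$ modulo the $$$10^9 + 7$$$ used above:

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long MOD = 1000000007;

// p(n) = sum_{k >= 1} (-1)^(k+1) * (p(n - k(3k-1)/2) + p(n - k(3k+1)/2))
vector<long long> partitions(int m) {
    vector<long long> p(m, 0);
    p[0] = 1;
    for (int n = 1; n < m; ++n) {
        long long s = 0;
        for (long long k = 1; ; ++k) {
            long long g1 = k * (3 * k - 1) / 2; // generalized
            long long g2 = k * (3 * k + 1) / 2; // pentagonal numbers
            if (g1 > n && g2 > n) break;
            long long sign = (k % 2 == 1) ? 1 : MOD - 1; // (-1)^(k+1)
            if (g1 <= n) s = (s + sign * p[n - g1]) % MOD;
            if (g2 <= n) s = (s + sign * p[n - g2]) % MOD;
        }
        p[n] = s;
    }
    return p;
}

int main() {
    vector<long long> p = partitions(11);
    printf("%lld\n", p[10]); // 42 partitions of 10
    return 0;
}
```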
Idea: chitanda
The answers for a fixed $$$n$$$ form an almost periodic pattern. It is not difficult to find out that the sequence is $$$\underbrace{1, 2, \ldots, n}_{n \text{ numbers}}$$$, $$$\underbrace{1, 2, \ldots, n - 1}_{(n - 1) \text{ numbers}}$$$, $$$\underbrace{1, 2, \ldots, n - 2, n}_{(n - 1) \text{ numbers}}$$$, $$$\underbrace{1, 2, \ldots, n - 1}_{(n - 1) \text{ numbers}}$$$, $$$\underbrace{1, 2, \ldots, n - 2, n}_{(n - 1) \text{ numbers}}$$$, $$$\ldots$$$
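Translating the pattern above into a closed form is direct; a sketch, where the per-query format (a pair $$$n$$$, $$$k$$$ asking for the $$$k$$$-th answer, with $$$n \geq 2$$$) is an assumption:

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    long long n, k; // assumed query: the k-th answer for a given n >= 2
    while (scanf("%lld%lld", &n, &k) == 2) {
        long long ans;
        if (k <= n) ans = k; // the first n answers are 1, 2, ..., n
        else {
            long long q = (k - n - 1) / (n - 1);     // later block index
            long long r = (k - n - 1) % (n - 1) + 1; // position inside it
            if (q % 2 == 0) ans = r;                 // block "1, ..., n-1"
            else ans = (r == n - 1) ? n : r;         // block "1, ..., n-2, n"
        }
        printf("%lld\n", ans);
    }
    return 0;
}
```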
Idea: skywalkert
According to $$$(l_i, r_i)$$$, we can build the Cartesian tree in linear time; if there is any contradiction, the answer is $$$0$$$. It is not difficult to see that the Cartesian tree, if it exists, is unique.
More specifically, we can find each node of the tree recursively. For example, the root must cover the interval $$$[1, n]$$$, and if a node $$$u$$$ covers the interval $$$[L, R]$$$, then its left child, if it exists, must cover $$$[L, u - 1]$$$, and its right child, if it exists, must cover $$$[u + 1, R]$$$. If no position $$$u$$$ covers the whole interval $$$[L, R]$$$, we have found a contradiction. If multiple positions do, a contradiction is also inevitable, but we can simply choose any of them as $$$u$$$, since further checking will detect it. Hence, after sorting the given intervals with radix sort, we can construct the tree in linear time.
The counting can be done recursively as well. Let $$$s(u)$$$ be the size of the subtree of node $$$u$$$, which is also equal to $$$(r_u - l_u + 1)$$$, and $$$f(u)$$$ be the number of permutations of $$$1$$$ to $$$s(u)$$$ that form the same Cartesian tree as the subtree of node $$$u$$$. If node $$$u$$$ has two children $$$v_1$$$, $$$v_2$$$, we have $$$f(u) = {s(v_1) + s(v_2) \choose s(v_1)} f(v_1) f(v_2)$$$. By the way, if we explore a bit more, we will find the answer is also equal to $$$\frac{n!}{\prod_{i = 1}^{n}{s(i)}}$$$.
Time complexity is $$$\mathcal{O}(n)$$$. The bottleneck is on reading the input data.
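Assuming the intervals have already been validated as described, the closed form gives a very short program; a sketch with an assumed modulus of $$$10^9 + 7$$$ (the validation step is deliberately omitted here):

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long MOD = 1000000007; // assumed prime modulus

long long power(long long a, long long e) {
    long long r = 1;
    for (a %= MOD; e; e >>= 1, a = a * a % MOD)
        if (e & 1) r = r * a % MOD;
    return r;
}

int main() {
    int n;
    scanf("%d", &n);
    long long ans = 1;
    for (int i = 1; i <= n; ++i) ans = ans * i % MOD; // n!
    long long denom = 1;
    for (int i = 0; i < n; ++i) {
        long long l, r;
        scanf("%lld%lld", &l, &r);
        denom = denom * ((r - l + 1) % MOD) % MOD; // s(i) = r_i - l_i + 1
    }
    ans = ans * power(denom, MOD - 2) % MOD; // divide by prod s(i)
    printf("%lld\n", ans);
    return 0;
}
```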
Hope you can enjoy it. Any comments, as well as downvotes, would be appreciated.