WeakestTopology's blog

By WeakestTopology, 3 months ago, In English

Hey all,

First, I'll give a brief overview of what finite calculus is about and then show you how I used it to solve 1951G - Clacking Balls. This may not be the most elegant way to solve that problem, but it requires few observations and I find it interesting enough to share. I am also taking part in the Month of Blog Posts challenge.

You can find better introductions by googling terms like "finite calculus" or "discrete calculus", although I still have not found an in-depth, thorough introduction to these ideas, only short expositions scattered around the web. But these ideas are pervasive throughout competitive programming (think of adjacent differences, prefix sums, slopes and the like).

Disclaimer: this blog requires some mathematics background to fully understand it.

A short intro to finite calculus

If you know calculus, you have seen derivatives, integrals and the fundamental theorem of calculus. For a function $$$f$$$ defined on the real numbers, we define its derivative (if it exists) to be

$$$\displaystyle \frac{d}{dx}f(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. $$$

In the discrete case, with $$$f$$$ defined on the integers instead, we define its (discrete) derivative to be

$$$ \Delta f(x) = f(x + 1) - f(x). $$$

This is exactly the difference quotient from the limit above, with $$$h$$$ set to $$$1$$$. Derivatives are akin to differences and integrals are akin to sums. We have the "fundamental theorem of finite calculus":

$$$ f(b) - f(a) = \sum_{k=a}^{b - 1} \Delta f(k). $$$

It's just a telescoping sum, right? Looks simple and not so useful. We also have a converse statement: if $$$\Delta g = f$$$, then

$$$ g(n) = \sum_{k=1}^{n-1} f(k) + C $$$

for some constant $$$C$$$ (in other words, if $$$\Delta g = \Delta h$$$, then $$$g$$$ and $$$h$$$ differ by a constant). In fact, $$$C = g(1)$$$. As with standard derivatives, the discrete derivative $$$\Delta$$$ decreases the degree of a polynomial by exactly one: $$$\Delta n^d = (n + 1)^d - n^d$$$, which is a polynomial in $$$n$$$ of degree $$$d - 1$$$.
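
To make this concrete, here is a tiny Python sketch checking the telescoping identity (the name delta is mine, just for illustration):

def delta(f):
    # The discrete derivative: (delta f)(x) = f(x + 1) - f(x).
    return lambda x: f(x + 1) - f(x)

f = lambda x: x**3 - 2 * x  # any function on the integers works here
a, b = 3, 10

# Fundamental theorem of finite calculus: the sum of adjacent
# differences telescopes to f(b) - f(a).
assert f(b) - f(a) == sum(delta(f)(k) for k in range(a, b))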

I guess the first thing that can give you a thrill about finite calculus is that you can use it to show that

$$$\displaystyle S(n) = \sum_{k=1}^n k^d $$$

is a polynomial of degree $$$d + 1$$$. The proof of this follows from the previous facts and the next theorem:

Let $$$\mathcal{P}_d$$$ denote the space of polynomials of degree at most $$$d$$$. Then $$$\Delta$$$, as a linear transformation from $$$\mathcal{P}_{d + 1}$$$ to $$$\mathcal{P}_d$$$, is surjective.

Proof: If $$$\Delta f = 0$$$, then $$$f$$$ is constant, so $$$\operatorname{dim} \ker \Delta = 1$$$. By the rank-nullity theorem, the image of $$$\Delta$$$ has dimension $$$\operatorname{dim} \mathcal{P}_{d+1} - 1 = (d + 2) - 1 = d + 1 = \operatorname{dim} \mathcal{P}_d$$$, so $$$\Delta$$$ is surjective.

Now $$$\Delta S(n) = (n + 1)^d$$$, which is a polynomial. Using the theorem, we know there exists a polynomial $$$p$$$ such that $$$\Delta p(n) = (n + 1)^d = \Delta S(n)$$$; hence $$$S$$$ is also a polynomial, since it differs from $$$p$$$ by a constant.

One way of finding the coefficients of $$$S(n)$$$ is to manually compute $$$S(0), S(1), \dots, S(d + 1)$$$ (a polynomial of degree $$$d + 1$$$ has $$$d + 2$$$ coefficients). This yields a linear system whose matrix is a Vandermonde matrix, which is known to be invertible; a sketch follows below. There are other ways of computing the coefficients, recursive in $$$d$$$, which you can find through a web search. Computer algebra systems are also able to find these formulas, and we are going to need them later on.
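
For concreteness, here is a minimal Python sketch of the Vandermonde approach over exact rationals; the helper names solve and power_sum_coeffs are mine, not a standard API:

from fractions import Fraction

def solve(mat):
    # Gauss-Jordan elimination over Fractions; `mat` is square with an
    # extra augmented column holding the right-hand side.
    n = len(mat)
    for col in range(n):
        piv = next(r for r in range(col, n) if mat[r][col] != 0)
        mat[col], mat[piv] = mat[piv], mat[col]
        for r in range(n):
            if r != col and mat[r][col] != 0:
                t = mat[r][col] / mat[col][col]
                for c in range(col, n + 1):
                    mat[r][c] -= t * mat[col][c]
    return [mat[r][n] / mat[r][r] for r in range(n)]

def power_sum_coeffs(d):
    # Coefficients c_0, ..., c_{d+1} of S(n) = sum_{k=1}^n k^d, found by
    # sampling S at n = 0, ..., d + 1 and solving the Vandermonde system.
    rows = []
    for npt in range(d + 2):
        s = Fraction(sum(k ** d for k in range(1, npt + 1)))
        rows.append([Fraction(npt) ** j for j in range(d + 2)] + [s])
    return solve(rows)

print(power_sum_coeffs(2))  # [0, 1/6, 1/2, 1/3], i.e. n/6 + n^2/2 + n^3/3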

We can make several analogies with facts that come from standard calculus:

  • If $$$\Delta f = a$$$ is constant, then $$$f$$$ is linear, i.e. $$$f(n) = an + b$$$,

  • If $$$\Delta f > 0$$$, then $$$f$$$ is strictly increasing,

  • If $$$\Delta^2 f \ge 0$$$, then $$$f$$$ is convex;

and so on.

Clacking balls

Now, please read the problem statement of 1951G - Clacking Balls, as it is already concise enough. Our idea goes as follows:

Consider the gaps between consecutive balls: the process continues until one of the gaps grows to size $$$M$$$. If a gap shrinks to size $$$0$$$, it dies and is no longer considered. Let $$$X$$$ denote the random variable counting the number of seconds until the process finishes and $$$A_k$$$ the event that a gap of size $$$k$$$ eventually grows to $$$M$$$. With this in mind, we want to compute:

$$$\displaystyle E(X) = \sum_{k_i} P(A_{k_i})E(X|A_{k_i}), $$$

where the summation is over our set of gap sizes $$$\{k_1, k_2, \dots, k_n\}$$$. As a shortcut, let's write $$$p_k = P(A_k)$$$ and $$$W_k = P(A_k)E(X|A_k)$$$.

The probabilities

This is the easy part. We know that $$$p_m = 1$$$ and $$$p_0 = 0$$$. Intuitively, we also know that $$$p_k$$$ must increase with $$$k$$$.

For a given gap, each second one of three things happens:

  • Alice chooses the ball on its right end and the gap grows by $$$1$$$,

  • Alice chooses the ball on its left end and the gap shrinks by $$$1$$$,

  • Alice chooses any other ball, and the gap stays the same.

With equations, this means that:

$$$\displaystyle p_k = \frac{1}{n}p_{k-1} + \frac{1}{n}p_{k + 1} + \frac{n-2}{n}p_k, $$$

for $$$0 < k < m$$$. Multiplying by $$$n$$$ and rearranging it, we can rewrite this as $$$\Delta p_{k-1} = \Delta p_k$$$. That is, the derivative is constant and hence $$$p_k$$$ is linear in $$$k$$$:

$$$\displaystyle p_k = \frac{k}{m}. $$$

You can use the fundamental theorem of finite calculus to prove this in more detail.
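
If you'd like empirical evidence before trusting the algebra: conditioned on the gap changing at all, it performs a symmetric random walk, so a quick Monte Carlo sketch (names and parameters are mine) can estimate $$$p_k$$$:

import random

def estimate_p(k, m, trials=100_000):
    # Estimate p_k: a gap of size k does a symmetric random walk (the
    # "gap unchanged" case only wastes a second and doesn't change which
    # boundary is hit, so we skip it) and we count how often it reaches
    # m before 0.
    hits = 0
    for _ in range(trials):
        g = k
        while 0 < g < m:
            g += random.choice((-1, 1))
        hits += (g == m)
    return hits / trials

print(estimate_p(3, 10))  # should be close to p_3 = 3/10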

The expectations

With the conditional expectations, the reasoning is similar, but we must be more careful. We know that $$$E(X|A_m) = 0$$$ but note that $$$E(X|A_0)$$$ is undefined (once a gap shrinks to $$$0$$$ it will never grow again). Hence, the other boundary point is $$$E(X|A_1)$$$, which we know nothing about at first.

The reason for the definition of $$$W_k$$$ will become clear now. Assume $$$1 < k < m$$$ and let's use the definition of conditional expectation for discrete random variables:

$$$\displaystyle \begin{align*} E(X|A_k) &= \sum_{x \ge 1} \frac{x P(\{X = x\} \cap A_k)}{P(A_k)}\\ &= \sum_{x \ge 1} \frac{P(\{X = x\} \cap A_k)}{P(A_k)} + \sum_{x \ge 1} \frac{(x - 1) P(\{X = x\} \cap A_k)}{P(A_k)}\\ &= 1 + \sum_{x \ge 1} \frac{(x - 1) P(\{X = x\} \cap A_k)}{P(A_k)} \end{align*} $$$

As before, we can write

$$$\displaystyle P(\{X = x\} \cap A_k) = \frac{1}{n} P(\{X = x - 1\} \cap A_{k + 1}) + \frac{1}{n} P(\{X = x - 1\} \cap A_{k - 1}) + \frac{n - 2}{n} P(\{X = x - 1\} \cap A_k). $$$

Substituting this in the previous equation:

$$$ \begin{align*} E(X|A_k) &= 1 + \frac{1}{n}\frac{P(A_{k+1})}{P(A_k)} E(X|A_{k+1}) + \frac{1}{n}\frac{P(A_{k-1})}{P(A_k)} E(X|A_{k-1}) + \frac{n-2}{n} E(X|A_k). \end{align*} $$$

Multiplying by $$$nP(A_k)$$$ and rearranging it, we get:

$$$\displaystyle \Delta^2 W_{k-1} = \Delta W_k - \Delta W_{k - 1} = -nP(A_k) = -\frac{nk}{m}. $$$

For $$$k = 1$$$, we have

$$$\displaystyle P(\{X = x\} \cap A_1) = \frac{1}{n} P(\{X = x - 1\} \cap A_2) + \frac{n - 2}{n} P(\{X = x - 1\} \cap A_1) $$$

and, similarly,

$$$ E(X|A_1) = 1 + \frac{n-2}{n} E(X|A_1) + \frac{1}{n} \frac{P(A_2)}{P(A_1)} E(X|A_2), $$$

yielding

$$$\displaystyle \Delta W_1 = W_1 - \frac{n}{m}. $$$

We got ourselves a set of equations:

$$$ \begin{align*} \Delta^2 W_k &= -\frac{n(k + 1)}{m},\\ \Delta W_1 &= W_1 - \frac{n}{m},\\ W_m &= 0. \end{align*} $$$

Intuitively (and concretely, see this comment), it looks like a second-order differential equation with two initial-value conditions, which should determine its solution uniquely. Indeed, let's see.

With the fundamental theorem:

$$$ \begin{align*} \Delta W_k &= \Delta W_1 + \sum_{\ell = 1}^{k - 1} \Delta^2 W_\ell\\ &= W_1 - \frac{n}{m} - \sum_{\ell = 1}^{k - 1} \frac{n(\ell + 1)}{m}\\ &= W_1 - \sum_{\ell = 1}^{k} \frac{n\ell}{m}\\ &= W_1 - \frac{n}{m}\frac{k(k + 1)}{2}. \end{align*} $$$

We just reduced the order of our differential equation by 1! Now for $$$W_k$$$ it's easier to start from the other side, as we know $$$W_m$$$ but not $$$W_1$$$:

$$$ \begin{align*} W_{m - k} &= W_m - \sum_{\ell = 1}^k \Delta W_{m - \ell}\\ &= -kW_1 + \frac{n}{m} \sum_{\ell = 1}^k \frac{(m - \ell)(m - \ell + 1)}{2}\\ &= -kW_1 + \frac{n}{6m} k(k^2 - 3 k m + 3 m^2 - 1), \end{align*} $$$

where in the last step we used a computer algebra system to evaluate the sum (it involves just a sum of squares and other simpler terms, so there is no point in writing it all out).
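
For reproducibility, here is what that computer-algebra step could look like in sympy (my choice of CAS; any other would do):

from sympy import symbols, summation, factor

k, m, l = symbols('k m l', integer=True, positive=True)

# Evaluate sum_{l=1}^{k} (m - l)(m - l + 1) / 2 in closed form.
s = summation((m - l) * (m - l + 1) / 2, (l, 1, k))
print(factor(s))  # expect k*(k**2 - 3*k*m + 3*m**2 - 1)/6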

Setting $$$k = m - 1$$$ in the above, we get the value of $$$W_1$$$ and with it we can compute the rest.
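
Before implementing it, a quick sanity check with exact rationals (arbitrary small parameters, my own snippet) confirms that the closed form satisfies all three equations:

from fractions import Fraction

# Verify W_{m-k} = -k*W_1 + R(k) against the three equations above.
n, m = Fraction(5), Fraction(7)

def R(k):
    return n / m * k * (k * k - 3 * k * m + 3 * m * m - 1) / 6

W1 = R(m - 1) / m
W = {j: -(m - j) * W1 + R(m - j) for j in range(1, int(m) + 1)}

assert W[int(m)] == 0                  # W_m = 0
assert W[2] - W[1] == W[1] - n / m     # Delta W_1 = W_1 - n/m
for k in range(1, int(m) - 1):         # Delta^2 W_k = -n(k + 1)/m
    assert W[k + 2] - 2 * W[k + 1] + W[k] == -n * (k + 1) / m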

Implementation in Python:

from fractions import Fraction

T = int(input())
for _ in range(T):
    N, M = map(int, input().split())
    A = sorted(map(int, input().split()))
    n = Fraction(N)
    m = Fraction(M)

    # R(k) = (n / m) * k * (k^2 - 3km + 3m^2 - 1) / 6, the polynomial
    # part of W_{m - k} derived above.
    def R(K):
        k = Fraction(K)
        return n / m * (k * (k * k - 3 * k * m + 3 * m * m - 1)) / 6

    # Setting k = m - 1 above gives m * W_1 = R(m - 1).
    W1 = R(m - 1) / m

    ans = Fraction(0)
    for i in range(N):
        # Gap between consecutive balls on the circle.
        l = (A[(i + 1) % N] - A[i] + M) % M
        # W_l = W_{m - k} with k = m - l.
        k = M - l
        ans += -k * W1 + R(k)
    print(ans)

Submitted C++ implementation: 281371585.

Concluding remarks

I hope you enjoyed reading it!!

Thanks to tfg for recommending this problem to me while we were at the WF in Egypt, and special thanks to alihew841 for spotting typos in the article.


»
3 months ago

Hi, I'm reviving your blog just as -is-this-fft- asked

»
3 months ago

I'm actually surprised I was able to comprehend the linear algebra and probability used :). Very clear explanation

»
3 months ago

"Computer algebra systems are also able to find these formulas" In fact, there is a sense in which this was the first thing computer algebra systems ever did! (source: wikipedia.org/wiki/Note_G)

»
3 months ago

Do you consider $$$n$$$ to be a constant in the part "The probabilities"? The way you defined it, $$$p_k$$$ is the probability of a process in which $$$n$$$ can decrease, as we aren't trying to choose a gap that vanished.

»
3 months ago

this is actually pretty elegant

»
3 months ago

Great article! It's fun to read, doesn't clutter content with unnecessary details, and shows practical applications. I'm looking forward to your future posts if you decide to write something else!

»
3 months ago

Great blog, thanks!

Correct me if I'm wrong, but shouldn't this equation be

$$$\Delta^2 W_k = -\frac{n \cdot (k+1)}{m}$$$

instead of

»
3 months ago, Rev. 2

Alright, there is more to it if you are familiar with generating functions and polynomial Taylor shifts (see adamant's blog). I just had this idea today, and it makes the earlier intuition about differential equations concrete.

Let's take a look again at our set of equations:

$$$ \begin{align*} \Delta^2 W_k &= -\frac{n(k + 1)}{m},\\ \Delta W_1 &= W_1 - \frac{n}{m},\\ W_m &= 0. \end{align*} $$$

Note that we only defined $$$\Delta^2 W_k$$$ for integers $$$k \in [1, m - 2]$$$. Let us forget about this restriction and think of $$$\Delta^2 W_k$$$, $$$\Delta W_k$$$ and $$$W_k$$$ simply as usual polynomials in $$$k$$$ (allow their "continuation").

We know that, for a polynomial $$$P$$$, writing $$$D$$$ for the differentiation operator $$$\frac{d}{dx}$$$,

$$$ P(x + 1) = (\exp(D)P)(x). $$$

Thus

$$$ \Delta P(x) = (\exp(D) - I)P(x), $$$

where $$$I$$$ is the identity operator. Of course, we can iterate this and we get $$$\Delta^2 P = (\exp(D) - I)^2 P$$$. Multiplying by $$$\frac{D}{\exp(D) - I}$$$ twice on both sides yields

$$$\displaystyle \frac{D^2}{(\exp(D) - I)^2} \Delta^2 P = D^2 P. $$$

Letting $$$P(x) := W_x$$$ and using the fact that

$$$\displaystyle \frac{D^2}{(\exp(D) - I)^2} = I - D + O(D^2) $$$

we get

$$$\displaystyle P"(x) = (I - D + O(D^2))\left(-\frac{n(x + 1)}{m}\right) = -\frac{n}{m}x. $$$

We got ourselves a differential equation! Integrating twice gives us two constants:

$$$\displaystyle P(x) = -\frac{n}{6m}x^3 + C_1x + C_2. $$$

The initial value conditions yield $$$C_1 = \frac{nm}{6}$$$ and $$$C_2 = 0$$$.
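
If you want to verify the series expansion of $$$\frac{D^2}{(\exp(D) - I)^2}$$$ used above, a short sympy sketch (again, my choice of tool) does it, treating $$$D$$$ as a formal variable:

from sympy import symbols, exp, series

D = symbols('D')

# Expand D^2 / (exp(D) - 1)^2 around D = 0 (the operator I becomes the
# scalar 1); only the terms 1 - D matter, since D^2 and higher kill the
# linear right-hand side.
print(series(D**2 / (exp(D) - 1)**2, D, 0, 3))  # 1 - D + 5*D**2/12 + O(D**3)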

Solution: 281804074.