Welcome to Part 2 of my tutorial on generating functions. The first part focused on introducing generating functions to those without any background in generating functions. In this post, I will demonstrate a few applications of generating functions in CP problems. Let us start with some relatively straightforward examples.
Note: Unless stated otherwise, all computations are done modulo a convenient prime (usually $$$998244353$$$). Also, $$$[n]$$$ denotes the set $$$\{1,2,...,n\}$$$.
Blatant Applications in Counting Problems
Problem. AGC 005 Problem F You have a tree $$$T$$$ with $$$n$$$ vertices. For a subset $$$S$$$ of vertices, let $$$f(S)$$$ denote the minimum number of vertices in a subtree of $$$T$$$ which contains all vertices in $$$S$$$. For all $$$1 \le k \le n$$$, find the sum of $$$f(S)$$$ over all subsets $$$S$$$ with $$$|S| = k$$$.
Constraints: $$$n \le 2 \cdot 10^{5}$$$.
Convolutions of this type appear in countless FFT problems and usually the hard part is to reduce your sums into a form that can be seen as the convolution of two polynomials. However, generating functions are much more powerful than this, as we shall see the next examples.
Problem. The Child and Binary Tree You have a set of positive integers $$$C = \{c_1, c_2, ..., c_{n}\}$$$. A vertex-weighted rooted binary tree is called good if all the vertex weights are in $$$C$$$. The weight of such a tree is the sum of weights of all vertices. Count the number of distinct good vertex-weighted rooted binary trees with weight $$$s$$$ for all $$$1 \le s \le m$$$. Note that the left child is distinguished from the right child in this problem (see the samples for more info).
Constraints: $$$n, m \le 10^{5}$$$
Problem. Descents of Permutations (from Enumerative Combinatorics 1) Call a permutation $$$p_1, p_2, ..., p_n$$$ of $$$[n]$$$ $$$k$$$-good if $$$p_{i}>p_{i+1}$$$ iff $$$k \mid i$$$. Find the number of $$$k$$$-good permutations of length $$$n$$$.
Constraints: $$$n, k \le 5 \cdot 10^{5}, n \ge 3, k \ge 2$$$
Expected Value of Stopping Time
I think this is also a beautiful application of generating functions. The recent problem Slime and Biscuits can be solved using the trick I will demonstrate here (there is a short editorial using this method here). Let's look at a different example.
Problem. Switches There are $$$n$$$ switches, each of which can be on or off initially. Every second, there is a probability of $$$\frac{p_{i}}{S}$$$ that you will flip the state of the $$$i$$$-th switch. The game ends when all switches are off. What is the expected number of seconds the game will last?
Constraints: $$$n \le 100$$$, $$$\sum p_{i} = S$$$, $$$p_{i}>0$$$, $$$S \le 50000$$$
In general, the trick of introducing $$$A$$$ and $$$B$$$ can be used to solve other problems that asks for the first stopping time of some system if you have multiple possible ending states and the time taken to reach from one ending state to another is equal, and the probability to reach an ending state in a fixed amount of moves is easy to compute.
Roots of Unity Filter
Next, we introduce a trick that is more rarely used in competitive programming but nevertheless interesting to learn. The motivation is the following classic problem.
Problem. Roots of Unity Filter Find the sum $$$\displaystyle\sum_{i \equiv r \pmod{m}}\binom{n}{i}$$$ modulo an arbitrary $$$MOD$$$.
Constraints: $$$1 \le n \le 10^{18}, 2 \le m \le 2000, 0 \le r \le n - 1, 10^{8} \le MOD \le 10^{9}$$$
Next, we look at a nice harder example.
Problem. Rhyme Compute the sum of $$$\frac{n!}{x_{1}!x_{2}!...x_{k}!}$$$ over all tuples of positive integers $$$(x_{1},x_{2},...,x_{k})$$$ such that $$$d|x_{i}$$$ and $$$\sum x_{i} = n$$$, modulo $$$19491001$$$ (a prime).
Constraints: $$$n \le 10^{9}, k \le 2000, d \in \{4, 6\}$$$.
Generating Functions that you can't compute "naively"
In some problems, the constraints might be too large to even compute the generating functions you found with FFT or algorithms involving polynomial operations. In this case, you usually need to analyze some special properties of the generating function you are dealing with (and thus it is helpful to recognize some common generating functions).
We start with a relatively easy example.
Problem. Perfect Permutations Find the number of permutations of length $$$n$$$ with exactly $$$k$$$ inversions, modulo $$$10^{9}+7$$$.
Constraints: $$$1 \le n \le 10^{9}$$$, $$$0 \le k \le 10^{5}$$$, $$$n \ge k$$$. There are $$$100$$$ testcases per input file.
Next, we look at a problem that has a natural statement in generating functions, but it turns out that the computation in generating functions is quite tricky. The official editorial has a nice and simpler solution using a magical observation, but to demonstrate the power of generating functions I will show an alternative method (which seems more straightforward and generalizable).
Problem. Sum of Fibonacci Sequence Let $$$d_{n,k}$$$ be defined by the recurrence $$$d_{1,1}=d_{1,2}=1$$$, $$$d_{1,k}=d_{1,k-1}+d_{1,k-2}$$$ for $$$k \ge 3$$$, and $$$d_{n,k}=\displaystyle\sum_{i=1}^{k}d_{n-1,i}$$$ for $$$n \ge 2$$$, $$$k \ge 1$$$.
Compute $$$d_{n,m}$$$ modulo $$$998244353$$$.
Constraints: $$$1 \le n \le 200000$$$, $$$1 \le m \le 10^{18}$$$
To end this section, we conclude with a problem that heavily relies on linear recurrences. Actually, it might be a stretch to call this a generating function problem but I just want to demonstrate the trick of using generating functions to compute linear recurrences which is basically the same as the one shown here.
Problem. Sum Modulo You have a number $$$x$$$ which is initially $$$K$$$. Every second, for $$$1 \le i \le n$$$, there is a probability $$$p_{i}$$$ that you will replace $$$x$$$ with $$$(x - i) \pmod{M}$$$. Find the expected number of moves before the counter goes to $$$0$$$. $$$p_{i}$$$ are given as $$$\frac{A_{i}}{\sum A_{i}}$$$ for some positive integers $$$A_{1}, A_{2}, ..., A_{n}$$$ and your output should be modulo $$$998244353$$$ (and it is guaranteed that you don't have to worry about division by $$$0$$$).
Constraints: $$$1 \le n \le \min(500,M-1)$$$, $$$2 \le M \le 10^{18}$$$, $$$1 \le A_{i} \le 100$$$
Lagrange Inversion Formula
Finally, inspired by this Div. 1 F problem, I looked up on some applications on Lagrange Inversion Formula and found a few examples on it. I would like to end this article by demonstrating a few applications of it.
Some examples here are from Enumerative Combinatorics Volume 2 and some are from jcvb's Chinese paper on generating functions.
The idea of the Lagrange Inversion Formula is that sometimes we want to find the compositional inverse of a function but it is difficult to find. However, the coefficients of this inverse function might have a simpler formula, which we can obtain from Lagrange Inversion Formula.
There are many variants of stating the Lagrange Inversion Formula, so I will show what I think is the most helpful version of it (also given in this comment).
Theorem. Let $$$F(x), G(x)$$$ be formal power series which are compositional inverses (i.e. $$$F(G(x)) = x$$$). Suppose $$$F(0)=G(0)=0$$$, $$$[x^{1}]F(x) \neq 0$$$, $$$[x^{1}]G(x) \neq 0$$$, then
$$$[x^{n}]G(x) = \frac{1}{n}[x^{-1}]\frac{1}{F(x)^{n}}$$$
Also, for any power (or Laurent) series $$$H(x)$$$, we have
$$$[x^{n}]H(G(x)) = \frac{1}{n}[x^{-1}]H'(x)\frac{1}{F(x)^{n}}$$$
Note: Laurent Series can be intuitively seen as the generalization of power series where the powers can go negative.
Intuitively, if you "know" how to compute $$$F(x)$$$, then you can also get the coefficients of the compositional inverse of $$$F(x)$$$. Let's go through a few examples.
Tree Enumeration
Problem. Count the number of labelled trees on $$$n$$$ vertices (number of trees where vertices are labelled).
Number of $$$2$$$-edge connected graphs
Problem. Find the number of labelled $$$2$$$-edge connected graphs on $$$n$$$ vertices. A graph is $$$2$$$-edge connected graphs if it has no bridges, i.e. removing any edge does not disconnect the graph.
Constraints: $$$n \le 3 \cdot 10^{5}$$$
Coefficient of fixed $$$x^{k}$$$ in $$$f(x)^{i}$$$
This is a more of a trick than a specific problem. Let $$$f(x)$$$ be a power series with a compositional inverse ($$$[x^{0}]f(x) = 0$$$, $$$[x^{1}]f(x) \neq 0$$$). We can find the coefficient of $$$x^{k}$$$ (assume $$$k \ge 1$$$) in $$$f(x)^{i}$$$ for all $$$1 \le i \le n$$$ in $$$O(n\log n)$$$ time (assume $$$k = O(n)$$$).
Let $$$ans(i)$$$ denote the answer for fixed $$$i$$$. Instead of looking at $$$ans(i)$$$ as a sequence, let's introduce a new variable $$$u$$$ and consider the OGF
$$$A(u) = ans(0) + ans(1)u + ans(2)u^{2} + ... = \displaystyle\sum_{n \ge 0}[x^{k}]f(x)^{n}u^{n} = [x^{k}]\displaystyle\sum_{n \ge 0}(f(x)u)^{n} = [x^{k}]\frac{1}{1 - uf(x)}$$$.
Since $$$f(x)$$$ has a compositional inverse (say $$$g(f(x)) = x$$$), by Lagrange Inversion formula (with $$$H(x) = \frac{1}{1 - ux}$$$), we obtain
$$$[x^{k}]\frac{1}{1-uf(x)} = \frac{1}{k}[x^{-1}]\left(\frac{1}{1-ux}\right)'\left(\frac{1}{g(x)^{k}}\right) = \frac{1}{k}[x^{k-1}]\left(\frac{1}{1-ux}\right)'\left(\frac{1}{\left(\frac{g(x)}{x}\right)^{k}}\right)$$$.
Note that by Quotient Rule, $$$\left(\frac{1}{1-ux}\right)' = \frac{u}{(1-ux)^{2}}$$$.
Our goal is to rewrite our sum in a clear manner so that we can "read off" the coefficients of $$$u^{i}$$$. We try to change the problem of finding the coefficients of $$$u^{i}$$$ into a problem about purely finding the coefficients of $$$x^{j}$$$ of some function.
The idea is to expand the series
$$$\frac{u}{(1-ux)^{2}} = u(1-ux)^{-2} = u\displaystyle\sum_{i \ge 0}\binom{i+1}{1}(ux)^{i}$$$ (recall how to expand $$$(1-x)^{-2}$$$)
$$$= u^{i+1}\displaystyle\sum_{i \ge 0}(n+1)x^{i}$$$, thus
$$$[x^{k}]\frac{1}{1-uf(x)} = \frac{1}{k}[x^{k-1}]\displaystyle\sum_{i \ge 0}(i+1)x^{i}u^{i+1} \left(\frac{1}{\left(\frac{g(x)}{x}\right)^{k}}\right)$$$
Let's look at $$$ans(i+1)$$$, the coefficient of $$$u^{i+1}$$$. We have
$$$ans(i+1) = [u^{i+1}]\frac{1}{k}[x^{k-1}]\displaystyle\sum_{i \ge 0}(i+1)x^{i}u^{i+1} \left(\frac{1}{\left(\frac{g(x)}{x}\right)^{k}}\right) = \frac{i+1}{k}[x^{k-i-1}]\frac{1}{\left(\frac{g(x)}{x}\right)^{k}}$$$.
Now, our problem reduces to computing the coefficients of one fixed function $$$P(x) = \frac{1}{\left(\frac{g(x)}{x}\right)^{k}}$$$, which we can compute the first $$$n$$$ terms of using the usual polynomial operations! Thus, we can compute $$$ans(i)$$$ for all $$$i$$$ in $$$O(n\log n)$$$ time!
If $$$f(x)$$$ does not have a compositional inverse, it is possible to "adjust" our function $$$f$$$ (create a new function related to $$$f$$$) so that it has a compositional inverse. I leave this as an exercise.
Final Boss: Div 1 F — Slime and Sequences
As the grand finale of this 2-part article, I would like to discuss the recent very difficult Div. 1 F problem which was the inspiration of the entire blog in the first place.
Problem. Slime and Sequences A sequence of positive integers $$$s$$$ is called good if for each $$$k>1$$$ that is present in $$$s$$$, the first occurrence of $$$k-1$$$ (which must exist) must occur before the last occurrence of $$$k$$$. Count the number of times each integer $$$1 \le i \le n$$$ appears over all good sequences of length $$$n$$$.
Constraints: $$$1 \le n \le 100000$$$
If you have any questions or spotted any errors, please tell me in the comments. There are probably more cool applications of generating functions in CP that I am not aware of, so feel free to share them in the comments too. :)
I am really curios about how many lines is the blog, it took my PC about a minute to load it.
Also thanks for the blog, i really needed some mathforces.
Thanks to spacewalker's feedback, I have added spoiler tags for solutions to make the post loadable.
The entire article is $$$24$$$ pages of LaTeX on overleaf (which decided that I overload their server too much and stopped autocompiling my LaTeX code near the end).
Thanks for adding the spoilers.
Auto comment: topic has been updated by zscoder (previous revision, new revision, compare).
For the pentagonal number theorem, I think there is a slight error in the formula. It should be: $$$1 + \displaystyle\sum_{i \ge 1}(-1)^{i}\left(x^{\frac{i(3i+1)}{2}} + x^{\frac{i(3i-1)}{2}}\right)$$$
Thanks, fixed it.
Great guide! I wrote a problem for the May 2020 CodeChef Long Challenge which can be solved with generating functions, Chef and Rainbow Road. The problem boils down to a lot of different standard operations on polynomials which I cover extensively in the editorial, with the hope that it could also be used as a resource for generating function problems in the future.
Reciprocal of a Power Series, Fast Polynomial Long Division, Multi-Point Evaluation, Fast Partial Fractions of Linear Factors
Yeah somehow pointed this problem out to me when I was planning to write this tutorial. Your editorial seems to cover how to perform standard polynomial operations fast so I think it is a good complement for this blog, so for those who wants to learn more about these operations I recommend your editorial.
Would you mind posting your editorial for that here, since codechef solutions seem to now be paywalled? Thank you!
Codechef Editorial
Really great blog,I need a little help and guidance about, as I am interested in mathematical reading, I want to know how do u guys prefer to study maths.like solving problems, reading books, blogs. Also, I noticed all the great coders know great resources, if all the visitors here,could you link your favourite websites or blogs(any language) for maths that they liked or use as reading. Thank you in advance. Please comment it would be really helpful for me and others like me.
Some sites I know are cut the knot,AOPS,maths.stackexchange,some books like 104 combinatorial problems, generating functionology etc.
Looking forward to more top coders helping the community by giving back what they learned.
Thanks, please post more math content like these!
I'm wondering if you could put the [cut] tag on your two gen function articles near the beginning. I sometimes like to favorite blogs I think would be useful to read later, such as I have your gen functions ones, but it makes it hard to find blog i'm looking for if there are many long blogs that don't have the [cut] tag for the short preview version. I also wish other people who are writing great but longs blogs to do this to more often. Besides that your blogs looks great!
You can do it using a divide-and-conquer approach with FFT in O(klog2k)
Can you tell the exact approach as to how to do this?
basically that means you recursively divide the thing you need to multiply in the middle until you have one or two polynomials in each group, FFT them and go into the previous layer, do the same. The process is pretty much like a segment tree, where you yield the result of a node from its two sons. During calculation, only keeping those with index smaller than k should be enough.
If I misunderstood anything please point it out. Thanks.
yeah, but during recursion, in the leafs, you have (1-x^i), so for fft, you need to create a polynomial of length atleast i. So in total, it'll be O(K^2) atleast.
Right, I was actually thinking about the segment tree approach, but then I mixed up $$$(1-x^i)$$$ as $$$(x-i)$$$ and thought that it would work the same but (from what I see now) I can't find a way to handle higher degrees oops.
Thank you for this amazing post!
Another application of generating functions that I encountered multiple times is generating a random string by appending random characters at the end. The most basic problem is 1677 from Timus. In this comment I explained how to solve this problem with generating functions as well as a harder problem from Prime New Year Contest with multiple strings.
Also, problem K in ICPC Finals 2017 was on this topic, but the proof is impossible to come up with during contest, so everyone just guessed the correct solution without proving.
A good problem on this topic: 623E - Transforming Sequence
thanks for your blog!
In the solution of the "The Child and Binary Tree" there is statement:"You can verify that the constant term of the denominator is nonzero". Can someone explain why it's true?
In the last question could you please elaborate how would one go about finding C(x)=f(A^-1(x)) after finding A^-1(x)
I found another, possibly simpler way to achieve the same result in 438E - The Child and Binary Tree.
Let $$$C(x) = \sum\limits_{c \in C} x^C$$$ and $$$T(x) = \sum\limits_{n=0}^\infty t_n x^n$$$, where $$$t_n$$$ is the number of binary rooted trees with $$$n$$$ vertices. Notice that $$$t_n = C_n$$$ — the $$$n$$$-th Catalan number, so we instantly get
Now notice that if we fix a binary rooted tree with $$$n$$$ vertices, then the series $$$C(x)^n$$$ describes the number of ways to select weights for those $$$n$$$ vertices for each possible $$$s$$$ — the sum of weights (that number of ways is exactly the coefficient of $$$x^s$$$ in $$$C(x)^n$$$). Thus, the OGF $$$F(x)$$$ of answer sequence is equal to:
Then the solution continues in the same way.
The fact that $$$t_n$$$ is the $$$n$$$-th Catalan number can be proved by noticing the same recurrence formula $$$t_n = \sum_{l+r=n-1} t_lt_r$$$
I remember you were purple just like me. Now you're GM, my friend. What went wrong, I'm asking.
omg zscoder orz