Part1: Introduction
Atcoder Beginner Contest 279 is good for learning generating functions (GFs). The English version tutorial of 279Ex has already been posted on the MathStackExchange by a kind person. Unfortunately, the English tutorial on 279G is not available. Now I would like to make a second-hand tutorial on 279G based on PCTprobability's idea. I spend a huge amount of time understanding this idea. For contestants at about my level, it is quite difficult to understand
the idea even if it is written in English, let alone it is written in Japanese only (My Japanese is N5 level). I will make the following contributions.
(1)Write the tutorial in English. The original tutorial is in Japanese only.
(2)Fill in the details. I believe you can understand my words.
(3)Offer an accepted implementation.
I have to state that, using generating function is definitely not the best way to solve this problem. It could be solved much simpler by using dynamic programming with monotone deque optimization. However, this problem is also a good chance to learn generating functions (GFs). Generating functions encode the information of sequences in a continuous way. If you do not know anything about GFs, I suggest you read:
(1) How to use GFs to solve recurrence? Link.
(2) How to prove the Vandemonde convolution identity using GFs? Link.
(3) And most relevant to this problem, how to solve partitions using GFs? Link.
And the most important notation $$$[x^k]f(x)$$$ denotes the coefficient of $$$x^k$$$ in function $$$f(x)$$$. For example, $$$[x^2](x^3+2x^2+1)=2$$$. And $$$[x^2]\frac{1}{1-2x}=4$$$, because we can expand $$$\frac{1}{1-2x}$$$ to $$$\sum\limits_{i=0}^\infty (2x)^i$$$ in the region of convergence $$$(-\frac{1}{2}, \frac{1}{2})$$$. So the coefficient of $$$x^2$$$ is $$$4$$$.
Part2: Problem Statement
The problem says: There is a grid with $$$1×N$$$ squares, numbered $$$1,2,…,N$$$ from left to right.
Takahashi prepared paints of $$$C$$$ colors and painted each square in one of the C colors. Then, there were at most two colors in any consecutive K squares. Formally, for every integer $$$i$$$ such that $$$1≤i≤N−K+1$$$, there were at most two colors in squares $$$i,i+1,…,i+K−1$$$.
In how many ways could Takahashi paint the squares? Since this number can be enormous, find it modulo $$$998244353$$$.
$$$\cdot \text{All inputs are integers.}$$$
$$$\cdot 2 \leq K \leq N \leq 10^6$$$.
$$$\cdot 1 \leq C \leq 10^9$$$.
Test Case $$$1$$$: $$$N=K=C=3$$$. In this input, we have a $$$1×3$$$ grid. Among the $$$27$$$ ways to paint the squares, there are $$$6$$$ ways to paint all squares in different colors, and the remaining $$$21$$$ ways are such that there are at most two colors in any consecutive three squares.
Test Case $$$2$$$: $$$N=10, K=5, C=2$$$: Print $$$1024$$$.
Test Case $$$3$$$: $$$N=998, K=244, C=353$$$: Print $$$952364159$$$.
Part3: Idea
(1) What are GFs good at? Gf is good at solving partitions, for example, the Pentagonal number theorem. So, the first step is to compress the colors by Run-Length Encoding (RLE). For example, if the colors are $$$(1,1,1,2,2,3,2,2)$$$, then they are uniquely compressed to $$$((1, 3), (2, 2), (3, 1), (2, 2))$$$. With RLE, $$$[1, n]$$$ is partitioned into $$$l$$$ segments with different colors. Let me denote these segments as $$$S_1, S_2, ..., S_l$$$. Please note that adjacent segments must be painted with different colors, otherwise you can merge the adjacent segments into one, which violates the definition of RLE. The idea to divide the interval into segments for counting also appears in Pinely Round Problem D.
(2)Consider $$$2 \leq i \leq l-1$$$. If $$$|S_i| \leq K-2$$$, then $$$S_{i+1}$$$ only has one choice: Paint it with the same color as $$$S_{i-1}$$$. Otherwise, the last element of $$$S_{i-1}$$$, the whole segment $$$S_{i}$$$ and the first element of $$$S_{i+1}$$$ will form an interval with size $$$\leq K$$$ and three colors, violating the rule. If $$$|S_i| > K-2$$$, then $$$S_{i+1}$$$ has $$$C-1$$$ choices. It is only required that the color of $$$S_{i+1}$$$ is different from that of $$$S_{i}$$$.
(3)I claim that the generating function for segmentation of length $$$l$$$ is:
$$$f(x, l) := C(C-1)(\sum\limits_{j=1}^\infty x^j)^2(\sum\limits_{j=1}^{K-2}x^j + \sum\limits_{j=K-1}^\infty (C-1)x^j)^{l-2} \tag{1}$$$.
The $$$C$$$ is because the first segment has $$$C$$$ choices.
The $$$C-1$$$ is because the second segment always has $$$C-1$$$ choices.
The $$$(\sum\limits_{j=1}^{K-2}x^j + \sum\limits_{j=K-1}^\infty (C-1)x^j)^{l-2}$$$ contain two parts: $$$x^j$$$ encodes the length of segment $$$i$$$ ($$$2 \leq i \leq i-1$$$). $$$1$$$, and $$$C-1$$$ encode the transfer contribution between $$$i \rightarrow i+1$$$. If $$$len(S_i) \leq K-2$$$, then $$$S_{i+1}$$$ has only 1 choice (See (2)). Otherwise, $$$S_{i+1}$$$ has $$$C-1$$$ choices.
Now we have dealt with the length of $$$S_i (2 \leq i \leq l-1)$$$ and the transfer contribution between $$$i$$$ and $$$i+1$$$ ($$$2 \leq i \leq l-1$$$). But we still omit two things: The length of head and tail! So we encode each of them with $$$\sum\limits_{j=1}^\infty x^j$$$. See the below picture:
$$$\sum\limits_{j=1}^\infty x^j=\frac{x}{1-x} \tag{2}$$$.
$$$\sum\limits_{j=1}^{K-2}x^j + \sum\limits_{j=K-1}^\infty (C-1)x^j = \frac{x-x^{K-1}}{1-x} + (C-1)\frac{x^{K-1}}{1-x} = \frac{x+(C-2)x^{K-1}}{1-x} \tag{3}$$$.
And put $$$(1), (2), (3)$$$ together, $$$f(x, l)=C(C-1)\frac{x^2}{(1-x)^2}(\frac{x+(C-2)x^{K-1}}{1-x})^{l-2}$$$.