Before Reading...
- Sorry for my poor English skill.
- The propose of this article is "EASY" introduction to Kitamasa, so strict proofs can be omitted.
What is Kitamasa Method?
Kitamasa Method can compute N-th term of linear recurrence ($$$A_N = c_1A_{N-1} + c_2A_{N-2} + \cdots + c_KA_{N-K} = \sum_{i=1}^{K} c_iA_{N-i}$$$) in $$$O(K^2 \log N)$$$ or $$$O(K \log K \log N)$$$ when you manipulate polynomial with FFT.
Let's transform the equation
The goal of Kitamasa is find $$$d_i$$$, such that $$$A_N = \sum_{i=1}^{K} c_iA_{N-i} = \sum_{i=0}^{K-1} d_iA_i$$$. In other word, we can compute N-th term using first K elements instead of before K elements. If we can get $$$d_i$$$ in $$$T(N, K)$$$, we can compute N-th term in $$$O(T(N, K) + K)$$$ time.
We can find $$$d_i$$$ just using $$$A_i \leftarrow (c_1A_{i-1} + c_2A_{i-2} + \cdots)$$$ . For example, let's think about $$$A_N = 2A_{N-1} + A_{N-2}$$$. ($$$c_1 = 2, c_2 = 1$$$).
- $$$A_5 = 2A_4 + A_3$$$
- $$$A_5 = 2(2A_3 + A_2) + A_3 = 5A_3 + 2A_2$$$
- $$$A_5 = 5(2A_2 + A_1) + 2A_2 = 12A_2 + 5A_1$$$
- $$$A_5 = 12(2A_1 + A_0) + 5A_1 = 29A_1 + 12A_0$$$
- Finally, when $$$N = 5$$$ then $$$d_0 = 12, d_1 = 29$$$.
Actually, this process is subtracting $$$c(A_x - c_1A_{x-1} - c_2A_{x-2} - \cdots) = 0$$$ from both sides. In this example, we subtract $$$c(A_x - 2A_{x-1} - A_{x-2}) = 0$$$.
- $$$A_5 = 2A_4 + A_3$$$, Let's subtract $$$2(A_4 - 2A_3 - A_2) = 0$$$
- $$$A_5 = 5A_3 + 2A_2$$$, Let's subtract $$$5(A_3 - 2A_2 - A_1) = 0$$$
- $$$A_5 = 12A_2 + 5A_1$$$, Let's subtract $$$12(A_2 - 2A_1 - A_0) = 0$$$
- Finally, $$$A_5 = 29A_1 + 12A_0$$$
It is the same process as finding $$$x^N \mod f(x)$$$ when $$$f(x) = x^K - c_1x^{K-1} - c_2x^{K-2} - \cdots - c_Kx^0 = x^K - \sum_{i=1}^{K}c_ix^{K-i}$$$. For example, $$$x^5 = (x^2-2x-1)(x^3+2x^2+5x+12) + (29x + 12)$$$.
From now, we just calculate $$$x^N \mod f(x)$$$.
$$$x^N$$$?
We can't handle $$$x^N$$$ when $$$N$$$ is very large ($$$N \geq 10^9$$$). But we can calculate $$$x^N$$$ from $$$x^1, x^2, x^4, x^8, \cdots$$$ and it need $$$O(\log N)$$$ multiply operations.
Because the goal of this algorithm is get $$$x^N \mod f(x)$$$, we calculate it from $$$x^1 \mod f(x), x^2 \mod f(x), x^4 \mod f(x), \cdots$$$. We should multply/divide polynomial $$$O(\log N)$$$ time, and degree of each polynomial is $$$K-1$$$.
Let's see the pseudocode of Kitamasa:
using ll = long long;
using poly = vector<ll>;
ll kitamasa(poly c, poly a, ll n){
poly d = {1}; // result
poly xn = {0, 1}; // xn = x^1, x^2, x^4, ...
poly f(c.size()+1); // f(x) = x^K - \sum c_{i}x^{i}
f.back() = 1;
for(int i=0; i<c.size(); i++) f[i] = -c[i];
while(n){
if(n & 1) d = Div(Mul(d, xn), f);
n >>= 1; xn = Div(Mul(xn, xn), f);
}
ll ret = 0;
for(int i=0; i<a.size(); i++) ret += a[i] * d[i];
return ret;
}
$$$O(K^2 \log N)$$$ Kitamasa
When we calculate multiply/divide polynomial naively, time complexity of Kitamasa is $$$O(K^2 \log N)$$$.
int Mod(ll x){
return (x %= MOD) < 0 ? x + MOD : x;
}
poly Mul(const poly &a, const poly &b){
poly ret(a.size() + b.size() - 1);
for(int i=0; i<a.size(); i++) for(int j=0; j<b.size(); j++) {
ret[i+j] = (ret[i+j] + a[i] * b[j]) % MOD;
}
return ret;
}
poly Div(const poly &a, const poly &b){
poly ret(all(a));
for(int i=ret.size()-1; i>=b.size()-1; i--) for(int j=0; j<b.size(); j++) {
ret[i+j-b.size()+1] = Mod(ret[i+j-b.size()+1] - ret[i] * b[j]);
}
ret.resize(b.size()-1);
return ret;
}
/// A_{n} = \sum c_{i}A_{n-i} = \sum d_{i}A_{i}
ll kitamasa(poly c, poly a, ll n){
poly d = {1}; // result
poly xn = {0, 1}; // xn = x^1, x^2, x^4, ...
poly f(c.size()+1); // f(x) = x^K - \sum c_{i}x^{i}
f.back() = 1;
for(int i=0; i<c.size(); i++) f[i] = Mod(-c[i]);
while(n){
if(n & 1) d = Div(Mul(d, xn), f);
n >>= 1; xn = Div(Mul(xn, xn), f);
}
ll ret = 0;
for(int i=0; i<a.size(); i++) ret = Mod(ret + a[i] * d[i]);
return ret;
}
int main(){
// calculate N-th Fibonacci number
poly rec = {1, 1};
poly dp = {0, 1};
ll N; cin >> N;
cout << kitamasa(rec, dp, N);
}
$$$O(K \log K \log N)$$$ Kitamasa
With FFT, we can multiply/divide polynomial in $$$O(K \log K)$$$. (https://cp-algorithms.com/algebra/polynomial.html) So time complexity of Kitamasa is $$$O(K \log K \log N)$$$.
This is my solution for Codechef RNG : https://github.com/justiceHui/AlgorithmImplement/blob/master/Math/Fast-Kitamasa.cpp