### Before Reading...↵
↵
1. Sorry for my poor English skill.↵
2. The propose of this article is "EASY" introduction to Kitamasa, so strict proofs can be omitted.↵
↵
### What is Kitamasa Method?↵
↵
Kitamasa Method can compute N-th term of linear recurrence ($A_N = c_1A_{N-1} + c_2A_{N-2} + \cdots + c_KA_{N-K} = \sum_{i=1}^{K} c_iA_{N-i}$) in $O(K^2 \log N)$ or $O(K \log K \log N)$ when you manipulate polynomial with FFT.↵
↵
### Let's transform the equation↵
↵
The goal of Kitamasa is find $d_i$, such that $A_N = \sum_{i=1}^{K} c_iA_{N-i} = \sum_{i=0}^{K-1} d_iA_i$. In other word, we can compute N-th term using **first K elements** instead of **before K elements**. If we can get $d_i$ in $T(N, K)$, we can compute N-th term in $O(T(N, K) + K)$ time.↵
↵
We can find $d_i$ just using $A_i \leftarrow (c_1A_{i-1} + c_2A_{i-2} + \cdots)$ . For example, let's think about $A_N = 2A_{N-1} + A_{N-2}$. ($c_1 = 2, c_2 = 1$).↵
↵
* $A_5 = 2A_4 + A_3$↵
* $A_5 = 2(2A_3 + A_2) + A_3 = 5A_3 + 2A_2$↵
* $A_5 = 5(2A_2 + A_1) + 2A_2 = 12A_2 + 5A_1$↵
* $A_5 = 12(2A_1 + A_0) + 5A_1 = 29A_1 + 12A_0$↵
* Finally, when $N = 5$ then $d_0 = 12, d_1 = 29$.↵
↵
Actually, this process is subtracting $c(A_x - c_1A_{x-1} - c_2A_{x-2} - \cdots) = 0$ from both sides. In this example, we subtract $c(A_x - 2A_{x-1} - A_{x-2}) = 0$.↵
↵
* $A_5 = 2A_4 + A_3$, Let's subtract $2(A_4 - 2A_3 - A_2) = 0$↵
* $A_5 = 5A_3 + 2A_2$, Let's subtract $5(A_3 - 2A_2 - A_1) = 0$↵
* $A_5 = 12A_2 + 5A_1$, Let's subtract $12(A_2 - 2A_1 - A_0) = 0$↵
* Finally, $A_5 = 29A_1 + 12A_0$↵
↵
It is the same process as finding $x^N \mod f(x)$ when $f(x) = x^K - c_1x^{K-1} - c_2x^{K-2} - \cdots - c_Kx^0 = x^K - \sum_{i=1}^{K}c_ix^{K-i}$. For example, $x^5 = (x^2-2x-1)(x^3+2x^2+5x+12) + (29x + 12)$.↵
↵
From now, we just calculate $x^N \mod f(x)$.↵
↵
### $x^N$?↵
↵
We can't handle $x^N$ when $N$ is very large ($N \geq 10^9$). But we can calculate $x^N$ from $x^1, x^2, x^4, x^8, \cdots$ and it need $O(\log N)$ multiply operations.↵
↵
Because the goal of this algorithm is get $x^N \mod f(x)$, we calculate it from $x^1 \mod f(x), x^2 \mod f(x), x^4 \mod f(x), \cdots$. We should multply/divide polynomial $O(\log N)$ time, and degree of each polynomial is $K-1$.↵
↵
Let's see the pseudocode of Kitamasa:↵
↵
```cpp↵
using ll = long long;↵
using poly = vector<ll>;↵
ll kitamasa(poly c, poly a, ll n){↵
poly d = {1}; // result↵
poly xn = {0, 1}; // xn = x^1, x^2, x^4, ...↵
poly f(c.size()+1); // f(x) = x^K - \sum c_{i}x^{i}↵
f.back() = 1;↵
for(int i=0; i<c.size(); i++) f[i] = -c[i];↵
while(n){↵
if(n & 1) d = Div(Mul(d, xn), f);↵
n >>= 1; xn = Div(Mul(xn, xn), f);↵
}↵
↵
ll ret = 0;↵
for(int i=0; i<a.size(); i++) ret += a[i] * d[i];↵
return ret;↵
}↵
```↵
↵
### $O(K^2 \log N)$ Kitamasa↵
↵
When we calculate multiply/divide polynomial naively, time complexity of Kitamasa is $O(K^2 \log N)$.↵
↵
```cpp↵
int Mod(ll x){↵
return (x %= MOD) < 0 ? x + MOD : x;↵
}↵
↵
poly Mul(const poly &a, const poly &b){↵
poly ret(a.size() + b.size() - 1);↵
for(int i=0; i<a.size(); i++) for(int j=0; j<b.size(); j++) {↵
ret[i+j] = (ret[i+j] + a[i] * b[j]) % MOD;↵
}↵
return ret;↵
}↵
↵
poly Div(const poly &a, const poly &b){↵
poly ret(all(a));↵
for(int i=ret.size()-1; i>=b.size()-1; i--) for(int j=0; j<b.size(); j++) {↵
ret[i+j-b.size()+1] = Mod(ret[i+j-b.size()+1] - ret[i] * b[j]);↵
}↵
ret.resize(b.size()-1);↵
return ret;↵
}↵
↵
/// A_{n} = \sum c_{i}A_{n-i} = \sum d_{i}A_{i}↵
ll kitamasa(poly c, poly a, ll n){↵
poly d = {1}; // result↵
poly xn = {0, 1}; // xn = x^1, x^2, x^4, ...↵
poly f(c.size()+1); // f(x) = x^K - \sum c_{i}x^{i}↵
f.back() = 1;↵
for(int i=0; i<c.size(); i++) f[i] = Mod(-c[i]);↵
while(n){↵
if(n & 1) d = Div(Mul(d, xn), f);↵
n >>= 1; xn = Div(Mul(xn, xn), f);↵
}↵
↵
ll ret = 0;↵
for(int i=0; i<a.size(); i++) ret = Mod(ret + a[i] * d[i]);↵
return ret;↵
}↵
↵
int main(){↵
// calculate N-th Fibonacci number↵
poly rec = {1, 1};↵
poly dp = {0, 1};↵
ll N; cin >> N;↵
cout << kitamasa(rec, dp, N);↵
}↵
```↵
↵
### $O(K \log K \log N)$ Kitamasa↵
↵
With FFT, we can multiply/divide polynomial in $O(K \log K)$. ([https://cp-algorithms.com/algebra/polynomial.html](https://cp-algorithms.com/algebra/polynomial.html)) So time complexity of Kitamasa is $O(K \log K \log N)$.↵
↵
This is my solution for Codechef RNG : [https://github.com/justiceHui/AlgorithmImplement/blob/master/Math/Fast-Kitamasa.cpp](https://github.com/justiceHui/AlgorithmImplement/blob/master/Math/Fast-Kitamasa.cpp)↵
↵
↵
1. Sorry for my poor English skill.↵
2. The propose of this article is "EASY" introduction to Kitamasa, so strict proofs can be omitted.↵
↵
### What is Kitamasa Method?↵
↵
Kitamasa Method can compute N-th term of linear recurrence ($A_N = c_1A_{N-1} + c_2A_{N-2} + \cdots + c_KA_{N-K} = \sum_{i=1}^{K} c_iA_{N-i}$) in $O(K^2 \log N)$ or $O(K \log K \log N)$ when you manipulate polynomial with FFT.↵
↵
### Let's transform the equation↵
↵
The goal of Kitamasa is find $d_i$, such that $A_N = \sum_{i=1}^{K} c_iA_{N-i} = \sum_{i=0}^{K-1} d_iA_i$. In other word, we can compute N-th term using **first K elements** instead of **before K elements**. If we can get $d_i$ in $T(N, K)$, we can compute N-th term in $O(T(N, K) + K)$ time.↵
↵
We can find $d_i$ just using $A_i \leftarrow (c_1A_{i-1} + c_2A_{i-2} + \cdots)$ . For example, let's think about $A_N = 2A_{N-1} + A_{N-2}$. ($c_1 = 2, c_2 = 1$).↵
↵
* $A_5 = 2A_4 + A_3$↵
* $A_5 = 2(2A_3 + A_2) + A_3 = 5A_3 + 2A_2$↵
* $A_5 = 5(2A_2 + A_1) + 2A_2 = 12A_2 + 5A_1$↵
* $A_5 = 12(2A_1 + A_0) + 5A_1 = 29A_1 + 12A_0$↵
* Finally, when $N = 5$ then $d_0 = 12, d_1 = 29$.↵
↵
Actually, this process is subtracting $c(A_x - c_1A_{x-1} - c_2A_{x-2} - \cdots) = 0$ from both sides. In this example, we subtract $c(A_x - 2A_{x-1} - A_{x-2}) = 0$.↵
↵
* $A_5 = 2A_4 + A_3$, Let's subtract $2(A_4 - 2A_3 - A_2) = 0$↵
* $A_5 = 5A_3 + 2A_2$, Let's subtract $5(A_3 - 2A_2 - A_1) = 0$↵
* $A_5 = 12A_2 + 5A_1$, Let's subtract $12(A_2 - 2A_1 - A_0) = 0$↵
* Finally, $A_5 = 29A_1 + 12A_0$↵
↵
It is the same process as finding $x^N \mod f(x)$ when $f(x) = x^K - c_1x^{K-1} - c_2x^{K-2} - \cdots - c_Kx^0 = x^K - \sum_{i=1}^{K}c_ix^{K-i}$. For example, $x^5 = (x^2-2x-1)(x^3+2x^2+5x+12) + (29x + 12)$.↵
↵
From now, we just calculate $x^N \mod f(x)$.↵
↵
### $x^N$?↵
↵
We can't handle $x^N$ when $N$ is very large ($N \geq 10^9$). But we can calculate $x^N$ from $x^1, x^2, x^4, x^8, \cdots$ and it need $O(\log N)$ multiply operations.↵
↵
Because the goal of this algorithm is get $x^N \mod f(x)$, we calculate it from $x^1 \mod f(x), x^2 \mod f(x), x^4 \mod f(x), \cdots$. We should multply/divide polynomial $O(\log N)$ time, and degree of each polynomial is $K-1$.↵
↵
Let's see the pseudocode of Kitamasa:↵
↵
```cpp↵
using ll = long long;↵
using poly = vector<ll>;↵
ll kitamasa(poly c, poly a, ll n){↵
poly d = {1}; // result↵
poly xn = {0, 1}; // xn = x^1, x^2, x^4, ...↵
poly f(c.size()+1); // f(x) = x^K - \sum c_{i}x^{i}↵
f.back() = 1;↵
for(int i=0; i<c.size(); i++) f[i] = -c[i];↵
while(n){↵
if(n & 1) d = Div(Mul(d, xn), f);↵
n >>= 1; xn = Div(Mul(xn, xn), f);↵
}↵
↵
ll ret = 0;↵
for(int i=0; i<a.size(); i++) ret += a[i] * d[i];↵
return ret;↵
}↵
```↵
↵
### $O(K^2 \log N)$ Kitamasa↵
↵
When we calculate multiply/divide polynomial naively, time complexity of Kitamasa is $O(K^2 \log N)$.↵
↵
```cpp↵
int Mod(ll x){↵
return (x %= MOD) < 0 ? x + MOD : x;↵
}↵
↵
poly Mul(const poly &a, const poly &b){↵
poly ret(a.size() + b.size() - 1);↵
for(int i=0; i<a.size(); i++) for(int j=0; j<b.size(); j++) {↵
ret[i+j] = (ret[i+j] + a[i] * b[j]) % MOD;↵
}↵
return ret;↵
}↵
↵
poly Div(const poly &a, const poly &b){↵
poly ret(all(a));↵
for(int i=ret.size()-1; i>=b.size()-1; i--) for(int j=0; j<b.size(); j++) {↵
ret[i+j-b.size()+1] = Mod(ret[i+j-b.size()+1] - ret[i] * b[j]);↵
}↵
ret.resize(b.size()-1);↵
return ret;↵
}↵
↵
/// A_{n} = \sum c_{i}A_{n-i} = \sum d_{i}A_{i}↵
ll kitamasa(poly c, poly a, ll n){↵
poly d = {1}; // result↵
poly xn = {0, 1}; // xn = x^1, x^2, x^4, ...↵
poly f(c.size()+1); // f(x) = x^K - \sum c_{i}x^{i}↵
f.back() = 1;↵
for(int i=0; i<c.size(); i++) f[i] = Mod(-c[i]);↵
while(n){↵
if(n & 1) d = Div(Mul(d, xn), f);↵
n >>= 1; xn = Div(Mul(xn, xn), f);↵
}↵
↵
ll ret = 0;↵
for(int i=0; i<a.size(); i++) ret = Mod(ret + a[i] * d[i]);↵
return ret;↵
}↵
↵
int main(){↵
// calculate N-th Fibonacci number↵
poly rec = {1, 1};↵
poly dp = {0, 1};↵
ll N; cin >> N;↵
cout << kitamasa(rec, dp, N);↵
}↵
```↵
↵
### $O(K \log K \log N)$ Kitamasa↵
↵
With FFT, we can multiply/divide polynomial in $O(K \log K)$. ([https://cp-algorithms.com/algebra/polynomial.html](https://cp-algorithms.com/algebra/polynomial.html)) So time complexity of Kitamasa is $O(K \log K \log N)$.↵
↵
This is my solution for Codechef RNG : [https://github.com/justiceHui/AlgorithmImplement/blob/master/Math/Fast-Kitamasa.cpp](https://github.com/justiceHui/AlgorithmImplement/blob/master/Math/Fast-Kitamasa.cpp)↵
↵