Linear Recurrence and Berlekamp-Massey Algorithm

Revision en3, by TLE, 2018-08-19 10:57:46

#IjustWantContribution

It seems there isn't any blog about Berlekamp-Massey Algorithm around here, so I decided to go on a try. :P

Acknowledgement: Hats off to matthew99 for introducing this algorithm.

What is 'linear recurrence'?

Assuming there is a (probably infinity) sequence a0, a1...an - 1, we call this sequence satisfies a linear recurrence relation p1, p2...pm, iff . (Obviously, if m ≥ n any p can do :P)

How to calculate k-th term of a linear recurrence? (Bonus)

For a polynomial , we define .

Obviously G satisfies G(f) ± G(g) = G(f ± g).

Because , if we let , then G(f) = 0. Also G(fx), G(fx2)... = 0. So G(fg) = 0 (g is any polynomial).

What we want is G(xk). Because G(fxk / f⌋) = 0, then .

We can calculate in a binary-exponentiation manner, then calculate . Time complexity is or (if you use fft etc.)

How to find (the shortest) linear recurrence relation?

It's Berlekamp-Massey Algorithm to the rescue! For a given sequence x0, x1...xn - 1, it can calculate one shortest linear recurrence relation for every prefix in O(n2) time.

Let's define the value of a relation sequence p1, p2...pm evaluated at position t: (t ≥ m). A valid linear recurrence relation is a relation sequence with correct value evaluated at every position  ≥ m.

Let's consider the numbers from left to right. Start from {}, we evaluate the current relation sequence at current position t (from 1 to n). If we got at, then it's still good, go on. Assume we've got value v, if we somehow got some relation sequence x that evaluated as 1 at position t, and evaluated as 0 (or undefined) at positions  < t, then minus current sequence with (v - at)x, we're done.

If this is not first non-zero position, we have run into this situation before. Let's say s = {s1, s2...sm} evaluated as xt' + v' at position t' and correct at positions before t', then {1,  - s1,  - s2... - sm} should evaluated as v' at position t' + 1 and 0 otherwise. Divide it with v' and add proper (t - t' - 1) zeroes in front, we've got the x we need!

If we run into this situation several times before, we can choose the one that is shortest after filling zeroes.

a sample (in case you didn't understand clearly)
A not-so-elegant proof-of-concept

Combine the above two section, we can acquire a handy weapon for these kind of problems :)

#include <bits/stdc++.h>
using namespace std;
#define pb push_back
typedef long long ll;
#define SZ 233333
const int MOD=1e9+7;
ll qp(ll a,ll b)
{
	ll x=1; a%=MOD;
	while(b)
	{
		if(b&1) x=x*a%MOD;
		a=a*a%MOD; b>>=1;
	}
	return x;
}
namespace linear_seq {
inline vector<int> BM(vector<int> x)
{
	vector<int> ls,cur; int lf,ld;
	for(int i=0;i<int(x.size());++i)
	{
		ll t=-x[i]%MOD;
		for(int j=0;j<int(cur.size());++j)
			t=(t+x[i-j-1]*(ll)cur[j])%MOD;
		if(!t) continue;
		if(!cur.size()) {cur.resize(i+1); lf=i; ld=t; continue;}
		ll k=-t*qp(ld,MOD-2)%MOD;
		vector<int> c(i-lf-1); c.pb(-k);
		for(int j=0;j<int(ls.size());++j) c.pb(ls[j]*k%MOD);
		if(c.size()<cur.size()) c.resize(cur.size());
		for(int j=0;j<int(cur.size());++j)
			c[j]=(c[j]+cur[j])%MOD;
		if(i-lf+(int)ls.size()>=(int)cur.size())
			ls=cur,lf=i,ld=t;
		cur=c;
	}
	vector<int>&o=cur;
	for(int i=0;i<int(o.size());++i)
		o[i]=(o[i]%MOD+MOD)%MOD;
	return o;
}
int N; ll a[SZ],h[SZ],t_[SZ],s[SZ],t[SZ];
inline void mull(ll*p,ll*q)
{
	for(int i=0;i<N+N;++i) t_[i]=0;
	for(int i=0;i<N;++i) if(p[i])
		for(int j=0;j<N;++j)
			t_[i+j]=(t_[i+j]+p[i]*q[j])%MOD;
	for(int i=N+N-1;i>=N;--i) if(t_[i])
		for(int j=N-1;~j;--j)
			t_[i-j-1]=(t_[i-j-1]+t_[i]*h[j])%MOD;
	for(int i=0;i<N;++i) p[i]=t_[i];
}
inline ll calc(ll K)
{
	for(int i=N;~i;--i) s[i]=t[i]=0;
	s[0]=1; if(N!=1) t[1]=1; else t[0]=h[0];
	for(;K;mull(t,t),K>>=1) if(K&1) mull(s,t); ll su=0;
	for(int i=0;i<N;++i) su=(su+s[i]*a[i])%MOD;
	return (su%MOD+MOD)%MOD;
}
inline int work(vector<int> x,ll n)
{
	if(n<int(x.size())) return x[n];
	vector<int> v=BM(x); N=v.size(); if(!N) return 0;
	for(int i=0;i<N;++i) h[i]=v[i],a[i]=x[i];
	return calc(n);
}
}
using linear_seq::work;
int main()
{
	cout<<work({1,1,2,3,5,8,13,21},10)<<"\n";
}

Applications

Or, in other words, where can we find linear recurrences?

From the point of generating function, let A and P be the generating function of a and p, then A = AP + A0 (A0 depends on the first terms of a), then A = A0 / (1 - P). Moreover, if A = B / C and the constant term of C is 1 then there is a linear recurrence relation for a. So, provided with the generating function of a, one can tell if it's a linear recurrence easily.

If we have some kind of dynamic-programming f[i][j] (i ≤ n, j ≤ m), we want to find f[n][1]. The transitions of f is something like . In old days, we may use matrix-multiplications. But things have changed! Calculate f[1][1], f[2][1]...f[m + m + m][1] and plug in the above code, we're done!

Why? Consider f[i] as a vector and v as a matrix, then f[i] = f[i - 1]v, so f[n] = f[1]vn - 1. Consider the minimal polynomial of v, it's degree must be  ≤ m and obviously there's a corresponding linear recurrence relation with length  ≤ m. With a prefix of length m + m + m it's enough to figure out a correct relation.

Why is it better than matrix multiplication? Besides it's instead of , sometimes it's hard to acquire the exact transition matrix (or maybe just you're lazy enough), and this algorithm makes life better.

Try your hands

http://codeforces.net/contest/506/problem/E Write a naive dynamic-programming for small n, plug in BM, you're done! Life has never been so easy.

https://loj.ac/problem/2463 A chinese problem: Let n be a power of 2, you're given a directed graph with n nodes, from i to j there're Ai, j directed edges. Little Q is going on several trips, for every trip he will start from some node, make at least one step (i.e. go through at least one edge) and end at some node. He is wondering the number of ways if he's going on several travels, making x steps at total, and the bitwise-and of all start nodes and end nodes equals to y. For every , , you need to find the way modulo 998244353. To reduce output size, only output the bitwise-xor of all m × n answers. 2 ≤ n ≤ 64, 1 ≤ m ≤ 20000.

There're many more problems that can be solved in this way, but since posting them here is already spoiling I'm not going to post more :)

Tags #algorithms

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en9 English TLE 2020-04-01 10:43:44 8
en8 English TLE 2020-04-01 10:42:28 4
en7 English TLE 2018-08-24 12:02:51 76
en6 English TLE 2018-08-20 14:06:07 1893
en5 English MikeMirzayanov 2018-08-20 09:50:59 9
en4 English TLE 2018-08-19 12:14:21 77
en3 English TLE 2018-08-19 10:57:46 28
en2 English TLE 2018-08-19 10:52:57 9095 add text (published)
en1 English TLE 2018-08-19 08:51:02 76 Initial revision (saved to drafts)