I was reading about algorithms to find LCS in O(nlogn) and i came through this research paper — link
It claims to find LCS in O((n+r)logn) where r are the number of matching pairs.
But I can't understand how the final algorithm is O(n+rlogn) , I guess the algorithm is O(n*r*logn) ,please clear my doubt I can't understand?
here's my code —
#include "bits/stdc++.h"
using namespace std;
char a[50001];
char b[50001];
int thresh[50001];
int freq[27];
int freq1[27][50001];
int main()
{
scanf("%s",a+1);
scanf("%s",b+1);
int c = 0 ;
for(int i = strlen(b+1); i >= 1; i--)
{
freq1[b[i]-'a'][freq[b[i]-'a']] = i;
freq[b[i]-'a']++;
}
fill(thresh,thresh+max(strlen(a+1)+1,strlen(b+1)+1),10000000);
thresh[0] = 0;
int N = max(strlen(a+1)+1,strlen(b+1)+1);
for(int i = 1; i <= strlen(a+1); i++)
{
for(int j = 0; j < freq[a[i] - 'a']; j++)
{
int k = lower_bound(thresh,thresh + N, freq1[a[i] - 'a'][j]) - thresh;
if(freq1[a[i] - 'a'][j] < thresh[k])
{
thresh[k] = freq1[a[i] - 'a'][j];
}
}}
int max1 = -1;
for(int i = 0; i < N; i++){
if(thresh[i] != 10000000)
max1 = max(max1, thresh[i]);
}
printf("LCS is \n");
printf("%d",max1);
return 0;
}
link broken :(
The abstract itself says that the worst case is n^2 log n. I wouldn't consider it suitable for contest, although you could try it if you are out of ideas. StackOverflow (http://stackoverflow.com/questions/30768610/finding-longest-common-subsequence-in-onlogn-time) says that the DP is the only solution that is completely general.
Wikipedia says there exists a O((n+r)login)
The abstract says: "where r is the total number of ordered pairs of positions at which the two sequences match." That's n^2 at most. Therefore the algorithm is n^2 log n in worst case.