t3rminated's blog

By t3rminated, history, 8 years ago, In English

I was reading about algorithms to find LCS in O(nlogn) and i came through this research paper — link

It claims to find LCS in O((n+r)logn) where r are the number of matching pairs.

But I can't understand how the final algorithm is O(n+rlogn) , I guess the algorithm is O(n*r*logn) ,please clear my doubt I can't understand?

here's my code —

#include "bits/stdc++.h"
using namespace std;
char a[50001];
char b[50001];
int thresh[50001];
int freq[27];
int freq1[27][50001];

int main()
{
	scanf("%s",a+1);
	scanf("%s",b+1);
	
	int c = 0 ;
	for(int i = strlen(b+1); i >= 1; i--)
	{
		freq1[b[i]-'a'][freq[b[i]-'a']] = i;
		freq[b[i]-'a']++;
	}
	
	fill(thresh,thresh+max(strlen(a+1)+1,strlen(b+1)+1),10000000);
	thresh[0] = 0;
	
	int N = max(strlen(a+1)+1,strlen(b+1)+1);
	
	for(int i = 1; i <= strlen(a+1); i++)
	{
		for(int j = 0; j < freq[a[i] - 'a']; j++)
		{
			int k = lower_bound(thresh,thresh + N, freq1[a[i] - 'a'][j]) - thresh;
			if(freq1[a[i] - 'a'][j] < thresh[k])
			{
				thresh[k] = freq1[a[i] - 'a'][j];
			}
		}}
	
	int max1 = -1;
	for(int i = 0; i < N; i++){
		if(thresh[i] != 10000000)
			max1 = max(max1, thresh[i]);
	}
	printf("LCS is \n");
	printf("%d",max1);
	return 0;
}
  • Vote: I like it
  • -1
  • Vote: I do not like it

| Write comment?
»
8 years ago, # |
  Vote: I like it 0 Vote: I do not like it

link broken :(

»
8 years ago, # |
  Vote: I like it 0 Vote: I do not like it

The abstract itself says that the worst case is n^2 log n. I wouldn't consider it suitable for contest, although you could try it if you are out of ideas. StackOverflow (http://stackoverflow.com/questions/30768610/finding-longest-common-subsequence-in-onlogn-time) says that the DP is the only solution that is completely general.

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Wikipedia says there exists a O((n+r)login)

    • »
      »
      »
      8 years ago, # ^ |
      Rev. 3   Vote: I like it 0 Vote: I do not like it

      The abstract says: "where r is the total number of ordered pairs of positions at which the two sequences match." That's n^2 at most. Therefore the algorithm is n^2 log n in worst case.