Find the length of the longest common substring with queries

Mr.Awesome's blog

By Mr.Awesome, 5 years ago

Here's a problem I came across lately:

You have a large string S and q queries; each query consists of a small string b.

The answer to each query is the length of the longest common substring of S and b. ( |S| <= 10^5, |b| <= 100, q <= 100 )

My DP solution for the longest common substring takes O(n*m) per query (n = |S|, m = |b|), but it doesn't seem to pass!
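For reference, here is a minimal sketch of the kind of O(n*m) DP the post refers to; the function name `longestCommonSubstringDP` is made up for illustration and this is not necessarily the author's exact code. With |S| = 10^5, |b| = 100 and q = 100 it performs on the order of 10^9 cell updates in total, which may be why it runs out of time.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Classic longest-common-substring DP with a rolling row.
// After processing S[0..i), prev[j] holds the length of the longest
// common suffix of S[0..i) and b[0..j).
// Time O(|S| * |b|), memory O(|b|).
int longestCommonSubstringDP(const std::string& S, const std::string& b) {
    std::vector<int> prev(b.size() + 1, 0), cur(b.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= S.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            // Extend the common suffix on a match, otherwise reset it.
            cur[j] = (S[i - 1] == b[j - 1]) ? prev[j - 1] + 1 : 0;
            best = std::max(best, cur[j]);
        }
        std::swap(prev, cur);
    }
    return best;
}
```

Calling `longestCommonSubstringDP(S, b)` once per query gives the answer directly; the approaches in the comments aim to move most of the work on S out of the per-query loop.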

I think I will need to do some preprocessing of S before answering the queries, but I can't find a solution.

Any hint or idea will be appreciated.

#strings, substring, #lcs, #hashing


Comments (9)


eulerkochy

5 years ago

I guess it's a standard application of suffix trees. Build a suffix tree for S$b# and then find the LCS in O(|S| + |b|) for each query. That should be fast enough, given the constraints.

Read more about it here
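As a related alternative to the suffix tree (a different structure, swapped in here because it is shorter to write), one could build a suffix automaton of S once and then walk each query string through it in roughly O(|b|) time. The sketch below assumes the strings contain only lowercase letters 'a'..'z', which the problem statement does not confirm; the struct and method names are illustrative.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Suffix automaton of S; assumes the alphabet 'a'..'z' (an assumption).
struct SuffixAutomaton {
    struct State {
        int len = 0;    // length of the longest substring in this state
        int link = -1;  // suffix link
        std::array<int, 26> next;
        State() { next.fill(-1); }
    };
    std::vector<State> st;
    int last = 0;

    explicit SuffixAutomaton(const std::string& s) {
        st.reserve(2 * s.size() + 1);
        st.emplace_back();  // root state
        for (char ch : s) extend(ch - 'a');
    }

    void extend(int c) {
        assert(0 <= c && c < 26);
        int cur = static_cast<int>(st.size());
        st.emplace_back();
        st[cur].len = st[last].len + 1;
        int p = last;
        while (p != -1 && st[p].next[c] == -1) {
            st[p].next[c] = cur;
            p = st[p].link;
        }
        if (p == -1) {
            st[cur].link = 0;
        } else {
            int q = st[p].next[c];
            if (st[p].len + 1 == st[q].len) {
                st[cur].link = q;
            } else {
                // Split q: the clone keeps q's transitions and link.
                int clone = static_cast<int>(st.size());
                State copy = st[q];
                copy.len = st[p].len + 1;
                st.push_back(copy);
                while (p != -1 && st[p].next[c] == q) {
                    st[p].next[c] = clone;
                    p = st[p].link;
                }
                st[q].link = clone;
                st[cur].link = clone;
            }
        }
        last = cur;
    }

    // Length of the longest substring of b that also occurs in S.
    int longestCommonSubstring(const std::string& b) const {
        int v = 0, len = 0, best = 0;
        for (char ch : b) {
            int c = ch - 'a';
            if (c < 0 || c >= 26) {  // character outside the assumed alphabet
                v = 0;
                len = 0;
                continue;
            }
            // Shorten the current match until it can be extended by c.
            while (v != 0 && st[v].next[c] == -1) {
                v = st[v].link;
                len = st[v].len;
            }
            if (st[v].next[c] != -1) {
                v = st[v].next[c];
                ++len;
            }
            best = std::max(best, len);
        }
        return best;
    }
};
```

The automaton is built once with `SuffixAutomaton sa(S);` and each query is then answered with `sa.longestCommonSubstring(b)`, so S is never rescanned per query.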


Not-Afraid

5 years ago

You can use hashing. Let the original string be s.
First precompute and store (in a 2D vector, let's say) the hash of every substring of s of length 1 to 100, grouped by length. Then, for every query, hash the query string and check whether the hash of any of its substrings is present in the precomputed vector for that substring's length.


Mr.Awesome

5 years ago

Thanks for your reply, but isn't the complexity of this 100 * |S| * (cost of hashing each substring)? Is this the right code for your solution?

for (size_t i = 0; i < S.length(); i++) {
    string tmp = "";
    // every substring of S that starts at i and has length at most 100
    for (size_t j = i; j < min(S.length(), i + 100); j++) {
        tmp += S[j];
        hash_and_add_to_vector(tmp);
    }
}


Not-Afraid

5 years ago

Spoiler

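const int N = 1e5 + 5;
const int MAX_LEN = 100;     // maximum length of a query string
vector<int> v[MAX_LEN + 1];  // v[len] = hashes of all substrings of s of length len
char s[N];
// step 1) First precalculate the prefix hashes of string s.

// step 2) Store the hash of every substring of length at most 100.
// About 1e7 operations here.
for (int i = 1; i <= n; ++i) { // 1-based indexing, n = length of s
  for (int j = i; j > 0 && j > i - MAX_LEN; --j) {
    int size = i - j + 1;
    // hash(j, i) is a function which gives the hash of substring s[j..i] in O(1)
    v[size].push_back(hash(j, i));
  }
}
// Sort every list so that membership can be tested with binary search.
for (int len = 1; len <= MAX_LEN; ++len) sort(v[len].begin(), v[len].end());

// q * k * k * log(n)
// a string of size k has about k^2 / 2 substrings
for (int i = 1; i <= q; ++i) {
  cin >> curr_string;
  // step 1) Compute its prefix hashes.

  // step 2) Look up every substring of curr_string.
  int best = 0;
  for (int j = 0; j < (int)curr_string.size(); ++j) {
    for (int k = j; k < (int)curr_string.size(); ++k) {
      int cur_size = k - j + 1;
      // query_hash(j, k) is a placeholder: the hash of curr_string[j..k],
      // computed with the same base and modulus as hash()
      int has = query_hash(j, k);
      if (binary_search(v[cur_size].begin(), v[cur_size].end(), has)) {
        best = max(best, cur_size); // a common substring of this length exists
      }
    }
  }
  cout << best << "\n";
}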
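The outline above leaves the hash functions unspecified. One way to fill them in, as a sketch under assumptions rather than the commenter's actual code, is a polynomial prefix hash under two moduli packed into a single 64-bit key. The bases, moduli, and the names `PrefixHash` and `SubstringIndex` are all choices made for this illustration, and with fixed bases the hash can in principle be attacked by adversarial tests.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Polynomial prefix hash under two moduli (bases and moduli are illustrative).
struct PrefixHash {
    static constexpr std::uint64_t M1 = 1000000007ULL, M2 = 998244353ULL;
    static constexpr std::uint64_t B1 = 131ULL, B2 = 137ULL;
    std::vector<std::uint64_t> h1, h2, p1, p2;

    explicit PrefixHash(const std::string& s)
        : h1(s.size() + 1, 0), h2(s.size() + 1, 0),
          p1(s.size() + 1, 1), p2(s.size() + 1, 1) {
        for (std::size_t i = 0; i < s.size(); ++i) {
            std::uint64_t c = static_cast<unsigned char>(s[i]) + 1;
            h1[i + 1] = (h1[i] * B1 + c) % M1;
            h2[i + 1] = (h2[i] * B2 + c) % M2;
            p1[i + 1] = p1[i] * B1 % M1;
            p2[i + 1] = p2[i] * B2 % M2;
        }
    }

    // Hash of s[l, r) (0-based, half-open) in O(1).
    std::uint64_t get(std::size_t l, std::size_t r) const {
        assert(l <= r && r < h1.size());
        std::uint64_t a = (h1[r] + M1 * M1 - h1[l] * p1[r - l]) % M1;
        std::uint64_t b = (h2[r] + M2 * M2 - h2[l] * p2[r - l]) % M2;
        return (a << 32) | b;
    }
};

// Sorted hashes of all substrings of S, grouped by length (1..maxLen).
struct SubstringIndex {
    std::size_t maxLen;
    std::vector<std::vector<std::uint64_t>> byLen;

    SubstringIndex(const std::string& S, std::size_t maxQueryLen)
        : maxLen(std::min(maxQueryLen, S.size())), byLen(maxLen + 1) {
        PrefixHash hs(S);
        for (std::size_t len = 1; len <= maxLen; ++len) {
            std::vector<std::uint64_t>& v = byLen[len];
            v.reserve(S.size() - len + 1);
            for (std::size_t l = 0; l + len <= S.size(); ++l)
                v.push_back(hs.get(l, l + len));
            std::sort(v.begin(), v.end());
        }
    }

    // Length of the longest substring of b whose hash occurs among S's substrings.
    int query(const std::string& b) const {
        PrefixHash hb(b);
        // Try lengths from longest to shortest and stop at the first hit.
        for (std::size_t len = std::min(maxLen, b.size()); len >= 1; --len) {
            for (std::size_t l = 0; l + len <= b.size(); ++l) {
                const std::vector<std::uint64_t>& v = byLen[len];
                if (std::binary_search(v.begin(), v.end(), hb.get(l, l + len)))
                    return static_cast<int>(len);
            }
        }
        return 0;
    }
};
```

Usage would be `SubstringIndex idx(S, 100);` once, then `idx.query(b)` per query. For |S| = 10^5 this stores close to 10^7 64-bit hashes, roughly 80 MB, so the memory limit is worth checking before relying on it.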
