chirag_h's blog

By chirag_h, history, 4 years ago, In English

I was trying this problem Taklu Kuddus.

The problem gives us a string S and a pattern P and for Q queries we have to find the maximum number of non-overlapping occurrences of P in the substring of S of the given range in q.

I first tried brute force which got TLE. I tried playing with the starting and ending indexes of matched substrings but could not think of a clear strategy.

Any leads on how to approach this?

  • Vote: I like it
  • +5
  • Vote: I do not like it

»
4 years ago, # |
  Vote: I like it +4 Vote: I do not like it

Your task is similar to Consistent occurrences [gym]. My solution for it is greedily picking a sub-string whenever possible...

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Yes, solution is greedy, pick whenever you can pick. Proof:

Assume its not true, and you can get better score after skipping one occurrence. This mean that there is at least 2 non overlapping starts of occurrences inside of first occurrence. And that simply cant be true, because its mean that second occurrence has smaller length than first, but its false because its all occurrences of the same string

»
4 years ago, # |
Rev. 3   Vote: I like it +5 Vote: I do not like it

Tet brezhart Thanks for replying.

As you guys suggested, I tried using greedy with rolling-hash but still got TLE. The constraints seem to be too high for an O(Q. N) algorithm. Any hint on how to improve this? Here is the code for reference: Code

»
4 years ago, # |
Rev. 2   Vote: I like it +5 Vote: I do not like it

When there's a working greedy strategy to a problem on arrays/string, (always pick the first occurrence), we can use binary jumping to answer queries.

Let jmp[i] = the index after the end of the first occurrence of P in S, such that P starts at index i or later.

With, for example rolling hashes, these jump pointers can be calculated from right to left in O(n) preprocessing. Then we calculate all jumps of length 2^k, for a total of O(n log(n)) preprocessing (like in binary lifting in trees). When we want to answer a query on substring [l,r]. We start at l, and do a binary search with the jump pointers to find the amount of jumps you can make to still be on the left of index r. This is again very similar to finding the LCA with binary lifting in a tree. So you can solve the problem in O((n+q) log(n)). If this TLEs, we could use Binary lifting in linear time preprocessing, to get O(n + qlog(n))

Edit: Because the statement says: It is guaranted that the summation of all the queries for each test case will not exceed 200000. After we have the jump pointers we don't need any binary jumping, and can just follow the pointers brute force to get O(n+ q + 200000).

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +19 Vote: I do not like it

    Thank you very much! I think I get what you're saying, Ill try implementing it!