imaking's blog

By imaking, history, 5 years ago, In English

We have a two strings: A and B. We need to find miniumum number of times that we could concatenate B to get A. Also we can omit any number of letters from B in any of the concatenations if we want. Example:

A = "ZAZA"
B = "BAZ"

A is obtained from 3 copies of B A= (--Z)(-AZ)(-A-)

Can you suggest any algorithmic approach which is better than O(nm) where n is length of A and m is length of B.

  • Vote: I like it
  • +4
  • Vote: I do not like it

»
5 years ago, # |
Rev. 4   Vote: I like it +2 Vote: I do not like it

O(n*26) solution exists where 26 = size of the alphabet

»
5 years ago, # |
  Vote: I like it +15 Vote: I do not like it

It's identical to this problen

»
5 years ago, # |
  Vote: I like it -13 Vote: I do not like it

First, concatenate B until it reaches the length of 2 * A. Then use a rolling hash to check the validity of each "starting point" and take the minimum of all valid positions. Total time is O(N).

»
5 years ago, # |
  Vote: I like it +36 Vote: I do not like it

One way would be to use a greedy approach letter for letter of A. Keep a pointer of where you currently are at string B. Then look for the first appearance of the current letter you're searching which is to the right of this pointer. When the pointer loops around B, that's one more concatenation you'll need. For example:

Current letter in A  |  Pointer in B  |  Number of concatenations
-----------------------------------------------------------------
         -           |       0        |            1             
         Z           |       2        |            1             
         A           |       1        |            2             
         Z           |       2        |            2             
         A           |       1        |            3             

If you store the indexes where each letter appears in multiple sets, then you can find the next pointer in B with a lower_bound query, giving an O(n log m) time solution with O(m) memory. Another way would be to preprocess a table containing, for each letter and each index of B, where is its next appearance, giving an O(m * sigma) time for preprocessing (sigma being the size of the alphabet), O(n) time for solving and O(m * sigma) memory. This probably can be optimized further...

  • »
    »
    5 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    This is exactly how I have solved this question during my onsite interview

    I was told it was optimal enough for the problem :D

    I was asked to create a function capable of saying if it was possible to reach string A from B, the follow up question was the minimum steps to reach that, like OP asked.