Short problem statement: Given T < 200 binary strings S (|S| ≤ 105), find the size of the shortest pattern that doesn't match S for all input strings.
Examples :
S = 011101001, answer = 3 (doesnt match '000')
S = 11111, answer = 1 (doesnt match '0')
My current solution is building a suffix automaton for S and searching all patterns of size i (i=1, i=2, ...) while the number of matches of this size equals 2i. When i find some k such that matches(k) < 2k, this is the answer. This is O(|S|) for building suffix automata plus for matchings, which i think is always small but still relevant.
This solution gets TLE. Can anyone help me with a faster solution for this problem? Thanks in advance.
EDIT 2: Solved the problem using the strategy below. I will leave the blog here any way because this problem looks interesting so i want to share the solution in case any one is interested :)
Run BFS in suffix automata starting from root node until we find some node that has less than 2 links. Let p be the length of the path from root to this node. This node represents some pattern of length p+1 that doesn't appear in the string S. So the answer is p+1.
Auto comment: topic has been updated by skavurskaa (previous revision, new revision, compare).
Auto comment: topic has been updated by skavurskaa (previous revision, new revision, compare).
Auto comment: topic has been updated by skavurskaa (previous revision, new revision, compare).
Can you share the link of the problem?, please. I think that I can solve the problem more easily(without suffix automaton).
URI Online Judge (single test case per file even though problem statement says otherwise): http://www.urionlinejudge.com.br/judge/en/problems/view/2093
Also in spoj (T<200 tests per file): http://spoj.com/problems/TAP2013E/
If you analyze the problem more thoroughly, you may observe the interesting fact that the length of the solution callot exceed ⌈ log(L)⌉ (why?). This way you can solve the problem much more easily (albeit in a "worse" time bound of O(L * log(L))).
Edit: Also, keep in mind that if you want to keep track of strings of small length, you can as well use a simpler "Rabin Karp" approach (hashing).
I had this feeling that the answer could never be too big! That's why my first idea was testing all small patterns using suffix automaton pattern matching utility. Unfortunately it got TLE. Maybe a simpler approach like yours, which doesn't build the SA, can pass in the time limits.
Thanks for the contribution
can we do it by binary-search + Bruteforce(to generate all binary string of particular length) + Rabin-Karp(to find a match)? but it has high time complexity which will not pass in given constraints. can you suggest any better way?
Auto comment: topic has been updated by skavurskaa (previous revision, new revision, compare).