Algorithm Wildcard Searching with *

Revision en3, by cardcounter, 2024-12-21 12:40:45

I am thinking about an effient algorithm for wildcard searching with * representing any characters with any length.

aa*c, she*he, *she*he

When I am supposed to return, say, the beginning index of the first matching instance.

Say the pattern is of length M and the document is of length N, and the pattern has K '*' signs. I can think of a solution that first uses AC Automation to find all occurences of each chunk in O(N + M), with bitmask.

While converting the bitmask to indexes takes O(N * K)

Then binary search for the last possible beginning positions for each chunk. This could take O (K log N)

So the overall time complexity is still O (N * K), any way to do better?

References

https://codeforces.net/blog/entry/111380 https://codeforces.net/problemset/problem/1023/A

Tags pattern_matching, wildcard

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en7 English cardcounter 2024-12-21 23:01:06 18
en6 English cardcounter 2024-12-21 22:58:46 229
en5 English cardcounter 2024-12-21 12:46:01 44 Tiny change: 'm/1023/A\n' -> 'm/1023/A\n\nhttps://codeforces.net/blog/entry/127169\n'
en4 English cardcounter 2024-12-21 12:41:11 2 Tiny change: 'y/111380\nhttps://' -> 'y/111380\n\nhttps://'
en3 English cardcounter 2024-12-21 12:40:45 108
en2 English cardcounter 2024-12-21 12:32:51 7
en1 English cardcounter 2024-12-21 12:30:21 746 Initial revision (published)