Algorithm Wildcard Searching with *
Difference between en5 and en6, changed 229 character(s)
I am thinking about an effient algorithm for wildcard searching with * representing any characters with any length.↵

aa*c, she*he, *she*he↵


Example:↵

1.   caa find aa*c↵

     return "DOES NOT EXIST"↵

2.   hesherheshe find she*he↵
     return 2↵

     Because sherheshe begins at index 2↵
3. sherheshe find *she*he*↵
     return 0↵
     Because the whole string↵
When I am supposed **to return, say,  the beginning index of the first matching instance**.↵


Say the pattern is of length M and the document is of length N, and the pattern has K '*' signs. I can think of a solution that first uses AC Automation to find all occurences of each chunk in O(N + M), with bitmask.↵

While converting the bitmask to indexes takes O(N * K)↵

Then binary search for the last possible beginning positions for each chunk. This could take O (K log N)↵



So the overall time complexity is still O (N * K), any way to do better?↵




 **References**↵

https://codeforces.net/blog/entry/111380↵

https://codeforces.net/problemset/problem/1023/A↵

https://codeforces.net/blog/entry/127169↵

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en7 English cardcounter 2024-12-21 23:01:06 18
en6 English cardcounter 2024-12-21 22:58:46 229
en5 English cardcounter 2024-12-21 12:46:01 44 Tiny change: 'm/1023/A\n' -> 'm/1023/A\n\nhttps://codeforces.net/blog/entry/127169\n'
en4 English cardcounter 2024-12-21 12:41:11 2 Tiny change: 'y/111380\nhttps://' -> 'y/111380\n\nhttps://'
en3 English cardcounter 2024-12-21 12:40:45 108
en2 English cardcounter 2024-12-21 12:32:51 7
en1 English cardcounter 2024-12-21 12:30:21 746 Initial revision (published)