Help with Aho-Corasick Problem

By alv-r-, 10 years ago, In English

Hi, I've been trying to solve this problem on SPOJ: http://www.spoj.com/problems/WPUZZLES/ (which I've seen recommended in many places as an Aho-Corasick problem).

I'm currently getting WA on it and can't figure out why. Could someone please help by giving me a hint about what is wrong, or some test cases where it fails?

My current code is this: http://pastebin.com/vzLNwKk6. The Aho-Corasick part is based on this implementation: https://gist.github.com/andmej/1233426

My idea is: put both the word and its reverse in keywords[], then run each line of the puzzle through the matching machine. I'm currently running 2 directions at a time in the loop (4 with the reverses): directions 'A', 'C', 'E', 'G' in one pass and the remaining ones in another.

As far as I can tell, a match is missing when the code prints the answer, but I have no idea under what circumstances this happens.

Any help is appreciated!

aho-corasick, spoj, help


Comments (7)


kien_coi_1997

10 years ago, # |


"You can assume that each word can be found exactly once in the word puzzle."

I don't know whether the test data contains the following input.

1

1 5 3
ABCBA
C
BCB
ABCBA

It allows many possible outputs.

And what about the following input?

Is it valid?


alv-r-

10 years ago, # ^ |

I thought about that. The problem doesn't specify what you should print in the case of palindromes (because there are two possible answers for them).

So my guess is that "You can assume that each word can be found exactly once in the word puzzle" covers this case: there are no palindromes, since, depending on interpretation, you could find them twice.

I tried to change the search order in a way that would print the "first" match (meaning the one with smaller i and j for its coordinates), still WA :(


knightL

10 years ago, # |

+16

It seems your program fails on this test:

1
1 3
2  
ABC
ABC
CBA


alv-r-

10 years ago, # ^ |

That was exactly the problem. When checking isrev["CBA"], for example, I would treat it as a reversed string and record the result for "ABC", ignoring that "CBA" is also in the word list.

Fixed, got AC, thanks a lot! :)
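
One way to avoid this kind of collision, sketched here in Python rather than taken from the actual fix (which is not shown in the thread): instead of a single isrev flag per string, keep for every keyword string a list of all (word index, reversed?) pairs that produce it. The name keyword_owners is hypothetical.

```python
from collections import defaultdict

def keyword_owners(words):
    """Map each keyword string to every (word_index, is_reversed) pair that produces it."""
    owners = defaultdict(list)
    for index, word in enumerate(words):
        owners[word].append((index, False))       # the word as given
        owners[word[::-1]].append((index, True))  # the word read backwards
    return owners

# With the words from the failing test, "CBA" is both a word in its own
# right and the reverse of "ABC", and both facts are kept.
owners = keyword_owners(["ABC", "CBA"])
print(owners["CBA"])
```

With this layout a match on "CBA" can be credited to both words, each with its own orientation. A palindrome would show up as one word owning the same string twice.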


chocoguy

10 years ago, # |

Can someone explain the solution? I am familiar with Aho-Corasick, but I cannot figure out how to apply it here. Thanks in advance!


alv-r-

10 years ago, # ^ |

You can treat the puzzle as a collection of texts and the words to be found as the keywords. By a collection of texts I mean that each row, column and diagonal (plus the reverse of each) is a text.
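
A minimal Python sketch of that decomposition, assuming the grid is a list of equal-length strings. The function name grid_lines and the tuple layout are illustrative only. It yields each row, column and diagonal once, in a single scan direction, together with its starting cell and step; the reversed readings are left to the caller.

```python
def grid_lines(grid):
    """Yield (text, start_row, start_col, d_row, d_col) for every row, column and diagonal."""
    rows, cols = len(grid), len(grid[0])

    def walk(r, c, dr, dc):
        # Collect characters from (r, c), stepping by (dr, dc) until leaving the grid.
        chars = []
        rr, cc = r, c
        while 0 <= rr < rows and 0 <= cc < cols:
            chars.append(grid[rr][cc])
            rr += dr
            cc += dc
        return "".join(chars), r, c, dr, dc

    # Rows, read left to right.
    for r in range(rows):
        yield walk(r, 0, 0, 1)
    # Columns, read top to bottom.
    for c in range(cols):
        yield walk(0, c, 1, 0)
    # Down-right diagonals: start on the top row, then on the left column.
    for c in range(cols):
        yield walk(0, c, 1, 1)
    for r in range(1, rows):
        yield walk(r, 0, 1, 1)
    # Down-left diagonals: start on the top row, then on the right column.
    for c in range(cols):
        yield walk(0, c, 1, -1)
    for r in range(1, rows):
        yield walk(r, cols - 1, 1, -1)

for line in grid_lines(["ABC", "DEF"]):
    print(line)
```

Every cell appears in exactly four of these texts (one row, one column, two diagonals), so the total length of all texts is 4 times the number of cells.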

So, if you build a matching automaton (using Aho-Corasick) from the keywords, you can run each 'text' through it and see which keywords occur in it in O(n) time (n = length of the text), plus the number of matches reported.
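
For readers who want to see the moving parts, here is a compact dictionary-based Aho-Corasick sketch in Python. It is not the C++ implementation linked in the post; build_automaton and find_all are illustrative names, and the layout (goto, fail and output tables) is just one common way to write it.

```python
from collections import deque

def build_automaton(keywords):
    """Build goto/fail/output tables for the given keywords."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # indices of keywords that end at each state

    # Phase 1: insert every keyword into a trie.
    for index, word in enumerate(keywords):
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(index)

    # Phase 2: breadth-first pass that sets the failure links.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Keywords ending at the failure target also end here.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def find_all(text, automaton):
    """Return (end_position, keyword_index) for every keyword occurrence in text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for index in out[state]:
            matches.append((pos, index))
    return matches

automaton = build_automaton(["he", "she", "his", "hers"])
print(find_all("ushers", automaton))
```

Each match is reported by the position of its last character, so the start is end - len(keyword) + 1.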

What I did was treat both each word and its reverse as keywords, so you only need to run the 'texts' in one direction: a hit on a reversed keyword is a match for the word read in the opposite direction. (This works because you are also guaranteed that each word appears only once.) Then I match 2 directions (4 counting the reversed ones) at a time in the same iteration, keeping two separate 'currentStates'. In short: only two passes over the matrix cover all 8 directions.
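
A hedged Python sketch of how a hit on a keyword could be turned back into an answer. It assumes the orientation letters run A = north and then clockwise to H = north-west, that coordinates are 0-based, and that each text is described by its starting cell and step (d_row, d_col). The names make_keywords and resolve are hypothetical.

```python
# Assumed orientation letters, keyed by (d_row, d_col): A = north, then clockwise.
LETTER = {(-1, 0): "A", (-1, 1): "B", (0, 1): "C", (1, 1): "D",
          (1, 0): "E", (1, -1): "F", (0, -1): "G", (-1, -1): "H"}

def make_keywords(words):
    """Return one (keyword, word_index, is_reversed) entry per word and orientation."""
    entries = []
    for index, word in enumerate(words):
        entries.append((word, index, False))
        entries.append((word[::-1], index, True))
    return entries

def resolve(entry, end, start_row, start_col, d_row, d_col):
    """Turn a keyword match ending at offset `end` of a text into
    (word_index, row, col, letter) for the word's first letter."""
    keyword, index, is_reversed = entry
    if is_reversed:
        # The word starts where the reversed keyword ends and runs
        # against the scan direction.
        first = end
        direction = (-d_row, -d_col)
    else:
        first = end - len(keyword) + 1
        direction = (d_row, d_col)
    row = start_row + first * d_row
    col = start_col + first * d_col
    return index, row, col, LETTER[direction]

# Row 0 of a 1x3 puzzle "ABC", scanned eastwards from (0, 0).
entries = make_keywords(["ABC", "CBA"])
for entry in entries:
    if entry[0] == "ABC":                # both entries match, ending at offset 2
        print(resolve(entry, 2, 0, 0, 0, 1))
```

Keeping one entry per (word, orientation) pair, instead of one per distinct string, means the word "CBA" and the reverse of "ABC" stay separate even though they spell the same keyword.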

As far as I've seen, 2D matching problems like this can usually be solved with some sort of complete search with backtracking, but in this case (1000x1000 board, keywords of length up to 1000) that is too slow, hence the need for Aho-Corasick (I've also seen someone else do it with Rabin-Karp).


chocoguy

10 years ago, # ^ |

thank you very much, alv-r-
