Hi,
Like some other suffix data structures, suffix automaton can be applied to a trie naturally. It is obvious that the number of states and transitions are linear.
There's a natural construction algorithm. Let SAM(T) be the suffix automaton for trie T. If we add a new leaf v to trie T, whose father is u with character c. We can obtain SAM(T add v) by the almost same extend function:
Here is an example of extend function, 'last' is the corresponding state after u added:
int sa_extend (int last, char c) {
int cur = sz++;
st[cur].len = st[last].len + 1;
int p;
for (p=last; p!=-1 && !st[p].next.count(c); p=st[p].link)
st[p].next[c] = cur;
if (p == -1)
st[cur].link = 0;
else {
int q = st[p].next[c];
if (st[p].len + 1 == st[q].len)
st[cur].link = q;
else {
int clone = sz++;
st[clone].len = st[p].len + 1;
st[clone].next = st[q].next;
st[clone].link = st[q].link;
for (; p!=-1 && st[p].next[c]==q; p=st[p].link)
st[p].next[c] = clone;
st[q].link = st[cur].link = clone;
}
}
return cur;
}
I applied this method successfully on some problems. However, how to analyse the time complexity? The amortized analysis of the 'redirect operation' can not be applied to a trie at first glance. If the time complexity is actually not linear, how to hack it?
Auto comment: topic has been updated by BaconLi (previous revision, new revision, compare).
There is already scientific paper about it =) https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35395.pdf
What's tl;dr? Any O(nΣ) worst-case algorithm?
FYI, in science papers tl;dr is usually written on the first page, it is called an abstract.
.
We need to go deeper...
Jokes away, actually I wanted tl;dr of algorithm. Like "it's just algorithm from this entry but with bfs" or "it's some crazy shit, not appliable to the competitive programming"
Thanks, that's really helpful! I read the time complexity analysis part. It said "The analysis of the total number of redirection iterations relies on an extension of the analysis for the single-string case" and basically nothing detailed. I am still confused about how to "rely on"?
This implementation has O(n2) complexity. Example: trie of strings b, ab, aab, ..., anb. If you will add leaves from bottom to top, it will fail.
Also you'll have broken suffix link tree here and some useless states in automaton. The version without such side-effects is here (with demonstration of TLE): #8SCk0x
But I don't know what's the complexity when you construct it via bfs... Lower bound is O(nΣ).
Thanks, you are right. The algorithm in that paper added leaves via bfs order.