Hi! I'm having trouble understanding this O(NlogN) implementation of suffix array construction :
Code
In particular:
for (i = 0; i < n; i++) { // count the frequency of each integer rank
c[i + k < n ? RA[i + k] : 0]++;
}
0 is used as the rank for an empty string when it is also the rank of the smallest suffix, as seen below.
tempRA[SA[0]] = r = 0; // re-ranking; start from rank r = 0
// compare adjacent suffixes
for (i = 1; i < n; i++) {
// if same pair => same rank r; otherwise,increase r
tempRA[SA[i]] = (RA[SA[i]] == RA[SA[i - 1]] && RA[SA[i] + k] == RA[SA[i - 1] + k]) ? r : ++r;
}
SA[i] + k can go out of bounds.
These seem like bugs, but the algorithm works, but strangely only when a sentinel character such as $ is appended to the end. Why?
Thanks in advance!