String Algorithms - Codeforces

dragoon's blog

String Algorithms

By dragoon, 11 years ago, in English

Three popular data structures for string-related problems are the suffix array, the suffix automaton and the suffix tree. So what are the advantages and disadvantages of each? Are there any types of problems that are easy to tackle with one of these three but not with the other two? Let's gather this kind of information in this post. :)

string, suffix array, suffix automata


Comments (9)


asobolev

11 years ago

+25

From a theoretical perspective, there is a paper by Abouelhoda, Kurtz and Ohlebusch, "Replacing suffix trees with enhanced suffix arrays", claiming that any problem solvable with a suffix tree can be solved with the same time complexity using [enhanced] suffix arrays:

Abstract
The suffix tree is one of the most important data structures in string processing and comparative genomics. However, the space consumption of the suffix tree is a bottleneck in large scale applications such as genome analysis. In this article, we will overcome this obstacle. We will show how every algorithm that uses a suffix tree as data structure can systematically be replaced with an algorithm that uses an enhanced suffix array and solves the same problem in the same time complexity. The generic name enhanced suffix array stands for data structures consisting of the suffix array and additional tables. Our new algorithms are not only more space efficient than previous ones, but they are also faster and easier to implement.
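To make the "suffix array plus additional tables" idea a bit more concrete, here is a minimal Python sketch of the most basic suffix array query: finding all occurrences of a pattern with two binary searches. It is not the paper's algorithm (which uses extra tables to match suffix tree complexity); the names `naive_suffix_array` and `find_occurrences` are made up for illustration, and the construction is deliberately the slow sort-all-suffixes one.

```python
def naive_suffix_array(s):
    # Sort suffix start positions by the suffix text itself.
    # Simple but slow (each comparison may scan O(n) characters).
    return sorted(range(len(s)), key=lambda i: s[i:])


def find_occurrences(s, sa, pattern):
    """Return sorted start positions of pattern in s, given suffix array sa."""
    m = len(pattern)

    # First suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # First suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Suffixes in sa[start:lo] all begin with pattern.
    return sorted(sa[start:lo])
```

For example, with `s = "banana"` the suffix array is `[5, 3, 1, 0, 4, 2]`, and searching for `"ana"` returns `[1, 3]`. Each binary search step compares up to `len(pattern)` characters, so a query costs O(m log n) on top of construction.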


yeputons

11 years ago

+49

Here is my humble opinion:

Suffix array. It's the simplest structure, but it is built in $O(n \log n)$ instead of linear time. It requires linear memory. It's usually easy to reformulate a complex problem in terms of a suffix array, but the solution is not always easy to implement: tons of binary searches and LCP queries are common. It's not so hard, but it can be frightening in the beginning.
Suffix tree. It can be built online in linear time (in contrast with the suffix array), but requires up to O(nΣ) memory (Σ is the size of the alphabet). Solutions that use it are very similar to the ones using a suffix array. They are usually a bit simpler to implement, though.
Suffix automaton. It's the most unintuitive structure for me. Its time and memory consumption coincide with the suffix tree's. However, it is easier to implement. The set of problems solved by the automaton differs a lot from that of the tree and the array. The automaton uses 'right contexts' instead of prefixes, and this can be quite confusing, as it is for me. So I don't use the automaton unless the problem is about "writing an automaton".

I prefer the suffix array in contests, but the automaton may be preferable in some cases. In contrast, my teammate likes the automaton more than the array. I don't remember a problem which can be solved with a suffix tree only (except the training ones from Summer Informatics School).
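As an illustration of the suffix array plus LCP workflow described above, here is a minimal Python sketch. It builds the array by prefix doubling and the LCP array with Kasai's algorithm, then counts distinct substrings. Two caveats: the function names are made up for this example, and because it relies on Python's comparison sort, construction here is O(n log² n); the O(n log n) bound needs a radix sort in each doubling round.

```python
def build_suffix_array(s):
    """Prefix doubling: sort suffixes by their first 2, 4, 8, ... characters."""
    n = len(s)
    sa = list(range(n))
    rank = [ord(c) for c in s]
    k = 1
    while n > 1:
        # Key of suffix i: rank of its first k chars, then of the next k chars.
        key = lambda i: (rank[i], rank[i + k] if i + k < n else -1)
        sa.sort(key=key)
        new_rank = [0] * n
        for j in range(1, n):
            new_rank[sa[j]] = new_rank[sa[j - 1]] + (key(sa[j]) != key(sa[j - 1]))
        rank = new_rank
        if rank[sa[-1]] == n - 1:  # all ranks distinct: fully sorted
            break
        k *= 2
    return sa


def build_lcp(s, sa):
    """Kasai's algorithm: lcp[j] = LCP of suffixes sa[j] and sa[j + 1]."""
    n = len(s)
    rank = [0] * n
    for j, i in enumerate(sa):
        rank[i] = j
    lcp = [0] * max(n - 1, 0)
    h = 0
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]  # suffix just before suffix i in sorted order
        while i + h < n and j + h < n and s[i + h] == s[j + h]:
            h += 1
        lcp[rank[i] - 1] = h
        if h > 0:
            h -= 1
    return lcp


def count_distinct_substrings(s):
    # Every suffix contributes its length minus the part shared
    # with the previous suffix in sorted order.
    n = len(s)
    return n * (n + 1) // 2 - sum(build_lcp(s, build_suffix_array(s)))
```

For `"banana"` this gives the suffix array `[5, 3, 1, 0, 4, 2]`, the LCP array `[1, 3, 0, 0, 2]`, and 15 distinct substrings.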


dj3500

11 years ago (reply)

+19

The suffix array can be built in linear time too, and certainly faster than a suffix tree, using the Kärkkäinen-Sanders algorithm (I've also seen it under the name DC3). Most implementations look daunting, but it is possible to prepare one which is both fast and short (easy and quick to retype from a code notebook). Of course, the O(n log n) one is way simpler.


ftiasch

11 years ago (reply)

I prefer the suffix array too~

However, some problems with strict time limits may force me to use the automaton :(


jlcastrillon

11 years ago

There is another type of suffix array, called the Dynamic Extended Suffix Array, which is used for online suffix array problems. Here is the link: http://www-igm.univ-mlv.fr/~lecroq/articles/jda2009.pdf. In practice it is not so easy to implement, but it is very useful.


aajjbb

11 years ago

I'm currently learning all these string data structures and I'm having trouble implementing them. Does anyone have a 'clear' implementation of them that is good for understanding? The ones I managed to find are all obscure and weird, which makes everything harder for beginners.


jlcastrillon

11 years ago

In this book, almost all the algorithms on texts are well explained, including the suffix automaton.

http://www.google.com.cu/url?sa=t&rct=j&q=&esrc=s&source=web&cd=8&cad=rja&ved=0CGEQFjAH&url=http%3A%2F%2Fbooks.google.com%2Fbooks%2Fabout%2FText_Algorithms.html%3Fid%3DtSxLkRIyvoIC&ei=6d4sUsPONYfY9ATkyoCgBA&usg=AFQjCNFEkZyk_QHOcQaTzFldcBJ93rcO7g&bvm=bv.51773540,d.eWU


Mintoo

10 years ago

Many texts say that there is a relation between the suffix automaton of a string and the suffix tree of the reversed string. I could not grasp that. It would be great if someone could explain it.


adamant

10 years ago (reply)

Suffix tree of the reverse string == suffix link tree of automaton of the string
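One way to see this relation is to build the automaton and look at the longest string of each state. The Python sketch below is only an illustration under stated assumptions: it uses the standard online construction, the function names and the `first_end` array (first end position of each state) are made up for this example, and the tree it produces is the suffix tree of the reversed string without a terminator character, so suffixes that are prefixes of other suffixes show up as explicit nodes.

```python
def build_suffix_automaton(s):
    """Standard online construction; state 0 is the root (empty string)."""
    nxt, link, length, first_end = [{}], [-1], [0], [-1]
    last = 0
    for i, ch in enumerate(s):
        cur = len(nxt)
        nxt.append({}); link.append(-1)
        length.append(length[last] + 1); first_end.append(i)
        p = last
        while p != -1 and ch not in nxt[p]:
            nxt[p][ch] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = nxt[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split q: the clone keeps the shorter strings of q.
                clone = len(nxt)
                nxt.append(dict(nxt[q])); link.append(link[q])
                length.append(length[p] + 1); first_end.append(first_end[q])
                while p != -1 and nxt[p].get(ch) == q:
                    nxt[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    return nxt, link, length, first_end


def reversed_link_tree_edges(s):
    """Edges (parent, child) of the suffix link tree, each node written as
    the reverse of the longest string of its state."""
    _, link, length, first_end = build_suffix_automaton(s)

    def node(v):
        longest = s[first_end[v] - length[v] + 1:first_end[v] + 1]
        return longest[::-1]

    return [(node(link[v]), node(v)) for v in range(1, len(link))]
```

In the returned edges, every child string extends its parent string (a suffix link drops characters from the front of a substring of `s`, which is the end of the reversed one), two children of the same node never continue with the same character, and every suffix of `s[::-1]` appears as a node. Those are exactly the properties of a suffix tree of the reversed string, with the edge label being the part of the child that extends the parent.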
