String Algorithm: Lyndon Factorization and Duval Algorithm

Правка en7, от SPyofgame, 2021-04-14 15:25:50


Table of content

Teleporter Description
I. Lyndon Factorization Definitions of Lyndon word, Lyndon factorization
II. Duval Algorithm Talk about the duval algorithm and how it works
III. Real Life Applications Motivation to learn the algorithm
IV. Programming Applications The code is short and simple to implement
V. My Questions Are there any other applications ?
................................................................... ..........................................................................................................................










I. Lyndon Factorization



  • Definition: Lyndon word is a nonempty string that is strictly smaller in lexicographic order than all of its rotations.

  • Alternatively: Lyndon word is nonempty and lexicographically strictly smaller than any of its proper suffixes.

  • Alternatively: Let $$$S, T, Z$$$ is nonempty word. $$$S$$$ is Lyndon word if $$$S < T\ \ \ \forall\ S = Z + T$$$

  • Lyndon factorization is that for $$$s = s_1 + s_2 + \dots + s_k$$$ where $$$s_i$$$ is nonempty shortest-able string and in decreasing order $$$s1 \geq s2 \geq \dots \geq s_k$$$

  • Special Property: Lyndon factorization is unique.

Notice that the operator + is string concatenation











II. Duval Algorithm



  • Exist Time: 1983

  • Duval algorithm is an effecient algorithm for listing the Lyndon words of length at most $$$n$$$ with a given alphabet size $$$s$$$ in lexicographic order in $$$O(n)$$$

  • The position of the last Lyndon factorized word from Duval algorithm provides minimum cyclic string

The pseudo algorithm
Implementation - O(n) Time - O(1) Auxiliary Space
Detail Implementation - O(n) Time - O(1) Auxiliary Space










III. Real Life Applications



1) Finger print identification:
  • We can encode the finger print into many detailed circular strings. How to search such finger print again from those in very huge data base ? Circular comparision using lyndon factorization is requried.

2) Biological genetics:
  • In some cases, we need to know whether these two group's genetics are belonged to the same bacteria, virus.

3) Games:
  • Well, ofcourse we can apply the algorithm in some language/words-related games










IV. Programming Applications



1) Least rotation to make smallest lexicographical ordered string.


The problem:
  • Given a string $$$S$$$ of size $$$N$$$

  • A right-rotation is that move the leftmost character to rightmost of $$$S$$$

  • Find the least right-rotation to make $$$S$$$ become the smallest lexicographical ordered string

Important Property: After those right-rotations, the string will be Minimum Acyclic String



The solution:
  • One thing we can come in mind is to use hash or string algorithms in $$$O(n\ log\ n)$$$, but they are kinda complex (I will add the codes of these ways later)

  • Bruteforces Solution: Add all possible acyclic strings into map, the result is the smallest index of the minimal lexicographical string

Bruteforces Solution - O(n^2 log n) Time - O(n^2) Auxiliary Space

  • Dual Elimination: So I found this "kinda-bruteforces-solution" but suprisingly its complexity is almost Linear. I dont know whether if it has a name, so I named it Dual Elimination
Detail Dual-Elimination Solution - O(n) to O(n log n) Time - O(n) Auxiliary Space
None-comment Dual-Elimination Solution - O(n) to O(n log n) Time - O(n) Auxiliary Space
Optimized and Readable Dual-Elimination Solution - O(n) to O(n log n) Time - O(n) Auxiliary Space

  • Duval Solution: We can apply lyndon factorization with a little bit observation
Detail Duval Solution - O(n) Time - O(n) Auxiliary Space
None-comment Duval Solution - O(n) Time - O(n) Auxiliary Space
Optimized Duval Solution - O(n) Time - O(1) Auxiliary Space

Practices Problem:











V. My Question



1) Are there any other programming applications for Lyndon Factorization ?
  • The algorithm is simple to code while running pretty fast as its low complexities. It would be kinda sad if there is only one main application, isnt it :(

2) Are there any other problems for Lyndon Factorization ?
  • To remember something better and understand the algorithm deeper, we need to practice right :D It would be nice if there are some more problems
Теги string algorithm, lyndon words, lyndon factorization, duval algorithm

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en26 Английский SPyofgame 2021-04-25 17:04:35 91
en25 Английский SPyofgame 2021-04-23 07:30:37 5 Tiny change: '+ ... + sx) cases\n ' -> '+ ... + sx + sy) cases\n '
en24 Английский SPyofgame 2021-04-23 05:33:30 0 (published)
en23 Английский SPyofgame 2021-04-23 05:33:05 128759 Reverted to en21
en22 Английский SPyofgame 2021-04-23 05:32:05 128759 (saved to drafts)
en21 Английский SPyofgame 2021-04-23 03:18:18 240
en20 Английский SPyofgame 2021-04-23 02:58:45 634
en19 Английский SPyofgame 2021-04-22 19:45:39 2 Tiny change: 'x[i] = s[i mod n]$ (mean' -> 'x[i] = s[i\ mod\ n]$ (mean'
en18 Английский SPyofgame 2021-04-22 19:27:38 30
en17 Английский SPyofgame 2021-04-22 19:26:32 17
en16 Английский SPyofgame 2021-04-22 19:19:53 0 (published)
en15 Английский SPyofgame 2021-04-22 19:19:36 137 (saved to drafts)
en14 Английский SPyofgame 2021-04-22 19:13:13 49
en13 Английский SPyofgame 2021-04-17 12:52:26 1961
en12 Английский SPyofgame 2021-04-15 02:55:21 60
en11 Английский SPyofgame 2021-04-14 22:41:27 24 Tiny change: '------\n\n' -> '------\n\n--------------------\n\n'
en10 Английский SPyofgame 2021-04-14 16:57:32 9 Tiny change: '-\n\n- **Better Solution:' -> '-\n\n- **Bruteforces Solution:'
en9 Английский SPyofgame 2021-04-14 16:57:02 5097
en8 Английский SPyofgame 2021-04-14 16:54:37 1196
en7 Английский SPyofgame 2021-04-14 15:25:50 76
en6 Английский SPyofgame 2021-04-14 15:19:41 6687 Tiny change: 'oiler>\n\n------' -> 'oiler>\n\n\n\n------'
en5 Английский SPyofgame 2021-04-14 14:29:22 76
en4 Английский SPyofgame 2021-04-14 04:57:52 12
en3 Английский SPyofgame 2021-04-14 04:55:32 1 Tiny change: '.... |\n\n\n\n------' -> '.... |\n\n \n\n------'
en2 Английский SPyofgame 2021-04-13 16:58:24 439
en1 Английский SPyofgame 2021-04-13 16:54:43 13595 Initial revision (published)