Table of content
Teleporter | Description |
---|---|
I. Lyndon Definitions | Definitions of Lyndon word, Lyndon factorization, ... |
II. Duval Algorithm | Talk about the duval algorithm and how it works |
III. Real Life Applications | Motivation to learn the algorithm |
IV. Programming Applications | The code is short and simple to implement |
V. My Questions | Are there any other applications ? |
................................................................... | .......................................................................................................................... |
I. Lyndon Factorization
1) String Concatenation
Definition: The operation of joining character strings end-to-end
Property I: Non-empty string $$$S$$$ is a concatenation of all substrings of itself
Property II: Non-empty string $$$S$$$ is a concatenation of any empty string at any position in itself with itself
Property III: Let $$$A, B, C$$$ the non-empty string then $$$A + B + C = A + (B + C) = (A + B) + C$$$
For convention, let define the operator + is string concatenation
2) Lyndon Word
Definition: A nonempty string that is strictly smaller in lexicographic order than all of its rotations.
Property I: Lyndon word is nonempty and lexicographically strictly smaller than any of its proper suffixes.
Property II: Let $$$S, T, Z$$$ is nonempty word. $$$S$$$ is Lyndon word if $$$S < T\ \ \ \forall\ S = Z + T$$$
Property III: Lyndon word cant be factorized. It means that the only factor in its factorization is itself.
3) Lyndon Factorization
Definition: Split the string into many lyndon words in such a way that the words in the sequence are in lexicographically non-increasing order.
Property I: For $$$s = s_1 + s_2 + \dots + s_k$$$ where $$$s_1, s_2, \dots, s_k$$$ are lyndon words and in non-increasing order ($$$s1 \geq s2 \geq \dots \geq s_k$$$)
Property II: Lyndon factorization is unique.
Property III: The last Lyndon Factor is Lexicographically Smallest Suffix of the string
4) Least Starting Position (LSP)
Definition: The minimal position of some substrings that make it LMSR.
Property I: Let $$$X$$$ the substring of $$$S$$$ that its starting position $$$p$$$ is LSP. Then some Lyndon Factors in $$$X$$$ has the LSP $$$p$$$
Property II: $$$K$$$-repeated String, where each substring has size $$$L$$$ then there are $$$K$$$ LSP: $$$0, L, 2L, \dots, (K-1)L$$$
Property III: The Circular String start from LSP of given string is Lexicographically Minimal String Rotation
II. Duval Algorithm
Exist Time: 1983
Duval algorithm is an effecient algorithm for listing the Lyndon words of length at most $$$n$$$ with a given alphabet size $$$s$$$ in lexicographic order in $$$O(n)$$$
The position of the last Lyndon factorized word from Duval algorithm provides minimum cyclic string
III. Real Life Applications
1) Finger print identification:
- We can encode the finger print into many detailed circular strings. How to search such finger print again from those in very huge data base ? Circular comparision using lyndon factorization is requried.
2) Biological genetics:
- In some cases, we need to know whether these two group's genetics are belonged to the same bacteria, virus.
3) Games:
- Well, ofcourse we can apply the algorithm in some language/words-related games
IV. Programming Applications
1) Least rotation to make smallest lexicographical ordered string.
The problem:
Given a string $$$S$$$ of size $$$N$$$
A right-rotation is that move the leftmost character to rightmost of $$$S$$$
Find the least right-rotation to make $$$S$$$ become the smallest lexicographical ordered string
Important Property: After those right-rotations, the string will be Minimum Acyclic String
The solution:
One thing we can come in mind is to use hash or string algorithms in $$$O(n\ log\ n)$$$, but they are kinda complex
Some other approachs can be found here
- Bruteforces Solution: Let $$$t = s + s$$$. Then for each substring of size $$$|s|$$$, we compare and find the smallest
- Duval Solution: We can apply lyndon factorization with a little bit observation
Practices Problem:
V. My Question
1) Are there any other programming applications for Lyndon Factorization ?
- The algorithm is simple to code while running pretty fast as its low complexities. It would be kinda sad if there is only one main application, isnt it :(
2) Are there any other problems for Lyndon Factorization ?
- To remember something better and understand the algorithm deeper, we need to practice right :D It would be nice if there are some more problems
In the Dual Elimination Solution, if it has a name, please tag me in
There is a funny thing that the solution still work fine with odd length candidate list
And I think there is an improvement to make it completely Linear
EDIT: I gonna make another blog for that
Auto comment: topic has been updated by SPyofgame (previous revision, new revision, compare).
Added a little bit more main properties and definitions that might help you understand the algorithm
You can use Lyndon words to construct a De Bruijn Sequence. See https://github.com/bqi343/USACO/blob/master/Implementations/content/combinatorial%20(11.2)/DeBruijnSeq.h implementation from Benq, which cites this problem.
P.S. sorry for full URL, it seems CF does not like it being embedded.
Thank you, I will add the De Bruijn Sequence Construction soon
It is failing on this testcase
input : abaabbaa
correct output :aaabaabb
user output :aabaabba
int duval(const string &s) {
}
int main(){
}