№	Пользователь	Рейтинг
1	tourist	4009
2	jiangly	3823
3	Benq	3738
4	Radewoosh	3633
5	jqdai0815	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	ksun48	3390
10	gamegame	3386

№	Пользователь	Вклад
1	cry	165
2	maomao90	163
2	Um_nik	163
4	atcoder_official	161
5	adamant	160
6	-is-this-fft-	158
7	awoo	157
8	TheScrasse	154
9	nor	153
9	Dominater069	153

Table of content

Teleporter	Description
I. The Problem	Describe about the problem
II. Bruteforce Approach	Simple way to do the problem
III. Hashing Approach	Reduce circular substring comparing complexity
IV. Sqrt Decomposition	Divide into parts, solve each part then solve the whole part
V. KMP Approach	Magical Linear Booth-KMP Algorithm
VI. Lyndon Factorization Approach	Incredible Linear Duval-Lyndon Algorithm
VII. Suffix Array Approach	Hard for constructing but simple to find result
VIII. Suffix Automation Approach	Complex Linear Construction but simple to find result
IX. Elimination Tournament Algorithm	From initial candidates, eliminate worst candidates until find one final champion
...................................................................	..........................................................................................................................

I. The Problem

A) The problem:

1) Statement:

Sometimes being as a coder, you will find some real life problems about strings. Some strings are just simple an array of characters with 2 ended. Sometimes you will face with circular strings, the one which circular around. Let take you to the biological laboratory, dont worry, I just teleport you without requiring any money. In the lab, you will find many interesting bacteria. How could you detect which one is belonged to the same group or not ? Let simplify the problem that two bacteria are in the same group if their circular encoded-DNA strings are the same. So how can you compare two randomly encoded-DNA ? One of the effectively way to do that is convert all strings into Minimal Lexicographical Acyclic Strings (a.k.a Lexicographically Minimal String Rotation) and then hash them for faster comparison.

This bring down to a much simpler problem. Let define a right-rotation of a string is that putting the leftmost character to the rightmost position. Given circular DNA string $$$S$$$ of size $$$n$$$. Find the minimum right-rotation to make it Minimal Lexicographical for all such rotations.

2) Question:

Given a string of latin words $$$S$$$ of size $$$N$$$
A right-rotation of $$$S$$$ is that $$$S_2 + S_3 + ... + S_{|S|} + S_1$$$ where ('+' notation mean concatenation)
Find the least right-rotation to make $$$S$$$ become the Lexicographically Minimal

3) Constraint:

$$$S$$$ is a string of lower latin (you can expand the problem to $$$A$$$ is an array of integers)
$$$|S| \leq 10^5$$$ (you can also expand the limit to $$$10^6$$$ or even higher)

4) Example:

Example 1

Input:  a
Output: 0
String: a

Example 2

Input:  ba
Output: 1
String: ab

Example 3

Input:  aaaaaa
Output: 0
String: aaaaaa

Example 4

Input:  aaaaab
Output: 0
String: aaaaab

Example 5

Input:  aaaaba
Output: 5
String: aaaaab

Example 6

Input:  aaabaa
Output: 4
String: aaaaab

Example 7

Input:  aabaaa
Output: 3
String: aaaaab

Example 8

Input:  abaaaa
Output: 2
String: aaaaab

Example 9

Input:  baaaaa
Output: 1
String: aaaaab

Example 10

Input:  baaaab
Output: 1
String: aaaabb

Example 11

Input:  abbbba
Output: 5
String: aabbbb

Example 12

Input:  baabaa
Output: 1
String: aabaab

Example 13

Input:  abaabaaabaababaaabaaababaab
Output: 14
String: aaabaaababaababaabaaabaabab

All Examples As Once

Input:  a ba aaaaaa aaaaab aaaaba aaabaa aabaaa abaaaa baaaaa baaaab abbbba baabaa abaabaaabaababaaabaaababaab
Output: 0  1 0      0           5     4     3     2     1      1          5  1                   14 
Output: 0 1 0 0 5 4 3 2 1 1 5 1 14 
String: a ab aaaaaa aaaaab aaaaab aaaaab aaaaab aaaaab aaaaab aaaabb aabbbb aabaab aaabaaababaababaabaaabaabab

B) Real Life Applications

1) Finger print identification:

We can encode the finger print into many detailed circular strings. How to search such finger print again from those in very huge data base ? Circular comparision using lyndon factorization is requried.

2) Biological genetics:

In some cases, we need to know whether these two group's genetics are belonged to the same bacteria, virus.

3) Games:

Well, ofcourse we can apply the algorithm in some language/words-related games

C) Practice Links

1) CSES — Minimal Rotation

Require: Output least rotation string with $$$|S| \leq 10^6$$$

2) SPOJ — Minimum Rotations

Require: Output least rotation number with $$$|S| \leq 10^5$$$

3) UVA — Glass Beads

Require: Queries output least rotation number with $$$|S| \leq 10^4$$$

II. Bruteforce Apprach

A) Sorting and Searching:

1) Idea:

We will take all possible circular substring in $$$S$$$
For every possible rotation of the string, we add it with its number of rotations needed
Then we sort all the array by their string (if they are equal, we take one with its smaller index).
The answer will be the smallest string (minimal lexicographical) with its lowest index.

2) Complexity:

Compare two random strings $$$S$$$ and $$$T$$$ cost $$$O(min(|S|, |T|)) = O(n)$$$ (since $$$|S| = |T| = n$$$)
The complexity of sorting all array is $$$O(n log n)$$$
Hence the total complexity is $$$O(n^2 log n)$$$

3) Implementations:

Vector Sorting Bruteforce Solution - O(n^2 log(n)) time - O(n^2) auxiliary space

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;
    
int min_cyc(string s)
{
    vector<pair<string, int> > V; 
    for (int i = 0; i < s.size(); ++i)
    {
        V.push_back(make_pair(s, i));
        rotate(s.begin(), s.begin() + 1, s.end());
    }

    sort(V.begin(), V.end());
    return V.front().second;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

Map Sorting Bruteforce Solution - O(n^2 log(n)) time - O(n^2) auxiliary space

#include <algorithm>
#include <iostream>
#include <map>

using namespace std;
    
int min_cyc(string s)
{
    map<string, int> M;
    for (int i = 0; i < s.size(); ++i)
    {
        if (!M.count(s)) M[s] = i;
        rotate(s.begin(), s.begin() + 1, s.end());
    }
    return (M.begin() -> second);
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

B) Loop over all rotations:

1) Idea:

Why should we store then sort anyway, we just need to loop over all rotations and comparing to select the smaller

2) Implementations:

For convention, you can just duplicate the string $$$T = S + S$$$
Then $$$S$$$ at the $$$ith$$$ rotations ($$$0 \leq i < n$$$) is the string $$$T[i \dots i + n]$$$

Bruteforce Solution - O(n^2) time - O(n) auxiliary space

#include <iostream>

using namespace std;

int min_cyc(string s)
{
    int n = s.size();
    s += s;
    
    int t = 0;
    for (int i = 1; i < n; ++i)
        if (s.substr(i, n) < s.substr(t, n))
            t = i;

    return t;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

3) Optimization:

By quickly break when the two strings arent equal, this can optimize the function a little bit
Instead of making duplicated string, you can also define $$$s[x] = s[x\ mod\ |s|]$$$ since $$$s$$$ itself is a circular string
Instead of taking modulo, since $$$0 \leq x < 2 * n$$$, we just need reduce $$$x$$$ by $$$n$$$ whenever it passed $$$n$$$

Optimized Bruteforce Solution - O(n^2) time - O(1) auxiliary space

#include <iostream>

using namespace std;

int min_cyc(const string &s)
{
    int n = s.size();

    int t = 0;
    for (int i = 1; i < n; ++i)
    {
        int cmp = 0; /// EQUAL
        for (int p = n, l = t, r = i; p > 0; --p, ++l, ++r)
        {
            if (l == n) l = 0;
            if (r == n) r = 0;
            if (s[l] < s[r]) { cmp = -1; break; } ///  LESS 
            if (s[l] > s[r]) { cmp = +1; break; } /// GREATER
        }

        if (cmp == +1) t = i;
    }
    
    return t;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

C) Substring Based Elimination

1) Idea:

From inital set of candidates, keep comparing substrings and eliminate the bigger one until we get the final champion

2) Algorithm:

First, let $$$mn$$$ / $$$mx$$$ is the minimal / maximal character of $$$s$$$

Define: $$$mn = \underset{c \in S}{min}(c)$$$ and $$$mx = \underset{c \in S}{max}(c)$$$

Pre-elimination Round: We take the position $$$p$$$ that $$$s[p] = mn$$$. Since all other positions wont provide Minimal Lexicographical Circular String

Define: $$$candidate = $$$ { $$$p\ \ |\ \ p \in $$$ { $$$0, 1, \dots, |s| - 1$$$ } $$$ \cap s[p] = mn$$$ }

Then we take maximumly $$$n - 1$$$ Rounds from $$$d = 1$$$ to $$$d = n - 1$$$. For all current candidate, add the next character to it. Find the minimal substring, and eliminater unsatisfied others.

2) Optimization

Optimization: At the $$$p$$$ round we will find $$$c = $$$ minimal character among all candidate's $$$pth$$$-next character, then traverse again and eliminate the one whose $$$pth$$$-nxt character is greater than $$$c$$$
Define: $$$s[x] = s[x\ mod\ |s|]$$$ and $$$c = \underset{p \in\ candidate}{min}(s[p + d])$$$ and $$$next\ candidate = $$$ { $$$p\ \ |\ \ p \in candidate \cap s[p + d] = c$$$ }

3) Example:

$$$S = $$$"$$$abaabaaabaababaaabaaababaab$$$"

Pre-elimination - 26 candidates

Candidate: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    abaabaaabaababaaabaaababaab
Candidate Substring:
0 - a 
1 - b
2 - a
3 - a
4 - b
5 - a
6 - a
7 - a
8 - b
9 - a
10 - a
11 - b
12 - a
13 - b 
14 - a
15 - a
16 - a
17 - b
18 - a
19 - a
20 - a
21 - b
22 - a
24 - a
25 - a
26 - b
Winner: a*

Round I - 18 candidates

Candidate: 0 2 3 5 6 7 9 10 12 14 15 16 18 19 20 22 24 25 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    a_aa_aaa_aa_a_aaa_aaa_a_aa_
Candidate Substring:
0 - ab
2 - aa
3 - ab
5 - aa
6 - aa
7 - ab
9 - aa
10 - ab
12 - ab 
14 - aa
15 - aa
16 - ab
18 - aa
19 - aa
20 - ab
22 - ab
24 - aa
25 - ab
Winner: aa*

Round II - 9 candidates

Candidate: 2 5 6 9 14 15 18 19 24 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    __aa_aa__a____aa__aa____a__
Candidate Substring:
2 - aab
5 - aaa
6 - aab
9 - aab
14 - aaa
15 - aab
18 - aaa
19 - aab
24 - aab
Winner: aaa*

Round III - 3 candidates

Candidate: 5 14 18 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a___a________
Candidate Substring:
5 - aaab
14 - aaab
18 - aaab
Winner: aaab*

Round IV - 3 candidates

Candidate: 5 14 18 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a___a________
Candidate Substring:
5 - aaaba
14 - aaaba
18 - aaaba
Winner: aaaba*

Round V - 3 candidates

Candidate: 5 14 18 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a___a________
Candidate Substring:
5 - aaabaa
14 - aaabaa
18 - aaabab
Winner: aaabaa*

Round VI - 2 candidates

Candidate: 5 14
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a____________
Candidate Substring:
5 - aaabaab
14 - aaabaaa
Winner: aaabaaa*

Champion

Winner:   14 - VI
Number:   012345678901234567890123456
String:   abaabaaabaababaaabaaababaab
Remain:   ______________a____________
Champion: aaabaaa*

4) Notice

After all eliminations, if there are more than one candidate, select the one with lowest index

5) Complexity

If the string is $$$k$$$-repeated strings (each substring size $$$d$$$) or similar then there will be atleast $$$k$$$ candidate after $$$d$$$ eliminations. About $$$O(k \times d)$$$ in the best cases
Best case to say, like a string where all characters are unique, then it will be Linear $$$O(n)$$$
Worst case to say, like a string where all characters are the same, then it will be $$$O(n \times n) = O(n^2)$$$

6) Implementations

For convention, let not use obviously-not-optimized code

Detail Substring Based Elimination - O(n^2) time - O(n) auxiliary space

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

/// Finding minimal string right-rotation to make string minimal lexicographical
int min_cyc(string s)
{
    int n = s.size(); /// For convention: n = |s|
    s += s;

    /// For convention
    char mx = *max_element(s.begin(), s.end());
    char mn = *min_element(s.begin(), s.end());

    /// Candidate list
    vector<int> a;

    /// Pre-elimination
    for (int i = 0; i < n; ++i)
        if (s[i] == mn)
            a.push_back(i);

    /// Doing dth elimination
    for (int d = 1; d < n; ++d)
    {
        /// Minimal character among substrings
        char c = mx;
        for (int x : a) c = min(c, s[x + d]);

        /// New candidate list
        vector<int> b;

        /// Elimination
        for (int x : a)
            if (s[x + d] == c)
                b.push_back(x);

        swap(a, b);
        if (a.size() == 1) break; /// Found final candidate
    }

    return a.front(); /// The one with smallest index
}

int main()
{
    /// Input
    string s;
    cin >> s;

    /// Output
    cout << min_cyc(s);
    return 0;
}

Noncomment Substring Based Elimination - O(n^2) time - O(n) auxiliary space

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

int min_cyc(string s)
{
    int n = s.size();
    s += s;

    char mx = *max_element(s.begin(), s.end());
    char mn = *min_element(s.begin(), s.end());
    vector<int> a;
    for (int i = 0; i < n; ++i)
        if (s[i] == mn)
            a.push_back(i);

    for (int d = 1; d < n; ++d)
    {
        char c = mx;
        for (int x : a) c = min(c, s[x + d]);

        vector<int> b;
        for (int x : a)
            if (s[x + d] == c)
                b.push_back(x);

        swap(a, b);
        if (a.size() == 1) break;
    }

    return a.front();
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

III. Hashing Apprach

A) Loop over all rotations

1) Bruteforces Algorithm:

Iterating over all rotations and comparing for finding the smaller

If they are fully equal, then that position not exceed, then we take the one which smaller index
Else find the first position of character that they are difference, which character is smaller, the which of it is also smaller

Bruteforce Approach - O(n^2 log n) time - O(n) auxiliary space

#include <iostream>
#include <vector>

using namespace std;

int min_cyc(string s)
{
    int n = s.size();
    s += s;

    int t = 0;
    for (int i = 1; i < n; ++i)
    {
        int lcp = 0;
        for (int l = 0, r = n - 1; l <= r; )
        {
            int m = (l + r) >> 1;
            if (s.substr(t, m) == s.substr(i, m))
            {
                lcp = m;
                l = m + 1;
            }
            else 
            {
                r = m - 1;
            }
        }

        if (lcp == n) continue; /// Equal
        if (s[t + lcp] > s[i + lcp]) t = i;
    }

    return t;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

2) Hashing Improvement:

We reduce the complexity by using binary search to find the first different position of two strings:
Let $$$A$$$ is the circular substring of $$$s$$$ with the starting point $$$l$$$
Let $$$B$$$ is the circular substring of $$$s$$$ with the starting point $$$r$$$
Let $$$lcp$$$ (longest common prefix) is the last position that $$$A$$$ and $$$B$$$ equal
Let $$$t = s + s$$$, for convention of circular string
For every $$$p \leq lcp$$$, we have $$$t[l + p - 1] = t[r + p - 1]$$$
For every $$$p > lcp$$$, we have $$$t[l + p - 1] \neq t[r + p - 1]$$$
Hence we can use binary search to find $$$lcp$$$
Fully equal case is that $$$lcp = n$$$
If they are difference, compare $$$t[l + lcp]$$$ with $$$t[r + lcp]$$$

3) Implementations:

Single Hashing Approach - O(n log(MOD)) precalculation - O(n log n) time - O(n) auxiliary space

#include <iostream>
#include <vector>

using namespace std;

int MOD = 1e9 + 7;
int BASE = 123;
int powMOD(int x, int n)
{
    if (n == 0) return 1;
    if (n & 1) return (1LL * powMOD(x, n - 1) * x) % MOD;
    return (1LL * powMOD(x, n / 2) * powMOD(x, n / 2)) % MOD;
}

vector<int> P;
vector<int> H;
void init(const string &s)
{
    int n = s.size();
    int b = 1;
    
    P.assign(n, 0);
    H.assign(n + 1, 0);
    for (int i = 0; i < n; ++i)
    {
        H[i + 1] = (H[i] + 1LL * b * (s[i] - 'a' + 1)) % MOD;
        P[i] = powMOD(b, MOD - 2);
        b = (1LL * b * BASE) % MOD;
    }
}

int query(int l, int r)
{
    return (1LL * (H[r] - H[l] + MOD) * P[l]) % MOD;
}

int min_cyc(string s)
{
    int n = s.size();
    s += s;

    init(s);
    int t = 0;
    for (int i = 1; i < n; ++i)
    {
        int lcp = 0;
        for (int l = 0, r = n - 1; l <= r; )
        {
            int m = (l + r) >> 1;
            if (query(t, t + m) == query(i, i + m))
            {
                lcp = m;
                l = m + 1;
            }
            else 
            {
                r = m - 1;
            }
        }

        if (lcp == n - 1) continue; /// Equal
        if (s[t + lcp] > s[i + lcp]) t = i;
    }

    return t;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

Multiple Hashing Approach - O(n Sigma(log(MOD))) precalculation - O(n log n * k) time - O(n * k) space

#include <iostream>
#include <vector>

using namespace std;

template <const int MOD, const int BASE>
struct Hash 
{
    int powMOD(int x, int n)
    {
        if (n == 0) return 1;
        if (n & 1) return (1LL * powMOD(x, n - 1) * x) % MOD;
        return (1LL * powMOD(x, n / 2) * powMOD(x, n / 2)) % MOD;
    }

    vector<int> P;
    vector<int> H;
    void init(const string &s)
    {
        int n = s.size();
        int b = 1;
     
        P.assign(n, 0);
        H.assign(n + 1, 0);
        for (int i = 0; i < n; ++i)
        {
            H[i + 1] = (H[i] + 1LL * b * (s[i] - 'a' + 1)) % MOD;
            P[i] = powMOD(b, MOD - 2);
            b = (1LL * b * BASE) % MOD;
        }
    }

    int query(int l, int r)
    {
        return (1LL * (H[r] - H[l] + MOD) * P[l]) % MOD;
    }
};

struct Pack
{
    int a, b, c, d, e, f;
    Pack (int a = 0, int b = 0, int c = 0, int d = 0, int e = 0, int f = 0)
    : a(a), b(b), c(c), d(d), e(e), f(f) {}

    bool operator == (const Pack &o)
    {
        return (a == o.a)
            && (b == o.b)
            && (c == o.c)
            && (d == o.d)
            && (e == o.e)
            && (f == o.f);
    }
};

struct Multihash
{
    Hash <1000000007, 123> A;
    Hash <1000000009, 1234> B;
    Hash <1000000021, 12345> C;
    Hash <1000000033, 123456> D;
    Hash <1000000087, 1234567> E;
    Hash <1000000093, 12345678> F;
    void init(const string &s)
    {
        A.init(s);
        B.init(s);
        C.init(s);
        D.init(s);
        E.init(s);
        F.init(s);
    }

    Pack query(int l, int r)
    {
        return Pack(
            A.query(l, r),
            B.query(l, r),
            C.query(l, r),
            D.query(l, r),
            E.query(l, r),
            F.query(l, r)
        );
    }
};

Multihash H;
int min_cyc(string s)
{
    int n = s.size();
    s += s;

    H.init(s);
    int t = 0;
    for (int i = 1; i < n; ++i)
    {
        int lcp = 0;
        for (int l = 0, r = n - 1; l <= r; )
        {
            int m = (l + r) >> 1;
            if (H.query(t, t + m) == H.query(i, i + m))
            {
                lcp = m;
                l = m + 1;
            }
            else 
            {
                r = m - 1;
            }
        }

        if (lcp == n - 1) continue; /// Equal
        if (s[t + lcp] > s[i + lcp]) t = i;
    }

    return t;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

4) Optimization:

Hash are so heavy of hidden constant, obviously most by modulo operators, but you can have some tricks to solve it
Significant improvement: Declare $$$MOD,\ BASE,\ LIM$$$ as const or constexpr
In Single Hash, you can use overflow modulo for significant faster but it is also dangerous in same cases (especially hacking)
Replace vector with pre-declared array
Replace recursive power function by iterative one
Improve time to calculate inverse modulo. You can do it linear but cost more space and modulo operators, so it is better to do like below.

Optimized Bruteforce Approach - O(n^2 log n) time - O(n) auxiliary space

#include <iostream>

using namespace std;

int min_cyc(const string &s)
{
    int t = 0;
    int n = s.size();
    for (int i = 1; i < n; ++i)
    {
        int lcp = 0;
        for (int l = t, r = i; lcp < n; ++lcp, ++l, ++r)
        {
            if (l == n) l = 0;
            if (r == n) r = 0;
            if (s[l] != s[r]) break;
        }

        if (lcp == n) continue; /// Equal
        if (s[t + lcp] > s[i + lcp]) t = i;
    }

    return t;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

Optimized Single Hashing Approach - O(n + log(MOD)) precalculation - O(n log n) time - O(n) space

#include <iostream>
#include <cstring>

using namespace std;

const int LIM = 2e5 + 25;
const int MOD = 1e9 + 7;
const int BASE = 123;
int powMOD(int x, int n)
{
    int res = 1;
    for (; n > 0; n >>= 1)
    {
        if (n & 1) res = (1LL * res * x) % MOD;
        x = (1LL * x * x) % MOD;
    }

    return res;
}

int H[LIM];
int P[LIM];
void init(const string &s)
{
    int n = s.size();
    int b = 1;
    
    H[0] = 0;
    for (int i = 0; i < n; ++i)
    {
        H[i + 1] = (H[i] + 1LL * b * (s[i] - 'a' + 1)) % MOD;
        b = (1LL * b * BASE) % MOD;
    }

    P[n] = powMOD(b, MOD - 2);
    for (int i = n - 1; i >= 0; --i)
        P[i] = (1LL * P[i + 1] * BASE) % MOD;
}

int query(int l, int r)
{
    return (1LL * (H[r] - H[l] + MOD) * P[l]) % MOD;
}

int min_cyc(string s)
{
    int n = s.size();
    s += s;

    init(s);
    int t = 0;
    for (int i = 1; i < n; ++i)
    {
        int lcp = 0;
        for (int l = 0, r = n - 1; l <= r; )
        {
            int m = (l + r) >> 1;
            if (query(t, t + m) == query(i, i + m))
            {
                lcp = m;
                l = m + 1;
            }
            else 
            {
                r = m - 1;
            }
        }

        if (lcp == n - 1) continue; /// Equal
        if (s[t + lcp] > s[i + lcp]) t = i;
    }

    return t;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

Optimized Multiple Hashing Approach - O(n Sigma(log((MOD))) precalculation - O(n log n * k) time - O(n * k) space

#include <iostream>

using namespace std;

template <const int LIM, const int MOD, const int BASE>
struct Hash 
{
    int powMOD(int x, int n)
    {
        int res = 1;
        for (; n > 0; n >>= 1)
        {
            if (n & 1) res = (1LL * res * x) % MOD;
            x = (1LL * x * x) % MOD;
        }

        return res;
    }

    int H[LIM];
    int P[LIM];
    void init(const string &s)
    {
        int n = s.size();
        int b = 1;
     
        H[0] = 0;
        for (int i = 0; i < n; ++i)
        {
            H[i + 1] = (H[i] + 1LL * b * (s[i] - 'a' + 1)) % MOD;
            b = (1LL * b * BASE) % MOD;
        }

        P[n] = powMOD(b, MOD - 2);
        for (int i = n - 1; i >= 0; --i)
            P[i] = (1LL * P[i + 1] * BASE) % MOD;
    }

    int query(int l, int r)
    {
        return (1LL * (H[r] - H[l] + MOD) * P[l]) % MOD;
    }
};

struct Pack
{
    int a, b, c, d, e, f;
    Pack (int a = 0, int b = 0, int c = 0, int d = 0, int e = 0, int f = 0)
    : a(a), b(b), c(c), d(d), e(e), f(f) {}

    bool operator == (const Pack &o)
    {
        return (a == o.a)
            && (b == o.b)
            && (c == o.c)
            && (d == o.d)
            && (e == o.e)
            && (f == o.f);
    }
};

struct Multihash
{
    Hash <200200, 1000000007, 123> A;
    Hash <200200, 1000000009, 1234> B;
    Hash <200200, 1000000021, 12345> C;
    Hash <200200, 1000000033, 123456> D;
    Hash <200200, 1000000087, 1234567> E;
    Hash <200200, 1000000093, 12345678> F;
    void init(const string &s)
    {
        A.init(s);
        B.init(s);
        C.init(s);
        D.init(s);
        E.init(s);
        F.init(s);
    }

    Pack query(int l, int r)
    {
        return Pack(
            A.query(l, r),
            B.query(l, r),
            C.query(l, r),
            D.query(l, r),
            E.query(l, r),
            F.query(l, r)
        );
    }
};

Multihash H;
int min_cyc(string s)
{
    int n = s.size();
    s += s;

    H.init(s);
    int t = 0;
    for (int i = 1; i < n; ++i)
    {
        int lcp = 0;
        for (int l = 0, r = n - 1; l <= r; )
        {
            int m = (l + r) >> 1;
            if (H.query(t, t + m) == H.query(i, i + m))
            {
                lcp = m;
                l = m + 1;
            }
            else 
            {
                r = m - 1;
            }
        }

        if (lcp == n - 1) continue; /// Equal
        if (s[t + lcp] > s[i + lcp]) t = i;
    }

    return t;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

2) Logarithm Decomposition

1) Algorithm:

Let a candidate to say a starting position that might leed to lexicographically minimum string rotation. Hence we have the initial candidates are $$$0, 1, \dots, |S| - 1$$$
Let divide the candidates list into parts of size $$$\approx K$$$ (the final part might have much less). There will be $$$\lceil \frac{N}{K} \rceil$$$ parts.
Small iteration: For each parts, we find one smallest circular substring of size $$$K$$$, among all equal circular substrings, pick the one with smallest starting position. Each part will produce only one candidate
Big iteration: For $$$\lceil \frac{N}{K} \rceil$$$ next candidates we will find one smallest circular substring of size $$$N$$$, among all equal circular substrings, pick the one with smallest starting position. This will give you the answer

2) Proofs:

Let $$$A, B$$$ are the circular substrings start from candidates positions, that $$$|A| = |B|$$$. And let $$$X, Y$$$ are the prefixes of $$$A, B$$$, that $$$|X| = |Y|$$$
Since we are comparing in lexicographical order, if $$$X < Y$$$ then it will lead to $$$A < B$$$. Hence by using small iteration we sieved all unoptimal candidates, and reduce $$$N$$$ candidates to $$$\lceil \frac{N}{K} \rceil$$$ candidates only.

3) Complexity:

For small iteration, there are $$$N$$$ candidates, and each candidate are compared with $$$K$$$ lengthed circular substrings using hash. Hence $$$O(N \times log(K))$$$
For big iteration, there are $$$K$$$ candidates, and each candidate are compared with $$$N$$$ lengthed circular substrings using hash. Hence $$$O(\lceil \frac{N}{K} \rceil \times log(N))$$$
The total time complexity is $$$\approx O(N log(K) + \frac{N}{K} log(N))$$$
For optimal purpose, let $$$K \approx \log{N}$$$ the complexity therefore will be only $$$O(N log(log N))$$$

4) Notice:

We are comparing string in circular, you can either use modulo position or duplicated string
Becareful that the real size of string and the duplicated one

5) Implementations:

For convention, let just ignore those obvious not a good code (ugly — complex — stupid code worth nothing to say)

Single Hashing Approach - O(n * log((MOD)) precalculation - O(n log(log n)) time - O(n) space

#include <iostream>
#include <cmath>

using namespace std;

const int LIM = 1e6 + 16;
const int MOD = 1e9 + 7;
const int BASE = 123;
const int LESS    = -1;
const int EQUAL   =  0;
const int GREATER = +1;

/// Calculate (x ^ n) % MOD
int powMOD(int x, int n)
{
    int res = 1;
    for (; n > 0; n >>= 1)
    {
        if (n & 1) res = (1LL * res * x) % MOD;
        x = (1LL * x * x) % MOD;
    }
    
    return res;
}

int H[LIM];
int P[LIM];
/// Calculate hash of (S)
void init(const string &s)
{
    int n = s.size();

    H[0] = 0;
    int b = 1;
    for (int i = 0; i < n; ++i)
    {
        H[i + 1] = (H[i] + 1LL * b * (s[i] - 'a' + 1)) % MOD;
        b = (1LL * b * BASE) % MOD;
    }

    P[n] = powMOD(b, MOD - 2);
    for (int i = n; i >= 1; --i)
        P[i - 1] = (1LL * P[i] * BASE) % MOD;
}

/// Hash value of s[l..r]
int query(int l, int r)
{
    return (1LL * (H[r] - H[l] + MOD) * P[l]) % MOD;
}

/// Compare s[l..l+d-1] vs s[r..r+d-1]
int cmp(const string &s, int i, int j, int d)
{
    int lcp = 0;
    for (int l = 0, r = d - 1; l <= r; )
    {
        int m = (l + r) >> 1;
        if (query(i, i + m) == query(j, j + m))
        {
            lcp = m;
            l = m + 1;
        }
        else 
        {
            r = m - 1;
        }
    }

    if (lcp == d - 1 || s[i + lcp] == s[j + lcp]) return EQUAL;
    return (s[i + lcp] < s[j + lcp]) ? LESS : GREATER;
}

/// Finding minimal string right-rotation to make string minimal lexicographical
int min_cyc(string s)
{
    int n = s.size();
    int k = log(n) + 1;
    s += s;
    init(s);

    int big = 0;
    int upper = n / k + 1;
    for (int part = 0; part < upper; ++part)
    {
        int small = k * part;
        int upper = min(n, k * (part + 1));
        for (int i = k * part; i < upper; ++i)
            if (cmp(s, small, i, k) == GREATER)
                small = i;

        if (cmp(s, big, small, n) == GREATER) 
            big = small;
    }

    return big;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

Multiple Hashing Approach - O(n * Sigma(log((MOD))) precalculation - O(n log(log n) * k) time - O(n * k) space

#include <iostream>
#include <cmath>

using namespace std;

const int LIM = 1e6 + 16;
const int LESS    = -1;
const int EQUAL   =  0;
const int GREATER = +1;

template<const int MOD, const int BASE>
struct Hash
{
    /// Calculate (x ^ n) % MOD
    int powMOD(int x, int n)
    {
        int res = 1;
        for (; n > 0; n >>= 1)
        {
            if (n & 1) res = (1LL * res * x) % MOD;
            x = (1LL * x * x) % MOD;
        }
        
        return res;
    }

    int H[LIM];
    int P[LIM];
    /// Calculate hash of (S)
    void init(const string &s)
    {
        int n = s.size();

        H[0] = 0;
        int b = 1;
        for (int i = 0; i < n; ++i)
        {
            H[i + 1] = (H[i] + 1LL * b * (s[i] - 'a' + 1)) % MOD;
            b = (1LL * b * BASE) % MOD;
        }

        P[n] = powMOD(b, MOD - 2);
        for (int i = n; i >= 1; --i)
            P[i - 1] = (1LL * P[i] * BASE) % MOD;
    }

    /// Hash value of s[l..r]
    int query(int l, int r)
    {
        return (1LL * (H[r] - H[l] + MOD) * P[l]) % MOD;
    }
};

struct Pack 
{
    int a, b, c, d, e, f;
    Pack (int a = 0, int b = 0, int c = 0, int d = 0, int e = 0, int f = 0)
    : a(a), b(b), c(c), d(d), e(e), f(f) {}

    bool operator == (const Pack &o)
    {
        return (a == o.a)
            && (b == o.b)
            && (c == o.c)
            && (d == o.d)
            && (e == o.e)
            && (f == o.f);
    }
};

struct Multihash 
{
    Hash<1000000007, 123> A;
    Hash<1000000009, 1234> B;
    Hash<1000000021, 12345> C;
    Hash<1000000033, 123456> D;
    Hash<1000000087, 1234567> E;
    Hash<1000000093, 12345678> F;

    void init(const string &s)
    {
        A.init(s);
        B.init(s);
        C.init(s);
        D.init(s);
        E.init(s);
        F.init(s);
    }

    Pack query(int l, int r)
    {
        return Pack(
            A.query(l, r), 
            B.query(l, r),
            C.query(l, r),
            D.query(l, r),
            E.query(l, r),
            F.query(l, r)
        );
    }
};

Multihash H;

/// Compare s[l..l+d-1] vs s[r..r+d-1]
int cmp(const string &s, int i, int j, int d)
{
    int lcp = 0;
    for (int l = 0, r = d - 1; l <= r; )
    {
        int m = (l + r) >> 1;
        if (H.query(i, i + m) == H.query(j, j + m))
        {
            lcp = m;
            l = m + 1;
        }
        else 
        {
            r = m - 1;
        }
    }

    if (lcp == d - 1 || s[i + lcp] == s[j + lcp]) return EQUAL;
    return (s[i + lcp] < s[j + lcp]) ? LESS : GREATER;
}

/// Finding minimal string right-rotation to make string minimal lexicographical
int min_cyc(string s)
{
    int n = s.size();
    int k = log(n) + 1;
    s += s;
    H.init(s);

    int big = 0;
    int upper = n / k + 1;
    for (int part = 0; part < upper; ++part)
    {
        int small = k * part;
        int upper = min(n, k * (part + 1));
        for (int i = k * part; i < upper; ++i)
            if (cmp(s, small, i, k) == GREATER)
                small = i;

        if (cmp(s, big, small, n) == GREATER) 
            big = small;
    }

    return big;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

IV. Sqrt Decomposition

1) Divide candidate list into many parts

1) Algorithm:

Let a candidate to say a starting position that might leed to lexicographically minimum string rotation. Hence we have the initial candidates are $$$0, 1, \dots, |S| - 1$$$
Let divide the candidates list into parts of size $$$\approx K$$$ (the final part might have much less). There will be $$$\lceil \frac{N}{K} \rceil$$$ parts.
Small iteration: For each parts, we find one smallest circular substring of size $$$K$$$, among all equal circular substrings, pick the one with smallest starting position. Each part will produce only one candidate
Big iteration: For $$$\lceil \frac{N}{K} \rceil$$$ next candidates we will find one smallest circular substring of size $$$N$$$, among all equal circular substrings, pick the one with smallest starting position. This will give you the answer

2) Proofs:

Let $$$A, B$$$ are the circular substrings start from candidates positions, that $$$|A| = |B|$$$. And let $$$X, Y$$$ are the prefixes of $$$A, B$$$, that $$$|X| = |Y|$$$
Since we are comparing in lexicographical order, if $$$X < Y$$$ then it will lead to $$$A < B$$$. Hence by using small iteration we sieved all unoptimal candidates, and reduce $$$N$$$ candidates to $$$\lceil \frac{N}{K} \rceil$$$ candidates only.

3) Complexity:

For small iteration, there are $$$N$$$ candidates, and each candidate are compared with $$$K$$$ lengthed circular substrings. Hence $$$O(N \times K)$$$
For big iteration, there are $$$K$$$ candidates, and each candidate are compared with $$$N$$$ lengthed circular substrings. Hence $$$O(\lceil \frac{N}{K} \rceil \times N)$$$
The total time complexity is $$$O(N \times (K + \lceil \frac{N}{K} \rceil))$$$
For optimal purpose, let $$$K \approx \sqrt{N}$$$ the complexity therefore will be only $$$O(N \sqrt{N})$$$

4) Notice:

We are comparing string in circular, you can either use modulo position or duplicated string
Becareful that the real size of string and the duplicated one

5) Implementations:

For convention, let just ignore those obvious not a good code (ugly — complex — stupid code worth nothing to say)

Detail Sqrt Decomposition Solution - O(N√N) time - O(N) auxiliary space

#include <iostream>
#include <cmath>

using namespace std;

int min_cyc(string s)
{
    int n = s.size(); /// real size of string
    int k = sqrt(n); /// optimal choice
    s += s; /// for convention

    int big = 0; /// One random candidate
    /// Big iteration: Compare string of size K
    for (int part = 0; part < n / k + 1; ++part) 
    {
        int small = part * k; /// One random candidate
        /// Small iteration: Compare string of size K
        for (int i = part * k; i < min(n, (part + 1) * k); ++i) 
        {
            /// Local comparision: Circular Prefixstring starts from here
            if (s.substr(small, k) > s.substr(i, k)) 
            {
                /// Assign new minimal point
                small = i; 
            }
        }

        /// Global comparision: Circular String starts from here
        if (s.substr(big, n) > s.substr(small, n)) 
        {
            /// Assign new minimal point
            big = small;
        }
    }

    /// The final winner
    return big;
}

int main()
{
    /// Input
    string s;
    cin >> s;

    /// Output
    cout << min_cyc(s);
    return 0;
}

Noncomment Sqrt Decomposition Solution - O(N√N) time - O(N) auxiliary space

#include <iostream>
#include <cmath>

using namespace std;

int min_cyc(string s)
{
    int n = s.size(); 
    int k = sqrt(n);
    s += s; 

    int big = 0; 
    for (int part = 0; part < n / k + 1; ++part) 
    {
        int small = part * k; 
        for (int i = part * k; i < min(n, (part + 1) * k); ++i) 
            if (s.substr(small, k) > s.substr(i, k)) 
                small = i; 

        if (s.substr(big, n) > s.substr(small, n)) 
            big = small;
    }

    return big;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

Optimized Sqrt Decomposition Solution - O(N√N) time - O(1) auxiliary space

#include <iostream>
#include <cmath>

using namespace std;

int min_cyc(const string &s)
{
    int n = s.size(); 
    int k = sqrt(n);

    int big = 0; 
    int upper = n / k + 1;
    for (int part = 0; part < upper; ++part) 
    {
        int small = part * k; 
        int upper = min(n, (part + 1) * k);
        for (int i = part * k; i < upper; ++i) 
        {
            for (int p = 0, l = small, r = i; p < k; ++p, ++l, ++r)
            {
                if (l == n) l = 0;
                if (r == n) r = 0;
                if (s[l]  > s[r]) small = i;
                if (s[l] != s[r]) break;
            }
        }

        for (int p = 0, l = big, r = small; p < n; ++p, ++l, ++r)
        {
            if (l == n) l = 0;
            if (r == n) r = 0;
            if (s[l]  > s[r]) big = small;
            if (s[l] != s[r]) break;
        }
    }

    return big;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

2) Hash Improvement

1) Idea:

By using hash for comparing two known substrings $$$X$$$, $$$Y$$$ of equal length $$$D$$$. We can compare them from $$$O(S) \rightarrow O(log(S))$$$ by finding their LCP — Longest Common Prefix and comparing $$$X[lcp]$$$ with $$$Y[lcp]$$$

2) Complexity:

The total time complexity is $$$O(N \times log(K) + \lceil \frac{N}{K} \rceil \times log(N)) \approx O(N \times \sqrt{N})$$$ but significant smaller constant (About 2 or/to 3 times faster). But the trade off is more complex code and more space is taken.

3) Optimization:

Like the above Hashing approach
Use constant modulo
Replace vector with array
Replace recursive with iterative
Quick calculating inverse modulo

4) Implementations:

Single Hash Solution - O(N + log(MOD)) preprocessing - O(N√N) time - O(N) auxiliary space

#include <iostream>
#include <cmath>

using namespace std;

const int LIM = 2e5 + 25;
const int MOD = 1e9 + 7;
const int BASE = 123;
const int LESS    = -1;
const int EQUAL   =  0;
const int GREATER = +1;

/// Calculate (x ^ n) % MOD
int powMOD(int x, int n)
{
    int res = 1;
    for (; n > 0; n >>= 1)
    {
        if (n & 1) res = (1LL * res * x) % MOD;
        x = (1LL * x * x) % MOD;
    }
    
    return res;
}

int H[LIM];
int P[LIM];
/// Calculate hash of (S)
void init(const string &s)
{
    int n = s.size();

    H[0] = 0;
    int b = 1;
    for (int i = 0; i < n; ++i)
    {
        H[i + 1] = (H[i] + 1LL * b * (s[i] - 'a' + 1)) % MOD;
        b = (1LL * b * BASE) % MOD;
    }

    P[n] = powMOD(b, MOD - 2);
    for (int i = n; i >= 1; --i)
        P[i - 1] = (1LL * P[i] * BASE) % MOD;
}

/// Hash value of s[l..r]
int query(int l, int r)
{
    return (1LL * (H[r] - H[l] + MOD) * P[l]) % MOD;
}

/// Compare s[l..l+d-1] vs s[r..r+d-1]
int cmp(const string &s, int i, int j, int d)
{
    int lcp = 0;
    for (int l = 0, r = d - 1; l <= r; )
    {
        int m = (l + r) >> 1;
        if (query(i, i + m) == query(j, j + m))
        {
            lcp = m;
            l = m + 1;
        }
        else 
        {
            r = m - 1;
        }
    }

    if (lcp == d - 1 || s[i + lcp] == s[j + lcp]) return EQUAL;
    return (s[i + lcp] < s[j + lcp]) ? LESS : GREATER;
}

/// Finding minimal string right-rotation to make string minimal lexicographical
int min_cyc(string s)
{
    int n = s.size();
    int k = sqrt(n);
    s += s;
    init(s);

    int big = 0;
    int upper = n / k + 1;
    for (int part = 0; part < upper; ++part)
    {
        int small = k * part;
        int upper = min(n, k * (part + 1));
        for (int i = k * part; i < upper; ++i)
            if (cmp(s, small, i, k) == GREATER)
                small = i;

        if (cmp(s, big, small, n) == GREATER) 
            big = small;
    }

    return big;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

Single Hash Solution - O(N + Sigma(log(MOD))) preprocessing - O(N√N * K) time - O(N * K) auxiliary space

#include <iostream>
#include <cmath>

using namespace std;

const int LIM = 2e5 + 25;
const int LESS    = -1;
const int EQUAL   =  0;
const int GREATER = +1;

template<const int MOD, const int BASE>
struct Hash
{
    /// Calculate (x ^ n) % MOD
    int powMOD(int x, int n)
    {
        int res = 1;
        for (; n > 0; n >>= 1)
        {
            if (n & 1) res = (1LL * res * x) % MOD;
            x = (1LL * x * x) % MOD;
        }
        
        return res;
    }

    int H[LIM];
    int P[LIM];
    /// Calculate hash of (S)
    void init(const string &s)
    {
        int n = s.size();

        H[0] = 0;
        int b = 1;
        for (int i = 0; i < n; ++i)
        {
            H[i + 1] = (H[i] + 1LL * b * (s[i] - 'a' + 1)) % MOD;
            b = (1LL * b * BASE) % MOD;
        }

        P[n] = powMOD(b, MOD - 2);
        for (int i = n; i >= 1; --i)
            P[i - 1] = (1LL * P[i] * BASE) % MOD;
    }

    /// Hash value of s[l..r]
    int query(int l, int r)
    {
        return (1LL * (H[r] - H[l] + MOD) * P[l]) % MOD;
    }
};

struct Pack 
{
    int a, b, c, d, e, f;
    Pack (int a = 0, int b = 0, int c = 0, int d = 0, int e = 0, int f = 0)
    : a(a), b(b), c(c), d(d), e(e), f(f) {}

    bool operator == (const Pack &o)
    {
        return (a == o.a)
            && (b == o.b)
            && (c == o.c)
            && (d == o.d)
            && (e == o.e)
            && (f == o.f);
    }
};

struct Multihash 
{
    Hash<1000000007, 123> A;
    Hash<1000000009, 1234> B;
    Hash<1000000021, 12345> C;
    Hash<1000000033, 123456> D;
    Hash<1000000087, 1234567> E;
    Hash<1000000093, 12345678> F;

    void init(const string &s)
    {
        A.init(s);
        B.init(s);
        C.init(s);
        D.init(s);
        E.init(s);
        F.init(s);
    }

    Pack query(int l, int r)
    {
        return Pack(
            A.query(l, r), 
            B.query(l, r),
            C.query(l, r),
            D.query(l, r),
            E.query(l, r),
            F.query(l, r)
        );
    }
};

Multihash H;

/// Compare s[l..l+d-1] vs s[r..r+d-1]
int cmp(const string &s, int i, int j, int d)
{
    int lcp = 0;
    for (int l = 0, r = d - 1; l <= r; )
    {
        int m = (l + r) >> 1;
        if (H.query(i, i + m) == H.query(j, j + m))
        {
            lcp = m;
            l = m + 1;
        }
        else 
        {
            r = m - 1;
        }
    }

    if (lcp == d - 1 || s[i + lcp] == s[j + lcp]) return EQUAL;
    return (s[i + lcp] < s[j + lcp]) ? LESS : GREATER;
}

/// Finding minimal string right-rotation to make string minimal lexicographical
int min_cyc(string s)
{
    int n = s.size();
    int k = sqrt(n);
    s += s;
    H.init(s);

    int big = 0;
    int upper = n / k + 1;
    for (int part = 0; part < upper; ++part)
    {
        int small = k * part;
        int upper = min(n, k * (part + 1));
        for (int i = k * part; i < upper; ++i)
            if (cmp(s, small, i, k) == GREATER)
                small = i;

        if (cmp(s, big, small, n) == GREATER) 
            big = small;
    }

    return big;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

V. KMP Approach

A) Normal KMP

1) Algorithm:

In this problem, there is no effeciency using normal KMP over all pair substring comparing, even if we divide into queries and optimize it. It still have no improvement compare to all other implementations
But if you want to compare two strings by KMP, then here it is
We find their first different position using KMP of two strings or KMP on concatenation string ($$$s$$$ + $$$c$$$ + $$$t$$$) — for such $$$c$$$ is smaller than any character in $$$s$$$ and $$$t$$$.

2) Implementations:

My Simple KMP Implementations - O(n) time - O(1) auxiliary space

void KMP(const string &s, vector<int> &K)
{
    K.assign(s.size(), 0);
    for (int l = 1, r = 1; r < s.size(); ++r)
    {
        for (l = K[r - 1]; l && s[l] != s[r]; l = K[l - 1]);
        K[r] = l + (s[l] == s[r]);
    }
}

Bruteforces using KMP - O(n^2) time - O(n) auxiliary space

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

vector<int> KMP(const string &s)
{
    vector<int> K(s.size());
    for (int l = 1, r = 1; r < s.size(); ++r)
    {
        for (l = K[r - 1]; l && s[l] != s[r]; l = K[l - 1]);
        K[r] = l + (s[l] == s[r]);
    }
    return K;
}

int min_cyc(string s)
{
    int n = s.size();
    s += s;

    int p = 0;
    for (int i = 1; i < n; ++i)
    {
        vector<int> K = KMP(s.substr(p, n) + "#" + s.substr(i, n));
        int k = 0; /// First difference character
        while (k < n && K[k + n] + 1 == K[k + n + 1]) ++k;
        if (k < n && s[p + k] > s[i + k]) p = i;
    }

    return p;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

B) Booth Algorithm

1) Idea:

Remember when you compare circular substrings using hashing ? We find first difference character $$$k$$$ by binary search and compare only at that position to know whether one is greater, equal or less than other. But this give you huge hidden constant that might lead you TLE (the Optimized version doesnt provides that much but its constant is not that small too). While we can take the advantage of $$$KMP$$$ to find its first difference too, in a much smaller constant ofcourse. But we have to do some clever tricks to ignore its high complexity.

2) Code Explanation:

For convention let duplicate the initial string $$$S + S$$$. We can easily find the prefix function for the string. We have to find that at position $$$i > n$$$, which position $$$p$$$ that the initial one and the current rotation differ, then compare the two rotations. However it can only provides comparision between inital rotation and the current rotation, if you want for all rotations you have to merge all $$$N$$$ circular substrings. But, since we only care about the minimal lexicographical, we can just immediately eliminate the worse rotation when you find a better rotation. Then we treat initial rotation by this new better a rotation, but by how can we do that ? The important is that we cut the relationship of the current prefix function with the previous one, by assign the previous as $$$-1$$$, hence we are comparing the prefix function for this current rotation. This allow the algorithm complexity drop from normal KMP $$$O(n^2)$$$ to Booth-KMP $$$O(n)$$$ and way much smaller constant than Hashing $$$O(n log n)$$$.

3)* Implementations:

Detail Booth Algorithm - O(n) time - O(n) auxiliary space

#include <iostream>
#include <vector>

using namespace std;

/// Finding minimal string right-rotation to make string minimal lexicographical
int min_cyc(string s)
{
    /// Prefix Function, not for the inital but the current one
    vector<int> f(s.size(), -1); /// -1 means KMP function stop right there
    s += s; /// Duplicating for convention

    int p = 0; /// The current rotation we are compare with
    for (int l = 1, r = 1; r < s.size(); ++r) 
    {
        /// KMP function
        ///
        /// l = f[r - p - 1]      <=       l = K[K[r] - 1]
        /// Latest KMP value              Latest KMP value
        ///
        ///           l != 0 && s[l] != s[r]
        ///    non-ended KMP and non-equal character 
        ///            V    V    V    V    V
        ///       l != -1 && s[p + l + 1] != s[r]          
        /// non-ended KMP and non-equal character at pth rotation            
        ///
        ///  l = f[l]              <=        l = K[l - 1]
        /// Previous KMP                     Previous KMP 
        /// For convention we just made (l) difference a little bit
        ///

        for (l = f[r - p - 1]; l != -1 && s[p + l + 1] != s[r]; l = f[l])
        {
            if (s[l + p + 1] > s[r]) /// pth rotation is bigger than current rotation
            {  
                p = r - l - 1;
            }
        }

        /// KMP not ended here and this is the first different character between pth rotation and current rotation
        if (l == -1 && s[l + p + 1] != s[r])
        {
            if (s[p + l + 1] > s[r]) /// pth rotation is bigger than current rotation
            {
                p = r;
            }
                           /// f[r - p] <=> K[K[r]]
            f[r - p] = -1; /// KMP must ended here since we compared the rotation
        }
        else
        {
            f[r - p] = l + 1; /// Normal KMP where "(s[l] = s[r]) = (1)"
        }
    }
    
    /// Return the rotation
    return p;
}
    
int main()
{
    /// Input
    string s;
    cin >> s;

    /// Output
    cout << min_cyc(s);
    return 0;
}

Non-comment Booth Algorithm - O(n) time - O(n) auxiliary space

#include <iostream>
#include <vector>

using namespace std;

int min_cyc(string s)
{
    int p = 0;
    s += s;

    vector<int> f(s.size(), -1);
    for (int l = 1, r = 1; r < s.size(); ++r) 
    {
        for (l = f[r - p - 1]; l != -1 && s[p + l + 1] != s[r]; l = f[l])
            if (s[l + p + 1] > s[r])
                p = r - l - 1;

        if (l == -1 && s[p + l + 1] != s[r])
        {
            if (s[p + l + 1] > s[r])
                p = r;

            f[r - p] = -1;
        }
        else
        {
            f[r - p] = l + 1;
        }
    }
    
    return p;
}
    
int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

VI. Lyndon Factorization Approach

A) Approach

1) Definitions:

For detail definitions, proves, applications, implementations, you can read Simple Linear and Effectively Duval Algorithm for Lyndon Factorization
Lyndon word is nonempty string that is strictly smaller in lexicographic order than all of its rotations.
Lyndon factorization is a unique way to split the string into many lyndon words in such a way that the words in the sequence are in non-increasing order. That means we factorize $$$s = s_1 + s_2 + \dots + s_k$$$ where $$$s_1, s_2, \dots, s_k$$$ are lyndon words and in non-increasing order ($$$s1 \geq s2 \geq \dots \geq s_k$$$)
From the above definition, we con conclude that the last factor in lyndon factorization of the string itself is minimal lexicographical.

2) Duval Algorithm:

In $$$1983$$$, Duval provides an effecient algorithm for listing the Lyndon words of length at most $$$n$$$ with a given alphabet size $$$s$$$ in lexicographic order in $$$O(n)$$$

For string $$$S$$$ of length $$$n$$$
For each new word $$$x$$$ from $$$S$$$ that $$$\forall i$$$ we have $$$x[i] = s[i\ mod\ n]$$$ (mean $$$x$$$ is a sub-string of some cyclic strings that shifted from initial $$$S$$$)
While the last symbol of $$$x$$$ is in sorted order, remove it to make a shorter word
Replace the last remained symbol of $$$x$$$ by its successor in sorted order.

3) Examples:

$$$S = $$$"$$$abaabaaabaababaaabaaababaab$$$"

Lyndon Factorization

4) Notice:

Wait wait wait, are you going to take the head position of last factor of string factorization and it will be the answer ?
Nope, becaure the string are circular, you must duplicate the string before doing so, else there might exist such string that start from from right but connect with left to form a minimal lexicographical string rotation.

5) Implementations:

Detail Duval Solution - O(n) Time - O(n) Auxiliary Space

The idea is that when we factorize duplicated string $$$t = s + s$$$
Then the answer will be a substring of maximum starting position $$$p$$$ not exceed $$$|s|$$$
The proves is already inside the code

#include <algorithm>
#include <iostream>

using namespace std;

/// Find starting position of minimum acyclic string in (s)
int min_cyc(string s)
{
    int n = s.size(); /// the real size of the string
    s += s; /// for convention since we are deadling with acyclic

    ///
    /// s = s1 + s2 + s3
    /// s1 = s[1..l-1] is handled
    /// s2 = s[l..r]   is handling
    /// s3 = s[p..n]   is going to be handled
    /// 

    int res = 0; /// minimum acyclic string
    /// while (s2) is a lyndon word, try to add s2 with s[p]
    for (int l = 0; l < n; )
    {
        ///
        /// - Case 1: 
        ///     If (s) is fully ordered, then return 0
        ///     Surely will this loop make [l..r] = [0..n-1]
        ///     Ans it is currently that (l = 0) 
        ///     => res = l is a correct answer
        ///
        /// - Case 2:
        ///     Minimum acyclic string s' = s[l..r] that 0 <= l < n <= r < 2n
        ///     Also if s2 is s', then the loop will extend its (r >= n)
        ///     Since l < n, the latest (l) will create s'    
        ///     => res = l is a correct answer
        /// 
        /// Hence in both cases, res = last(l) will return a correct answer
        ///
        res = l;

        /// Extend as much as possible lyndon word s2 = s[l..r]
        int r = l, p = l + 1;
        while (p < s.size())
        {
            /// (s2 + s[p]) is not a lyndon word
            if (s[r] > s[p]) 
            {
                break;
            }

            /// (s2 + s[p]) is stil a lyndon word, hence extend s2
            if (s[r] == s[p]) 
            {
                ++r;
                ++p;
                continue;
            }
            
            /// (s2 + s[p]) is a lyndon word, but it may be a repeated string
            if (s[r] < s[p]) 
            {
                r = l;
                ++p;
                continue;
            }
        }

        /// The lyndon word may have the form of s2 = sx + sx + .. + sx like "12312123"
        while (l <= r) 
        {
            /// s[l..l + p - r] is sx
            l += p - r;
        }
    }

    /// Dont forget to return the value ;)
    return res;
}

/// If you wanna know about that minimum acyclic string
string cyc(string s, int k)
{
    rotate(s.begin(), s.begin() + k, s.end());
    return s;
}

int main()
{
    /// Input 
    string s;
    cin >> s;

    /// Output
    cout << min_cyc(s) << '\n';
//  cout << cyc(s, min_cyc(s));
    return 0;
}

None-comment Duval Solution - O(n) Time - O(n) Auxiliary Space

#include <iostream>

using namespace std;

int min_cyc(string s)
{
    int n = s.size();
    s += s;

    int res = 0;
    for (int l = 0; l < n; )
    {
        res = l;
        int r = l, p = l + 1;
        for (; p < s.size() && s[r] <= s[p]; ++r, ++p)
            if (s[r] < s[p]) r = l - 1;

        while (l <= r) l += p - r;
    }

    return res;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s) << '\n';
    return 0;
}

6) Optimization:

We dont need to duplicate the string
We dont need to continue the function when $$$S_2$$$ has the size of $$$n$$$ (or when starting point of $$$S_2$$$ $$$< n \leq$$$ ending point of $$$S_2$$$)
We just skip those cases $$$S_2 = s_x + s_x + \dots + s_x + s_y$$$ but jump directly to the next character

Optimized Duval Solution - O(n) Time - O(1) Auxiliary Space

#include <iostream>
    
using namespace std;
    
int min_cyc(const string &s) 
{
    int n = s.size();
    int res = 0;
    for (int l = 0; l < n; )
    {
        res = l;
        int r = l, p = l + 1;
        for (; r < n; ++r, ++p) /// If there is such string found, then its length wont exceed |s|
        {
            char c = (p < n) ? s[p] : s[p - n]; /// to avoid modulo
            if (s[r] > c) break;
            if (s[r] < c) r = l - 1;
        }        
        l = max(r, l + p - r); /// just skip those (s2 = sx + sx + ... + sx + sy) cases
    }
    return res;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

VII. Suffix Array Approach

A) Approach

1) Wrong Idea:

Store all suffix of $$$S$$$ (with its index) into an array then sort it (in lexicographical order).

Let $$$S = $$$"$$$baabaa$$$" then there are such suffixes: "$$$a$$$", "$$$aa$$$", "$$$baa$$$", "$$$abaa$$$", "$$$aabaa$$$", "$$$baabaa$$$"

Since the first having smallest lexicographical order among all suffixes then its index must be the answer ?

2) Prove it wrong:

We are dealing with circular string, and the normal suffix doesnt have enough information.
Let $$$A, B$$$ are some different suffixes, and $$$A < B$$$, but if you extend both by size $$$n$$$ then ($$$A < B$$$) might no longer be true, but because of the lexicographical sorting order and all suffixes here having different size, therefore the one with smaller size is considered as "smaller".

3) Correct idea:

Sort all suffixes of $$$S + S$$$ with its index
Find smallest suffix whose index in $$$0 \dots |s| - 1$$$
Compare with other suffixes to find one with smaller index

4) Algorithm:

For convention

Let smaller / biggers means in lexicographical order
Let equal means two suffixes have their first $$$n$$$ characters equal (first means prefix of suffix)
Let $$$T = S + S + c$$$ ($$$c$$$ is smaller than any character in $$$S$$$)

First we store all suffixes of $$$T$$$ (with its index) into an array then sort it.
We can easily find the first suffix $$$X$$$ whose index $$$t$$$ is between ($$$0 \dots n - 1$$$) (those suffixes whose length are at least $$$n$$$)
Since there might be some other equal suffixes of size $$$n$$$ whose index is smaller, we compare strings and find it

5) Optimizations:

Since all suffixes are sorted and we only need to care smallest suffixes (all among them we select the one with smallest index)
Hence we only need to care about consecutive suffixes with $$$X$$$ (they are either bigger than $$$X$$$ or equal to $$$X$$$ but with smaller index)

Let $$$Y$$$ is the current suffix (consecutive to right of $$$X$$$ of course)
If $$$X = Y$$$ then $$$Y$$$ must have smaller index, hence we update the result
Otherwise since $$$Y > X$$$, hence we break

6) Examples:

Let $$$SA[]$$$ (Suffix Array) is the array which stores order of suffixes

$$$SA[i] < SA[j]$$$ means suffix start from $$$i$$$ is smaller than suffix start from $$$j$$$

Let $$$LCP[]$$$ (Longest Common Prefix) is the array with store the longest common prefix between two consecutive suffixes in the order of $$$SA[]$$$

$$$LCP[x] = k$$$ means suffix start from $$$SA[x]$$$ and $$$SA[x + 1]$$$ have their longest common prefix is $$$k$$$

For $$$S = $$$"$$$aaba$$$"

Example

At (0): SA[i] = 8 | LCP[i] = 0 | Suffix we care: <p>Unable to parse markup [type=CF_MATHJAX]</p>$$$ | Real suffix = $$$\$$
At (1): SA[i] = 7 | LCP[i] = 0 | Suffix we care: a$   | Real suffix = a$
At (2): SA[i] = 3 | LCP[i] = 1 | Suffix we care: aaab | Real suffix = aaaba$
At (3): SA[i] = 4 | LCP[i] = 2 | Suffix we care: aaba | Real suffix = aaba$
At (4): SA[i] = 0 | LCP[i] = 0 | Suffix we care: aaba | Real suffix = aabaaaba$
At (5): SA[i] = 5 | LCP[i] = 1 | Suffix we care: aba$ | Real suffix = aba$
At (6): SA[i] = 1 | LCP[i] = 3 | Suffix we care: abaa | Real suffix = abaaaba$
At (7): SA[i] = 6 | LCP[i] = 0 | Suffix we care: ba$  | Real suffix = ba$
At (8): SA[i] = 2 | LCP[i] = 2 | Suffix we care: baaa | Real suffix = baaaba$

t = 2
answer = 3

7) Notices:

With $$$T = S + S$$$ instead of $$$T = S + S + c$$$, there are actually have plentiful implementations to make algorithm right, even only $$$T = S$$$ there are ways to solve the problem too. But they might much more complex, slow or just correct for this problem that you should be careful.
Smallest lexicographical suffix might not have smallest index

8) Implementations:

There are many ways to implement suffix array (I wont go deeper)

Spoiler

$$$O(n^2 log n)$$$ by storing and sorting by the whole suffix
$$$O(n^2)$$$ by improving the sorting
$$$O(n log^2(n))$$$ by only provide positions and sort by the $$$2^0, 2^1, \dots, 2^k \approx n$$$ character
$$$O(n log(n))$$$ by using radix sort instead
$$$O(n log(log(n)))$$$ by improving how you sort further deep
$$$O(n)$$$ by using DC3 algorithm

Simple Suffix Array Construction - O(n^2 log n) time - O(n) space

When you have little time less to code

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>
 
using namespace std;

#define all(x) (x).begin(), (x).end()
vector<int> cal_SA(const string &s)
{
    vector<int> p(s.size());
    iota(all(p), 0);
    sort(all(p), [&](int x, int y) { return s.substr(x, s.size() - x) < s.substr(y, s.size() - y); });
    return p;
}

int main()
{
    string s;
    cin >> s;
    s += '$';

    vector<int> SA = cal_SA(s);
    for (int x : SA) 
    {
        string suffix = s.substr(x);
        cout << x << ' ' << suffix << endl;
    }

    return 0;
}

Simple Improved Suffix Array Construction - O(n log^2(n)) time - O(n) space

When you have little time less to code but not to TLE

#include <algorithm>
#include <iostream>
#include <vector>
 
using namespace std;

typedef long long ll;
#define all(x) (x).begin(), (x).end()
vector<int> cal_SA(const string &s)
{
    const ll n = s.size();
    vector<int> p(n);
    vector<ll> c(n), d(n);
    for (int i = 0; i < n; ++i) p[i] = i, c[i] = s[i];

    for (int k = 0; k < n; k = !k ? 1 : k <<= 1)
    {
        for (int i = 0; i < n; ++i) d[i] = n * c[i] + c[(i + k) % n];
        sort(all(p), [&](int i, int j) { return d[i] < d[j]; });
 
        c[p[0]] = 0;
        for (int i = 1; i < n; ++i)
            c[p[i]] = c[p[i - 1]] + (d[p[i]] != d[p[i - 1]]);
    }
 
    return p;
}
int main()
{
    string s;
    cin >> s;
    s += '$';
    
    vector<int> SA = cal_SA(s);
    for (int x : SA) 
    {
        string suffix = s.substr(x);
        cout << x << ' ' << suffix << endl;
    }

    return 0;
}

Optimized Suffix Array Construction - O(n log(n)) time - O(n) space

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>
 
using namespace std;

typedef long long ll;
#define all(x) (x).begin(), (x).end()
vector<int> cal_SA(const string &s)
{
    int n = s.size();
    vector<int> f(2 * n), p(n), c(n);
    for (int i = 0; i < n; ++i)
        f[i] = f[i + n] = i;
        
    { /// k = 0
        /// sort one character
        iota(all(p), 0);
        stable_sort(all(p), [&](int i, int j) { return s[i] < s[j]; });

        /// assign initial order
        c[p[0]] = 0;
        for (int i = 1; i < n; ++i)
            c[p[i]] = c[p[i - 1]] + (s[p[i]] != s[p[i - 1]]);
    }
 
    vector<int> cn(n), pn(n), cnt(n + 1);
    for (int k = 1; k < n; k <<= 1)
    {
        /// radix sort
        for (int &x : p) x = f[x - k + n];
        fill(all(cnt), 0);
        for(int x : c) cnt[x + 1]++;
        partial_sum(all(cnt), cnt.begin());
        for(int x : p) pn[cnt[c[x]]++] = x;
        swap(p, pn);
 
        /// assign new order
        cn[p[0]] = 0;
        for (int i = 1; i < n; ++i)
        {
            int l = p[i - 1], r = p[i];
            cn[r] = cn[l] + (c[l] != c[r] || c[f[l + k]] != c[f[r + k]]);
        }
        swap(c, cn);
    }
    
    return p;
}

int main()
{
    string s;
    cin >> s;
    s += '$';
    
    vector<int> SA = cal_SA(s);
    for (int x : SA) 
    {
        string suffix = s.substr(x);
        cout << x << ' ' << suffix << endl;
    }

    return 0;
}

There are many ways to implement longest common prefix: (I wont go deeper)

Spoiler

Here are main implementations

Simple Optimized Implementation - O(n log n) SA - O(n) LCP - O(n log n) total time - O(n) auxiliary space

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>
 
using namespace std;
 
#define all(x) (x).begin(), (x).end()
vector<int> cal_SA(const string &s)
{
    int n = s.size();
    vector<int> f(2 * n), p(n), c(n);
    for (int i = 0; i < n; ++i)
        f[i] = f[i + n] = i;
        
    { /// k = 0
        iota(all(p), 0);
        stable_sort(all(p), [&](int i, int j) { return s[i] < s[j]; });

        c[p[0]] = 0;
        for (int i = 1; i < n; ++i)
            c[p[i]] = c[p[i - 1]] + (s[p[i]] != s[p[i - 1]]);
    }
 
    vector<int> cn(n), pn(n), cnt(n + 1);
    for (int k = 1; k < n; k <<= 1)
    {
        for (int &x : p) x = f[x - k + n];
        fill(all(cnt), 0);
        for(int x : c) cnt[x + 1]++;
        partial_sum(all(cnt), cnt.begin());
        for(int x : p) pn[cnt[c[x]]++] = x;
        swap(p, pn);
 
        cn[p[0]] = 0;
        for (int i = 1; i < n; ++i)
        {
            int l = p[i - 1], r = p[i];
            cn[r] = cn[l] + (c[l] != c[r] || c[f[l + k]] != c[f[r + k]]);
        }
        swap(c, cn);
    }
    
    return p;
}
 
vector<int> cal_LCP(const string &s, const vector<int> &p)
{
    int n = s.size();
    vector<int> LCP(n), c(n);
    for (int i = 0; i < n; i++) c[p[i]] = i;
    for (int i = 0, k = 0; i < n; ++i) if (p[i] != n - 1)
    {
        for (int j = p[c[i] - 1]; s[i + k] == s[j + k]; ++k);
        LCP[c[i]] = k;
        if (k) --k;
    }
 
    return LCP;
}
 
int min_cyc(const string &s)
{
    if (count(all(s), s[0]) == s.size()) return 0;
    int n = s.size();
    vector<int> SA = cal_SA(s + s + '$');
    vector<int> LCP = cal_LCP(s + s + '$', SA);

    for (int i = 0; i < 2 * n + 1; ++i) if (SA[i] < n)
    {
        while (i <= 2 * n && SA[i] > SA[i + 1] && LCP[i + 1] >= n) ++i;
        return SA[i];
    }
}
 
int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
}

VIII. Suffix Automation Approach

A) Approach

1) Algorithm:

Construct Suffix Automaton of $$$s + s$$$. The automaton itself is a path graph of all string rotations
Inorder to get lexicographically minimal string rotation, we traverse from initial state $$$|s|$$$ times by smallest edge (in lexicographical order), Let this state is $$$X$$$
For inding its index, we find the first occurence of minal strings, equivalence to first occurence of state $$$X$$$. We construct occurences of each state. Let first occurence of state $$$P$$$ is $$$f(P)$$$
Let not go deeper in construction here. We have least rotation $$$r = f(X) - |s| + 1$$$

2) Implementations:

Suffix Automaton Solution - O(n log(alphabet)) time - O(n) auxiliary space

#include <iostream>
#include <vector>
#include <map>

using namespace std;

struct state 
{
    // leng : longest lengh of all strings in the pth class
    // minp : minimum position of occurence
    // link : provide you the parent of sa_size
    // edge : labeled edge from node sa_size

    int leng;
    int minp;
    int link;
    map<char, int> edge;
};

const int LIM = 2e6 + 26;
state sa[LIM * 2];
int sa_size;
int last;

void construct(const string &s) 
{
    /// Initialization
    last = 0;
    sa[0].leng = 0;
    sa[0].minp = -1;
    sa[0].link = -1;
    sa_size = 1;

    /// Extend suffix automaton
    for (char c : s)
    {
        /// Make new state
        int cur = sa_size++;
        sa[cur].leng = sa[last].leng + 1;
        sa[cur].minp = sa[cur].leng - 1;

        int u = last;
        /// Find such state (u) linked to (last)
        for (; u != -1 && !sa[u].edge.count(c); u = sa[u].link)
            sa[u].edge[c] = cur;

        last = cur;
        if (u == -1) continue; /// (last) is linked with inital state only
        
        int v = sa[u].edge[c];
        if (sa[u].leng + 1 == sa[v].leng) /// we dont split state (v) here
        {
            sa[cur].link = v;
            continue;
        } 
         
        /// Split state (v) by making its clone (v')
        int nxt = sa_size++;
        sa[nxt].leng = sa[u].leng + 1;
        sa[nxt].minp = sa[v].minp;
        sa[nxt].link = sa[v].link;
        sa[nxt].edge = sa[v].edge;

        /// Find such state linked with it
        sa[v].link = sa[cur].link = nxt;
        for (; u != -1 && sa[u].edge[c] == v; u = sa[u].link)
            sa[u].edge[c] = nxt;
    }
}

int min_cyc(const string &s)
{
    /// Construct Suffix Automata of all string rotations
    construct(s + s);

    /// Find lexicographically minimal string rotation
    int root = 0;
    for (int i = 0; i < s.size(); ++i)
    {
   /// cout << sa[root].edge.begin() -> first; /// if you want to output such string
        root = sa[root].edge.begin() -> second;
    }

    /// Find its first occurence
    return sa[root].minp - s.size() + 1;
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

IX. Elimination Tournament Algorithm

So here are my other approachs for this problem. I dont know whether there are names for these algorithm. I just got the ideas from soccer elimination and apply to this problem. By self-researching and computing, finaly I archive linear algorithm
About the algorithm name, if one know any paper about it before, please tag me in and I will tag the first authurs of those/these algorithms
The simplest idea is that we have a list of candidate, and keep doing elimination until we have the final champion

A) Dual Elimination

1) Idea:

So the ideas is that for the initial candidate list, we will select 2 consecutive candidates, then compare two circular strings of the length equal to the minimum gap of the two candidates

2) Algorithm:

Let $$$t = s + s$$$, and the selected candidates are $$$l$$$ and $$$r$$$. For convention, let $$$l < r$$$, then we will compare $$$A$$$ and $$$B$$$, for which $$$A = t[l \dots l+(r-l)-1]$$$ and $$$B = t[r \dots r+(r-l)-1]$$$.

Case 1: $$$A < B$$$, we will select the starting point of $$$A$$$ become the next candidate, it is $$$l$$$
Case 2: $$$A > B$$$, we will select the starting point of $$$B$$$ become the next candidate, it is $$$r$$$
Case 3: $$$A = B$$$, we will select the smaller starting point of $$$A$$$ and $$$B$$$, it is $$$min(l, r)$$$

3) Example:

$$$S = $$$"$$$abaabaaabaababaaabaaababaab$$$"

Round I - 27 candidates

Candidate: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    abaabaaabaababaaabaaababaab
Dual:      AABBCCDDEEFFGGHHIIJJKKLLMM?
Compare:   
A: a < b 
B: a = a
C: b > a
D: a = a 
E: b > a 
F: a < b 
G: a < b 
H: a = b
I: a < b
J: a = b
K: a < b
L: a < b 
M: a = b
?: b ?

Round II - 13 candidates

Candidate: 0 2 5 6 9 12 14 16 18 20 22 24 26
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    a_a__aa__a__a_a_a_a_a_a_a_b
Dual:      A A  BB  C  C D D E E F F ?
Compare:   
A: ab > aa 
B: aa = aa
C: aa < ab
D: aa < ab
E: aa < ab
F: ab > aa
?: b ?

Round III - 7 candidates

Candidate: 2 5 9 14 18 24 26
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    __a__a___a____a___a_____a_b
Dual:        A  A   B    B   C     C ?
Compare:   
A: aab > aaa 
B: aabab > aaaba
C: aaabab < aababa
?: b ?

Round IV - 4 candidates

Candidate: 5 14 18 26
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a___a_______b
Dual:           A        A   B       B
Compare:   
A: aaabaab > aaabaaa 
B: aaabab < babaab

Round V - 2 candidates

Candidate: 14 18
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    ______________a___a________
Dual:                    A   A
Compare:   
A: aaab = aaab

Champion


Winner:  14 - V
Number:  012345678901234567890123456
String:  abaabaaabaababaaabaaababaab
Remain:  ______________a____________
Dual:                  *   
Compare: Stopped

4) Notice:

We are comparing circular string !

5) Complexity:

Normal Version: Stable Complexity
Optimized Version: Linear when characters are unique
At the $$$k$$$ elimination, the number of participants will be about $$$O(\frac{n}{2^k})$$$
At the $$$k$$$ elimination, each string will be compared $$$O(2^k)$$$ times
Hence the total complexity is $$$O(\overset{\lceil log_n \rceil}{\underset{k = 1}{\Sigma}}(\frac{n}{2^k} \times 2^k))$$$ $$$=$$$ $$$O(n \times 1 + \frac{n}{2} \times 2 + \dots + \frac{n}{2^{\lceil log_2(n) \rceil}} \times \lceil log_2(n) \rceil)$$$ $$$=$$$ $$$O(n \times \lceil log_2(n) \rceil)$$$ $$$=$$$ $$$O(n\ log\ n)$$$

6) Implementations:

Detail Dual Elimination Algorithm - O(n log n) time - O(n) auxiliary space

#include <iostream>
#include <vector>

using namespace std;

/// Finding minimal string right-rotation to make string minimal lexicographical
int min_cyc(string s)
{
    int n = s.size(); /// For convention: n = |s|

    /// Candidate list
    vector<int> a(n);
    for (int i = 0; i < n; ++i)
        a[i] = i; /// Acyclic string start from (i) might the one with minimal lexicographical

    /// For convention
    s += s;

    /// Keep doing dual-elimination until we have the final champion
    while (a.size() > 1)
    {
        /// Next Candidate List
        vector<int> b;
        for (int i = 0; i + 1 < a.size(); i += 2)
        {
            /// Notice that for convention, l < r
            int l = a[i];
            int r = a[i + 1];

            /// Compare substring of size d = r - l
            /// -  Left string: A = s[l...l+d]
            /// - Right string: B = s[r...r+d]
            ///
            /// # Case (A < B): Left string is smaller 
            ///   > We will pick (a[i]) as a winner
            ///
            /// # Case (A > B): Right string is smaller
            ///   > We will pick (a[i + 1]) as a winner
            /// 
            /// # Case (A = B): Both string are equal
            ///   > We will pick one whose starting point is smaller
            ///

            if (s.substr(l, r - l) <= s.substr(r, r - l))
                b.push_back(l);
            else 
                b.push_back(r);
        }

        /// The remain candidate that not join the elimination
        if (a.size() & 1) b.push_back(a.back());
        
        /// Do the next elimination
        swap(a, b);
    }

    /// The final winner
    return a.front();
}

int main()
{
    /// Input
    string s;
    cin >> s;

    /// Output
    cout << min_cyc(s);
    return 0;
}

Noncomment Dual Elimination Algorithm - O(n log n) time - O(n) auxiliary space

#include <iostream>
#include <vector>

using namespace std;

int min_cyc(string s)
{
    int n = s.size();
    vector<int> a(n);
    for (int i = 0; i < n; ++i)
        a[i] = i;

    s += s;
    while (a.size() > 1)
    {
        vector<int> b;
        for (int i = 0; i + 1 < a.size(); i += 2)
        {
            int l = a[i];
            int r = a[i + 1];
            if (s.substr(l, r - l) <= s.substr(r, r - l))
                b.push_back(l);
            else 
                b.push_back(r);
        }

        if (a.size() & 1) b.push_back(a.back());
        swap(a, b);
    }

    return a.front();
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

Optimized and Simple Dual Elimination Algorithm - O(n log n) time - O(n) auxiliary space

#include <algorithm>
#include <iostream>
#include <deque>

using namespace std;

int min_cyc(const string &s)
{
    char c = *min_element(s.begin(), s.end());
    int n = s.size();
    deque<int> S; 
    for (int i = 0; i < n; ++i)
        if (s[i] == c)
            S.push_back(i); 

    while (S.size() > 1)
    {
        int a = S.front(); S.pop_front();
        int b = S.front(); S.pop_front();
        if (a > b) swap(a, b);
        int res = min(a, b);
        for (int l = a, r = b, p = r - l; p >= 1; --p, ++l, ++r)
        {
            if (l == n) l = 0;
            if (r == n) r = 0;
            if (s[l] < s[r]) { res = a; break; }
            if (s[l] > s[r]) { res = b; break; }
        }
        S.push_back(res);
    }

    return S.back();
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

B) Substring Based Elimination

1) Idea:

From inital set of candidates, keep comparing substrings and eliminate the bigger one until we get the final champion

2) Algorithm:

First, let $$$mn$$$ / $$$mx$$$ is the minimal / maximal character of $$$s$$$

Define: $$$mn = \underset{c \in S}{min}(c)$$$ and $$$mx = \underset{c \in S}{max}(c)$$$

Pre-elimination Round: We take the position $$$p$$$ that $$$s[p] = mn$$$. Since all other positions wont provide Minimal Lexicographical Circular String

Define: $$$candidate = $$$ { $$$p\ \ |\ \ p \in $$$ { $$$0, 1, \dots, |s| - 1$$$ } $$$ \cap s[p] = mn$$$ }

Then we take maximumly $$$n - 1$$$ Rounds from $$$d = 1$$$ to $$$d = n - 1$$$. For all current candidate, add the next character to it. Find the minimal substring, and eliminater unsatisfied others.

3) Optimization

Optimization: At the $$$p$$$ round we will find $$$c = $$$ minimal character among all candidate's $$$pth$$$-next character, then traverse again and eliminate the one whose $$$pth$$$-nxt character is greater than $$$c$$$
Define: $$$s[x] = s[x\ mod\ |s|]$$$ and $$$c = \underset{p \in\ candidate}{min}(s[p + d])$$$ and $$$next\ candidate = $$$ { $$$p\ \ |\ \ p \in candidate \cap s[p + d] = c$$$ }

4) Example:

$$$S = $$$"$$$abaabaaabaababaaabaaababaab$$$"

Pre-elimination - 26 candidates

Candidate: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    abaabaaabaababaaabaaababaab
Candidate Substring:
0 - a 
1 - b
2 - a
3 - a
4 - b
5 - a
6 - a
7 - a
8 - b
9 - a
10 - a
11 - b
12 - a
13 - b 
14 - a
15 - a
16 - a
17 - b
18 - a
19 - a
20 - a
21 - b
22 - a
24 - a
25 - a
26 - b
Winner: a*

Round I - 18 candidates

Candidate: 0 2 3 5 6 7 9 10 12 14 15 16 18 19 20 22 24 25 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    a_aa_aaa_aa_a_aaa_aaa_a_aa_
Candidate Substring:
0 - ab
2 - aa
3 - ab
5 - aa
6 - aa
7 - ab
9 - aa
10 - ab
12 - ab 
14 - aa
15 - aa
16 - ab
18 - aa
19 - aa
20 - ab
22 - ab
24 - aa
25 - ab
Winner: aa*

Round II - 9 candidates

Candidate: 2 5 6 9 14 15 18 19 24 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    __aa_aa__a____aa__aa____a__
Candidate Substring:
2 - aab
5 - aaa
6 - aab
9 - aab
14 - aaa
15 - aab
18 - aaa
19 - aab
24 - aab
Winner: aaa*

Round III - 3 candidates

Candidate: 5 14 18 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a___a________
Candidate Substring:
5 - aaab
14 - aaab
18 - aaab
Winner: aaab*

Round IV - 3 candidates

Candidate: 5 14 18 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a___a________
Candidate Substring:
5 - aaaba
14 - aaaba
18 - aaaba
Winner: aaaba*

Round V - 3 candidates

Candidate: 5 14 18 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a___a________
Candidate Substring:
5 - aaabaa
14 - aaabaa
18 - aaabab
Winner: aaabaa*

Round VI - 2 candidates

Candidate: 5 14
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a____________
Candidate Substring:
5 - aaabaab
14 - aaabaaa
Winner: aaabaaa*

Champion

Winner:   14 - VI
Number:   012345678901234567890123456
String:   abaabaaabaababaaabaaababaab
Remain:   ______________a____________
Champion: aaabaaa*

5) Notice

After all eliminations, if there are more than one candidate, select the one with lowest index

6) Complexity

If the string is $$$k$$$-repeated strings (each substring size $$$d$$$) or similar then there will be atleast $$$k$$$ candidate after $$$d$$$ eliminations. About $$$O(k \times d)$$$ in the best cases
Best case to say, like a string where all characters are unique, then it will be Linear $$$O(n)$$$
Worst case to say, like a string where all characters are the same, then it will be $$$O(n \times n) = O(n^2)$$$

7) Implementations

For convention, let not use obviously-not-optimized code

Detail Substring Based Elimination - O(n^2) time - O(n) auxiliary space

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

/// Finding minimal string right-rotation to make string minimal lexicographical
int min_cyc(string s)
{
    int n = s.size(); /// For convention: n = |s|
    s += s;

    /// For convention
    char mx = *max_element(s.begin(), s.end());
    char mn = *min_element(s.begin(), s.end());

    /// Candidate list
    vector<int> a;

    /// Pre-elimination
    for (int i = 0; i < n; ++i)
        if (s[i] == mn)
            a.push_back(i);

    /// Doing dth elimination
    for (int d = 1; d < n; ++d)
    {
        /// Minimal character among substrings
        char c = mx;
        for (int x : a) c = min(c, s[x + d]);

        /// New candidate list
        vector<int> b;

        /// Elimination
        for (int x : a)
            if (s[x + d] == c)
                b.push_back(x);

        swap(a, b);
        if (a.size() == 1) break; /// Found final candidate
    }

    return a.front(); /// The one with smallest index
}

int main()
{
    /// Input
    string s;
    cin >> s;

    /// Output
    cout << min_cyc(s);
    return 0;
}

Noncomment Substring Based Elimination - O(n^2) time - O(n) auxiliary space

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

int min_cyc(string s)
{
    int n = s.size();
    s += s;

    char mx = *max_element(s.begin(), s.end());
    char mn = *min_element(s.begin(), s.end());
    vector<int> a;
    for (int i = 0; i < n; ++i)
        if (s[i] == mn)
            a.push_back(i);

    for (int d = 1; d < n; ++d)
    {
        char c = mx;
        for (int x : a) c = min(c, s[x + d]);

        vector<int> b;
        for (int x : a)
            if (s[x + d] == c)
                b.push_back(x);

        swap(a, b);
        if (a.size() == 1) break;
    }

    return a.front();
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

C) Elimination and Colliding

1) Idea:

This is the improvement from Substring Based Elimination. It takes me few hours to think and to implement the idea
If two substrings colide then one will be eliminate

2) Algorithm:

At the $$$dth$$$ elimination
Let $$$s[p] = s[p\ mod\ |s|]$$$ for convention
Let $$$l, r$$$ are the candidates and $$$l < r$$$
Let $$$A$$$ is a circular substring of $$$s$$$ and $$$A = s[l \dots l + d - 1]$$$
Let $$$B$$$ is a circular substring of $$$s$$$ and $$$B = s[r \dots r + d - 1]$$$
If $$$A$$$ and $$$B$$$ collide ($$$l + d - 1 = r$$$), we will select $$$l$$$ and eliminate $$$r$$$ (since $$$l < r$$$)

3) Proof:

Let $$$A_1, A_2, \dots, A_k$$$ are consecutive circular substring of $$$s$$$ that satisfy

$$$\Sigma(|A_i|) = |S|$$$

$$$A_i = s[l_1, r_i]$$$

$$$l_{i+1} \equiv r_i + 1$$$ (mod $$$n$$$)

No loss of generality. Let $$$A_1$$$ and $$$A_2$$$ are the substring we are comparing, for such $$$l1, l2$$$ are the candidates, then we also have $$$l_1 = min(l_i)$$$
In lexicographical order (in comparing the strings) to say

If $$$A_1 < A_2$$$ then $$$l_2$$$ will be eliminated
If $$$A_1 > A_2$$$ then $$$l_1$$$ will not be the next candidate
If $$$A_1 = A_2 = \dots = A_k$$$ then $$$min(l_i)$$$ will be the champion. Hence $$$l_1$$$ is
Else there is such $$$p$$$ that $$$A_1 = A_2 = \dots = A_p \neq A_{p+1}$$$

There is only the case $$$A_p < A_{p+1} \Rightarrow A_1A_2\dots A_p < A_2A_3 \dots A_{p+1}$$$, hence $$$l_2$$$ will be eliminated

Because for this case $$$A_1 = A_2 = A_p > A_{p+1}$$$ — the contradiction happened where $$$l_1, l_2, \dots, l_p$$$ are all candidates (Since $$$A_1 = A_2 = \dots = A_p$$$), but $$$A_{p+1}$$$ is smaller ($$$l_{p+1}$$$ should be the candidate instead

Let $$$p_1, p_2, \dots, p_k$$$ the candidates and $$$A_1, A_2, \dots, A_k$$$ the candidate circular substrings of $$$s$$$ that start from those position

$$$S = aabaaa\dots$$$ and $$$A_1 = aab$$$, $$$A_2 = aaa$$$ then $$$A_1$$$ is eliminated, $$$p_2 = 3$$$ the next candidate
$$$S = aabaac\dots$$$ and $$$A_1 = aab$$$, $$$A_2 = aac$$$ then $$$A_2$$$ is eliminated, $$$p_1 = 0$$$ the next candidate
$$$S = aabaabaab\dots aab$$$ and $$$A_1 = A_2 = \dots = A_k = aab$$$, then $$$p_1 = 0$$$ the champion
$$$S = aabaabaab\dots aabaac\dots$$$ and $$$A_1 = A_2 = \dots = A_p = aab \neq A_{p+1} = aac$$$, then $$$p_1 = 0$$$ the next candidate
$$$S = aabaabaab\dots aabaaa\dots$$$ and $$$A_1 = A_2 = \dots = A_p = aab \neq A_{p+1} = aaa$$$, then contradiction happened since $$$aaa$$$ should be the candidate instead

4) Example:

$$$S = $$$"$$$abaabaaabaababaaabaaababaab$$$"

Pre-elimination - 26 candidates

Candidate: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    abaabaaabaababaaabaaababaab
Minimal Elimination: a*  -> 0 2 3 5 6 7 9 10 12 14 15 16 18 19 20 22 24 25 
Collide Elimination: aa* -> 0 2 5 7 9 12 14 16 18 20 22 24

Round I - 18 candidates

Candidate: 0 2 5 7 9 12 14 16 18 20 22 24 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    a_a__a___a__a_a___a___a_a__
Minimal Elimination: aa*   -> 2 5 9 14 18 24
Collide Elimination: aaaa* -> 2 5 9 14 18 24

Round II - 6 candidates

Candidate: 2 5 9 14 18 24 
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    __a__a___a____a___a_____a__
Minimal Elimination: aaa*    -> 5 14 18 
Collide Elimination: aaaaaa* -> 5 14 18

Round III - 3 candidates

Candidate: 5 14 18
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a___a________
Minimal Elimination: aaab*     -> 5 14 18 
Collide Elimination: aaabaaab* -> 5 14

Round IV - 2 candidates

Candidate: 5 14
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a____________
Minimal Elimination: aaaba*      -> 5 14
Collide Elimination: aaabaaaaba* -> 5 14

Round V - 2 candidates

Candidate: 5 14
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a____________
Minimal Elimination: aaabaa*       -> 5 14
Collide Elimination: aaabaaaaabaa* -> 5 14

Round VI - 2 candidates

Candidate: 5 14
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a____________
Minimal Elimination: aaabaaa*        -> 14
Collide Elimination: aaabaaaaaabaaa* -> 14

Champion

Winner:   14 - VI
Number:   012345678901234567890123456
String:   abaabaaabaababaaabaaababaab
Remain:   ______________a____________
Champion: aaabaaa*

5) Notice:

Collision Detecting is on Circular Substring

6) Complexity:

If the string continues to grown, then they will collide and dies
Trying to extend string size $$$k$$$ mean minimize candidate list size $$$O(k \times \lfloor \frac{n}{k} \rfloor) = O(n)$$$
Trying to maximize candidate list size $$$k$$$ mean reduce the string size $$$O(f(x)) = O(k \times (\lfloor \frac{n}{k} \rfloor \times k - k)) + O(f(\lfloor \frac{k}{2} \rfloor)) = O(k\ log\ k)$$$

7) Implementations:

For convention, let not use obviously-not-optimized code

Detail Elimination and Colliding - O(n log n) time - O(n) auxiliary space

#include <algorithm>
#include <iostream>
#include <deque>

using namespace std;

/// Finding minimal string right-rotation to make string minimal lexicographical
int min_cyc(string s)
{
    int n = s.size(); /// For convention: n = |s|
    s += s;

    /// For convention
    char mx = *max_element(s.begin(), s.end());
    char mn = *min_element(s.begin(), s.end());

    /// Candidate list
    deque<int> a;

    /// Pre-elimination
    for (int i = 0; i < n; ++i) if (s[i] == mn)
        if (a.empty() || a.back() != i - 1)
            a.push_back(i);

    /// Doing dth elimination
    for (int d = 1; d < n; ++d)
    {
        /// Minimal character among substrings
        char c = mx;
        for (int x : a) c = min(c, s[x + d]);

        /// New candidate list
        deque<int> b; 

        /// Elimination
        for (int x : a) if (s[x + d] == c)
            if (b.empty() || b.back() != x - d - 1)
                b.push_back(x);

        swap(a, b);
        if (a.size() == 1) break; /// Found final candidate
    }

    return a.front(); /// The one with smallest index
}

int main()
{
    /// Input
    string s;
    cin >> s;

    /// Output
    cout << min_cyc(s);
    return 0;
}

Elimination and Colliding - O(n log n) time - O(n) auxiliary space

#include <algorithm>
#include <iostream>
#include <deque>

using namespace std;

int min_cyc(string s)
{
    int n = s.size();
    s += s;

    char mx = *max_element(s.begin(), s.end());
    char mn = *min_element(s.begin(), s.end());
    deque<int> a;
    for (int i = 0; i < n; ++i) if (s[i] == mn)
        if (a.empty() || a.back() != i - 1)
            a.push_back(i);

    for (int d = 1; d < n; ++d)
    {
        char c = mx;
        for (int x : a) c = min(c, s[x + d]);

        deque<int> b; 
        for (int x : a) if (s[x + d] == c)
            if (b.empty() || b.back() != x - d - 1)
                b.push_back(x);

        swap(a, b);
        if (a.size() == 1) break;
    }

    return a.front();
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

D) Elimination and Merging

1) Idea:

This is the improvement from Elimination and Colliding. It takes me few days to think, to prove and to implement the idea
If many substrings collide, we can eliminate all as once. And jump directly to the1
From the above proof, we also have the property that when $$$p$$$ is one of the candidates then $$$repeated-string$$$ start from $$$p$$$ might be one of the candidate

2) Algorithm:

At the $$$dth$$$ elimination, let $$$p_1, p_2, \dots, p_k$$$ are the candidates, $$$A_1, A_2, \dots, A_k$$$ are circular substring of $$$s$$$ that $$$A_1 = s[p_1 \dots p_1 + d - 1]$$$. Notice that all of them are either collide ($$$p_i + d - 1 \equiv p_{i+1}$$$ (mod $$$n$$$)) or not intersect (because the intersect pairs are eliminated at their first collision)
Let $$$v$$$ is the maximum collision or/and exist such consecutive candidates $$$q_1, q_2, \dots, q_v$$$ that $$$A_{q_1}$$$ collides $$$A_{q_2}$$$ collides $$$\dots$$$ collides $$$A_{q_v}$$$. Then $$$A_{q_1}A_{q_2} \dots A_{q_v}$$$ is the best candidates up to the $$$(d \times v)$$$-th elimination.
Therefore, instead of eliminating all the weaker candidates, we can just merge them and teleport to the next elimination that they collides again

3) Proofs:

I think you can prove this based on Elimination and Colliding and Duval Lyndon Approach.
Good thing is that this will also prove the linear complexity of the algorithm

4) Example:

$$$S = $$$"$$$abaabaaabaababaaabaaababaab$$$"

Pre-elimination - 26 candidates

Candidate: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    abaabaaabaababaaabaaababaab
Minimal Elimination: a*           -> 0 2 3 5 6 7 9 10 12 14 15 16 18 19 20 22 24 25 
Merging Elimination: aaa* = (a)3* -> 5 14 18
Teleport Tournament: III

Round III - 3 candidates

Candidate: 5 14 18
Number:    012345678901234567890123456
String:    abaabaaabaababaaabaaababaab
Remain:    _____a________a___a________
Minimal Elimination: aaab*                           -> 5 14 18 
Merging Elimination: aaabaaab = (aaab)2* = ((a)3b)2* -> 14
Teleport Tournament: VIII

Champion

Winner:   14 - VIII
Number:   012345678901234567890123456
String:   abaabaaabaababaaabaaababaab
Remain:   ______________a____________
Champion: aaabaaab*

5) Notice:

If the candidates arent changed, teleport to next tournament not itself
The last candidate might merge with the first candidate
When merging, you should use modulo operator for safety
Because of its complex code, you can just ignore pre-elimination and play round $$$d = 1$$$ to $$$n$$$

6) Complexity:

It seem to be about $$$O(n\ log\ n)$$$ when each time the candidates is eliminate atleast a half
But actually the complexity is Linear $$$O(n)$$$ because every position is called maximumly once for preprocessing, twice times for comparings, thrice for merging only.

7) Implementations:

Detail Elimination and Merging - O(n) time - O(n) auxiliary space


#include <iostream>
#include <vector>

using namespace std;

/// Finding minimal string right-rotation to make string minimal lexicographical
int min_cyc(string s)
{
    /// For conventions:
    char mx = 0;
    for (char c : s)
    {
        mx = max(mx, c);
    }

    int n = s.size();
    s += s;

    /// Candidate list
    vector<int> a;
    for (int p = 0; p < n; ++p)
        a.push_back(p);

    /// dth elimination
    for (int d = 1; d <= n; ++d)
    {
        /// Minimal character among substrings
        char mn = mx;
        for (int p : a)
        {
            mn = min(mn, s[p + d - 1]);
        } 
        
        /// Longest Duplicated Minimal |d|-length Substring
        int k = 1;
        vector<int> v, f;
        for (int p : a)
        {
            /// This is not a minimal substring
            if (mn == s[p + d - 1])
            {
                /// Elimination
                if (v.empty() || (v.back() + 1LL * f.back() * d) % n != p) /// Maybe next candidate
                {
                    v.push_back(p);
                    f.push_back(1);
                }
                else 
                {
                    k = max(k, ++f.back()); /// Merging Case
                }
            }
        }

        /// Merge the last with the front if possible
        if ((v.back() + 1LL * f.back() * d) % n == v.front())
        {
            k = max(k, f.back() += f.front());
        }

        /// Next candidate list
        a.clear();
        for (int i = 0; i < v.size(); ++i)
        {
            if (f[i] == k) /// Ignore less repeated substrings since they arent optimal 
            {
                a.push_back(v[i]);
            }
        }

        /// Junp to the next tournament
        d = d * k;

        /// Found the winner
        if (a.size() == 1) break;
    }

    /// Return the winner
    return a.front();
}

int main()
{
    /// Input
    string s;
    cin >> s;

    /// Output
    cout << min_cyc(s);
    return 0;
}

Noncomment Elimination and Merging - O(n) time - O(n) auxiliary space


#include <iostream>
#include <vector>

using namespace std;

int min_cyc(string s)
{
    char mx = 0;
    for (char c : s)
    {
        mx = max(mx, c);
    }

    int n = s.size();
    s += s;

    vector<int> a;
    for (int p = 0; p < n; ++p)
        a.push_back(p);

    for (int d = 1; d <= n; ++d)
    {
        char mn = mx;
        for (int p : a)
        {
            mn = min(mn, s[p + d - 1]);
        } 
        
        int k = 1;
        vector<int> v, f;
        for (int p : a)
        {
            if (mn != s[p + d - 1]) continue;
            if (v.empty() || (v.back() + 1LL * f.back() * d) % n != p)
            {
                v.push_back(p);
                f.push_back(1);
            }
            else 
            {
                k = max(k, ++f.back());
            }

        }

        if ((v.back() + 1LL * f.back() * d) % n == v.front())
            k = max(k, f.back() += f.front());

        d = d * k;

        a.clear();
        for (int i = 0; i < v.size(); ++i)
        {
            if (f[i] == k)
            {
                a.push_back(v[i]);
            }
        }

        if (a.size() == 1) break;
    }

    return a.front();
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

8) Optimization

Since every position $$$p$$$, will in range $$$0 \dots 2 \times n - 1$$$, hence we can precalculate the modulo of each

Optimized Elimination and Merging - O(n) time - O(n) auxiliary space

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

#define all(x) (x).begin(), (x).end()
int min_cyc(const string &s)
{
    char mx = *max_element(all(s));
    int n = s.size();

    vector<int> a(n);
    for (int i = 0; i < n; ++i) 
        a[i] = c[i] = c[i + n] = i;
    
    for (int d = 1, k = 1; d <= n; d = k + 1)
    {
        char mn = mx;
        for (int p : a) mn = min(mn, s[c[p + d - 1]]);
        
        vector<int> v;
        for (int p : a) if (mn == s[c[p + d - 1]])
        {
            if (v.empty() || c[v.back() + f[v.back()]] != p)
                v.push_back(p), f[v.back()] = 0;
            
            k = max(k, f[v.back()] += d);
        }

        if (c[v.back() + f[v.back()]] == v.front())
            k = max(k, f[v.back()] += f[v.front()]);

        a.clear();
        for (int p : v) if (f[p] == k) a.push_back(p);
        if (a.size() == 1) return a.front();
    }
}

int main()
{
    string s;
    cin >> s;
    cout << min_cyc(s);
    return 0;
}

Rev.	Кто	Когда	Δ	Комментарий
en3	SPyofgame	2021-04-25 18:20:06	1111
en2	SPyofgame	2021-04-25 17:03:05	114
en1	SPyofgame	2021-04-25 17:01:45	134956	Initial revision (published)