CSES String Section Editorial

#	User	Rating
1	tourist	3856
2	jiangly	3747
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3591
6	gamegame	3477
7	Benq	3468
8	Radewoosh	3462
9	ecnerwala	3451
10	heuristica	3431

#	User	Contrib.
1	cry	167
2	-is-this-fft-	162
3	Dominater069	160
4	Um_nik	158
5	atcoder_official	157
6	Qingyu	156
7	djm03178	151
7	adamant	151
9	luogu_official	150
10	awoo	147

I've seen CF tutorials for many other sections of CSES but didn't see one for strings, so I thought of writing one. Constructive criticism (or just regular criticism) is always welcome!

Note that there are many ways to do a problem. For instance, linear time string searching can be done with hashing, KMP, Z-Algorithm, Aho-Corasick Automaton, etc. . I used the way that was most intuitive for me, but there might exist other ways that have shorter code.

1. Word Combinations

Solution

Code

#include <bits/stdc++.h>
using namespace std;
 
#define PB push_back
 
const int MOD = 1e9 + 7;
 
string S;
int K, I = 1, DP[5005];
 
vector<int> adj[500005];
 
struct node {
    int fail, ch[26] = {};
    vector<int> lens;
    string s = "";
} T[500005];
 
void insert(string s) {
    int x = 1;
    for (int i = 0; i < s.size(); i++) {
        if (T[x].ch[s[i] - 'a'] == 0)
            T[x].ch[s[i] - 'a'] = ++I;
        T[T[x].ch[s[i] - 'a']].s = T[x].s + s[i];
        x = T[x].ch[s[i] - 'a'];
    }
    T[x].lens.PB(s.size());
}
 
// initializes lens
void dfs(int u) {
    T[u].lens.insert(T[u].lens.end(), T[T[u].fail].lens.begin(), T[T[u].fail].lens.end());
    for (int v : adj[u])
        dfs(v);
}
 
void build() {
    queue<int> Q;
    int x = 1; 
    T[1].fail = 1;
    for (int i = 0; i < 26; i++) {
        if (T[x].ch[i])
            T[T[x].ch[i]].fail = x, Q.push(T[x].ch[i]);
        else 
            T[x].ch[i] = 1;
    }
    while (!Q.empty()) {
        x = Q.front(); Q.pop();
        for (int i = 0; i < 26; i++) {
            if (T[x].ch[i])
                T[T[x].ch[i]].fail = T[T[x].fail].ch[i], Q.push(T[x].ch[i]);
            else 
                T[x].ch[i] = T[T[x].fail].ch[i];
        }
    }
    for (int i = 2; i <= I; i++)
        adj[T[i].fail].PB(i);
    dfs(1);
}
 
int run(string s) {
    DP[0] = 1;
    for (int i = 1, x = 1; i < s.size(); i++) {
        x = T[x].ch[s[i] - 'a'];
        for (int l : T[x].lens)
            if (l <= i)
                DP[i] = (DP[i] + DP[i - l]) % MOD;
    }
    return DP[s.size() - 1];
}
 
int main() {
    cin >> S >> K;
    for (int i = 0; i < K; i++) {
        string s; cin >> s;
        insert(s);
    }
 
    build();
    cout << run(" " + S) << '\n';
}

2. String Matching

Solution

Code

#include <bits/stdc++.h>
using namespace std;
 
string T, P, S;
int Z[2000005];
 
int main() {
    ios_base::sync_with_stdio(0); cin.tie(0);
 
    cin >> T >> P;
    S = P + "$" + T;
 
    int n = (int)S.size();
    for (int i = 1, l = 0, r = 0; i < n; i++) {
        if (i <= r)
            Z[i] = min(Z[i - l], r - i + 1);
        while (i + Z[i] < n && S[Z[i]] == S[i + Z[i]])
            Z[i]++;
        if (i + Z[i] - 1 > r)
            l = i, r = i + Z[i] - 1;
    }
 
    int ans = 0;
    for (int i = P.size() + 1; i < S.size(); i++)
        if (Z[i] == P.size())
            ans++;
    
    cout << ans << '\n';
}

3. Finding Borders

Solution

Code

#include <bits/stdc++.h>
using namespace std;

const ll R = 9973, MOD = 99999989;

string S;
ll H[1000005], P = 1;
 
int main() {
	cin >> S;
	for (int i = 0; i < S.size(); i++) 
		H[i] = ((i == 0 ? 0 : H[i - 1]) * R + (ll)S[i]) % MOD;
 
	for (int k = 1; k < (int)S.size(); k++) {
		P = (P * R) % MOD;
		ll suf = (H[S.size() - 1] - (P * H[S.size() - k - 1]) % MOD + MOD) % MOD;
		if (H[k - 1] == suf)
			cout << k << ' ';
	}
	cout << '\n';
}

4. Finding Periods

Solution

Code

#include <bits/stdc++.h>
using namespace std;
 
string S;
int Z[1000005];
 
int main() {
    cin >> S;
    
    int n = (int)S.size();
    for (int i = 1, l = 0, r = 0; i < n; i++) {
        if (i <= r)
            Z[i] = min(Z[i - l], r - i + 1);
        while (i + Z[i] < n && S[Z[i]] == S[i + Z[i]])
            Z[i]++;
        if (i + Z[i] - 1 > r)
            l = i, r = i + Z[i] - 1;
    }
 
    for (int i = 0; i < n; i++)
        if (i + Z[i] >= n)
            cout << i << ' ';
    cout << n << '\n';
}

5. Minimal Rotation

Solution

Code

def least_rotation(S: str) -> int:
    """Booth's algorithm."""
    S += S  # Concatenate string to it self to avoid modular arithmetic
    f = [-1] * len(S)  # Failure function
    k = 0  # Least rotation of string found so far
    for j in range(1, len(S)):
        sj = S[j]
        i = f[j - k - 1]
        while i != -1 and sj != S[k + i + 1]:
            if sj < S[k + i + 1]:
                k = j - i - 1
            i = f[i]
        if sj != S[k + i + 1]:  # if sj != S[k+i+1], then i == -1
            if sj < S[k]:  # k+i+1 = k
                k = j
            f[j - k] = -1
        else:
            f[j - k] = i + 1
    return k
 
S = input("")
i = least_rotation(S)
print(S[i:] + S[:i])

6. Longest Palindrome

Solution

Code

#include <bits/stdc++.h>
using namespace std;
 
string S;
int d1[1000005], d2[1000005];
 
int main() {
    cin >> S;
 
    int n = S.size();
    for (int i = 0, l = 0, r = -1; i < n; i++) {
        int k = (i > r) ? 1 : min(d1[l + r - i], r - i + 1);
        while (0 <= i - k && i + k < n && S[i - k] == S[i + k]) {
            k++;
        }
        d1[i] = k--;
        if (i + k > r) {
            l = i - k;
            r = i + k;
        }
    }
 
    for (int i = 0, l = 0, r = -1; i < n; i++) {
        int k = (i > r) ? 0 : min(d2[l + r - i + 1], r - i + 1);
        while (0 <= i - k - 1 && i + k < n && S[i - k - 1] == S[i + k]) {
            k++;
        }
        d2[i] = k--;
        if (i + k > r) {
            l = i - k - 1;
            r = i + k ;
        }
    }
 
    int ans = 0, ind = -1;
    for (int i = 0; i < n; i++) {
        if (d1[i] * 2 - 1 > ans) 
            ans = d1[i] * 2 - 1, ind = i;
        if (d2[i] * 2 > ans)
            ans = d2[i] * 2, ind = i;
    }
    if (ans % 2 == 1)
        cout << S.substr(ind - ans / 2, ans);
    else 
        cout << S.substr(ind - ans / 2, ans);
}

7. Required Substring

Solution

Code

#include <bits/stdc++.h>
#include <iostream>
using namespace std;
 
typedef long long ll;
const int mxN = 1001;
const int mxM = 101;
const int mxK = 26;
const int mod = 1e9+7;
 
int kmp[mxN];
 
void calc(string &s)
{
    kmp[0] = 0;
    for(int i = 1; i < (int)s.size(); i++)
    {
        int trymatch = kmp[i-1];
        while(true)
        {
            if(s[trymatch] == s[i])
            {
                kmp[i] = trymatch+1;
                break;
            }
            else if(trymatch)
            {
                trymatch = kmp[trymatch-1];
            }
            else
            {
                kmp[i] = 0;
                break;
            }
        }
    }
}
 
ll dp[mxN][mxM];
bool done[mxN][mxM];
 
ll solve(int i, int n, int j, string &s)
{
    if(done[i][j]) return dp[i][j];
    done[i][j] = true;
    if(i == n) return dp[i][j] = (j==(int)s.size()?1:0);
    if(j == (int)s.size())
    {
        return dp[i][j] = (mxK*solve(i+1,n,j,s))%mod;
    }
    ll ans = 0;
    int t;
    for(int k = 0; k < mxK; k++)
    {
        t = j;
        while(true)
        {
            if(k == s[t]-'A')
            {
                t++;
                break;
            }
            else if(t)
            {
                t = kmp[t-1];
            }
            else break;
        }
        ans += solve(i+1,n,t,s);
        ans %= mod;
    }
    return dp[i][j] = ans;
}
 
int main()
{
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    int n; cin >> n;
    string s; cin >> s;
    calc(s);
    cout << solve(0, n, 0, s);
}

Solution and Code from toniskrijelj

8. Palindrome Queries

Solution

Code

#include <bits/stdc++.h>
using namespace std;
 
using ll = long long;
const ll HASH =  257, MOD = 1e9 + 7;
 
int N, Q;
ll powH[200005] = {1};
 
namespace forward_hash {
    ll T[400005];
 
    void update(int i, ll v) {
        for (T[i += N] = v; i > 1; i >>= 1)
            T[i >> 1] = (T[i] + T[i ^ 1]) % MOD;
    }
 
    ll query(int l, int r) {
        ll res = 0;
        for (l += N, r += N + 1; l < r; l >>= 1, r >>= 1) {
            if (l & 1) res = (res + T[l++]) % MOD;
            if (r & 1) res = (res + T[--r]) % MOD;
        }
        return res;
    }
}
 
namespace backward_hash {
    ll T[400005];
 
    void update(int i, ll v) {
        for (T[i += N] = v; i > 1; i >>= 1)
            T[i >> 1] = (T[i] + T[i ^ 1]) % MOD;
    }
 
    ll query(int l, int r) {
        ll res = 0;
        for (l += N, r += N + 1; l < r; l >>= 1, r >>= 1) {
            if (l & 1) res = (res + T[l++]) % MOD;
            if (r & 1) res = (res + T[--r]) % MOD;
        }
        return res;
    }
}
 
int main() {
    cin >> N >> Q;
    for (int i = 1; i < N; i++) {
        powH[i] = (powH[i - 1] * HASH) % MOD;
        //cout << powH[i] << " \n"[i == N - 1];
    }
    for (int i = 0; i < N; i++) {
        char c; cin >> c;
        forward_hash::update(i, powH[i] * (ll)c % MOD);
        backward_hash::update(i, powH[N - i - 1] * (ll)c % MOD);
    }
 
    while (Q--) {
        int t; cin >> t;
        if (t == 1) {
            int k; char c;
            cin >> k >> c;
            k--;
            forward_hash::update(k, powH[k] * (ll)c % MOD);
            backward_hash::update(k, powH[N - k - 1] * (ll)c % MOD);
        }
        else if (t == 2) {
            int l, r;
            cin >> l >> r;
            l--, r--;
            ll forward = forward_hash::query(l, r);
            forward = (forward * powH[N - 1 - r]) % MOD;
            ll backward = backward_hash::query(l, r);
            backward = (backward * powH[l]) % MOD;
            if (forward == backward)
                cout << "YES\n";
            else
                cout << "NO\n";
        }
    }
}

9. Finding Patterns

Solution

Code

#include <iostream>
#include <algorithm>
#include <queue>
using namespace std;
 
#define PB push_back
 
string S;
int K, I = 1, ans[500005];
 
vector<int> adj[500005];
 
struct node {
    int fail, ch[26] = {}, cnt = 0;
    vector<int> word;
} T[500005];
 
void insert(string s, int i) {
    int x = 1;
    for (int i = 0; i < s.size(); i++) {
        if (T[x].ch[s[i] - 'a'] == 0)
            T[x].ch[s[i] - 'a'] = ++I;
        x = T[x].ch[s[i] - 'a'];
    }
    T[x].word.PB(i);
}
 
void build() {
    queue<int> Q;
    int x = 1; 
    T[1].fail = 1;
    for (int i = 0; i < 26; i++) {
        if (T[x].ch[i])
            T[T[x].ch[i]].fail = x, Q.push(T[x].ch[i]);
        else 
            T[x].ch[i] = 1;
    }
    while (!Q.empty()) {
        x = Q.front(); Q.pop();
        for (int i = 0; i < 26; i++) {
            if (T[x].ch[i])
                T[T[x].ch[i]].fail = T[T[x].fail].ch[i], Q.push(T[x].ch[i]);
            else 
                T[x].ch[i] = T[T[x].fail].ch[i];
        }
    }
    for (int i = 2; i <= I; i++)
        adj[T[i].fail].PB(i);
}
 
void run(string s) {
    for (int i = 0, x = 1; i < s.size(); i++) {
        x = T[x].ch[s[i] - 'a'];
        T[x].cnt++;
    }
}
 
int dfs(int u) {
    int res = T[u].cnt;
    for (int v : adj[u])
        res += dfs(v);
    for (int w : T[u].word)
        ans[w] = res;
    return res;
}
 
int main() {
    ios_base::sync_with_stdio(0); cin.tie(0);
    
    cin >> S >> K;
    for (int i = 0; i < K; i++) {
        string s; cin >> s;
        insert(s, i);
    }
 
    build();
    run(S);
    dfs(1);
 
    for (int i = 0; i < K; i++)
        cout << (ans[i] ? "YES" : "NO") << '\n';
}

10. Counting Patterns

Solution

Code

#include <iostream>
#include <algorithm>
#include <cassert>
#include <queue>
using namespace std;
 
#define PB push_back
 
string S;
int K, I = 1, ans[500005];
 
vector<int> adj[500005];
 
struct node {
    int fail, ch[26] = {}, cnt = 0;
    vector<int> word;
    string s = "";
} T[500005];
 
void insert(string s, int i) {
    int x = 1;
    for (int i = 0; i < s.size(); i++) {
        if (T[x].ch[s[i] &mdash; 'a'] == 0)
            T[x].ch[s[i] &mdash; 'a'] = ++I;
        T[T[x].ch[s[i] &mdash; 'a']].s = T[x].s + s[i];
        x = T[x].ch[s[i] &mdash; 'a'];
    }
    T[x].word.PB(i);
}
 
void build() {
    queue<int> Q;
    int x = 1; 
    T[1].fail = 1;
    for (int i = 0; i < 26; i++) {
        if (T[x].ch[i])
            T[T[x].ch[i]].fail = x, Q.push(T[x].ch[i]);
        else 
            T[x].ch[i] = 1;
    }
    while (!Q.empty()) {
        x = Q.front(); Q.pop();
        for (int i = 0; i < 26; i++) {
            if (T[x].ch[i])
                T[T[x].ch[i]].fail = T[T[x].fail].ch[i], Q.push(T[x].ch[i]);
            else 
                T[x].ch[i] = T[T[x].fail].ch[i];
        }
    }
    for (int i = 2; i <= I; i++)
        adj[T[i].fail].PB(i);
}
 
void run(string s) {
    for (int i = 0, x = 1; i < s.size(); i++) {
        x = T[x].ch[s[i] &mdash; 'a'];
        // stuff depending on problem
        T[x].cnt++;
    }
}
 
int dfs(int u) {
    int res = T[u].cnt;
    for (int v : adj[u])
        res += dfs(v);
    for (int w : T[u].word)
        ans[w] = res;
    return res;
}
 
int main() {
    cin >> S >> K;
    for (int i = 0; i < K; i++) {
        string s; cin >> s;
        insert(s, i);
    }
 
    build();
    run(S);
    dfs(1);
 
    for (int i = 0; i < K; i++)
        cout << ans[i] << '\n';
}

11. Pattern Positions

Solution

Code

#include <iostream>
#include <algorithm>
#include <vector>
#include <unordered_map>
#include <ext/pb_ds/assoc_container.hpp>
using namespace std;
using namespace __gnu_pbds;
 
#define PB push_back
#define EB emplace_back
 
template<class K, class V> 
using fast_map = gp_hash_table<K, V>;
 
struct node {
    fast_map<char, int> adj, adv;
    vector<int> w;
    int p = -1, suf = -1, ext = -1; 
    char c;
 
    node(int p = -1, char c = '$') : p(p), c(c) {}
};
 
vector<node> T(1);
 
void insert(string s, int id) {
    int t = 0;
    for (char c : s) {
        if (T[t].adj.find(c) == T[t].adj.end()) {
            T[t].adj[c] = T.size();
            T.EB(t, c);
        }
        t = T[t].adj[c];
    }
    T[t].w.PB(id);
}
 
int advance(int t, char c);
 
int suffix(int t) {
    if (T[t].suf == -1) {
        if (t == 0 || T[t].p == 0)
            T[t].suf = 0;
        else {
            T[t].suf = advance(suffix(T[t].p), T[t].c);
        }
    }
    return T[t].suf;
}
 
int advance(int t, char c) {
    if (T[t].adv.find(c) == T[t].adv.end()) {
        if (T[t].adj.find(c) != T[t].adj.end())
            T[t].adv[c] = T[t].adj[c];
        else
            T[t].adv[c] = t == 0 ? 0 : advance(suffix(t), c);
    }
    return T[t].adv[c];
}
 
int K, ans[500005];
string S, W[500005];
 
int jump(int t) {
    if (T[t].ext == -1) {
        int tt = t;
        do {
            tt = suffix(tt);
        } while (T[tt].w.empty() && tt != 0);
        T[t].ext = tt;
    }
    return T[t].ext;
}
 
void answer(int t, int cnt) {
    while (t != 0) {
        for (int u : T[t].w)
            ans[u] = cnt;
        T[t].w.clear();
        t = jump(t);
    }
}
 
 
int main() {
    ios_base::sync_with_stdio(0); cin.tie(0);
    
    getline(cin, S);
    cin >> K; 
    string ddd; getline(cin, ddd);
    for (int i = 0; i < K; i++) {
        getline(cin, W[i]);
        insert(W[i], i);
    }
 
 
    int t = 0, cnt = 1;
    for (char c : S) {
        t = advance(t, c);
        answer(t, cnt++);
    }
 
    for (int i = 0; i < K; i++)
        cout << (ans[i] ? to_string(ans[i] - W[i].size() + 1) : "-1") << '\n';
}

12. Distinct Substring

Solution

Code

#include <bits/stdc++.h>
using namespace std;

struct SuffixAuto {
    struct State {
        int len, link;
        int next[26];
        State(int _len = 0, int _link = -1) : len(_len), link(_link) {
            memset(next, -1, sizeof(next));
        }
    };
 
    vector<State> states;
    int last;
 
    SuffixAuto() {}
 
    SuffixAuto(const string &S) {
        init(S);
    }
 
    inline int state(int len = 0, int link = -1) {
        states.emplace_back(len, link);
        return states.size() - 1;
    }
 
    void init(const string &S) {
        states.reserve(2 * S.size());
        last = state();
        for (char c : S)
            extend(c);
    }
 
    void extend(char _c) {
        int c = _c - 'a'; // change for different alphabet
        int cur = state(states[last].len + 1), P = last; 
        while (P != -1 && states[P].next[c] == -1) {
            states[P].next[c] = cur;
            P = states[P].link;
        }
        if (P == -1)
            states[cur].link = 0;
        else {
            int Q = states[P].next[c];
            if (states[P].len + 1 == states[Q].len)
                states[cur].link = Q;
            else {
                int C = state(states[P].len + 1, states[Q].link);
                copy(states[Q].next, states[Q].next + 26, states[C].next);
                while (P != -1 && states[P].next[c] == Q) {
                    states[P].next[c] = C;
                    P = states[P].link;
                }
                states[Q].link = states[cur].link = C;
            }
        }
        last = cur;
    }
};
 
string S;
SuffixAuto sa;
ll dp[200005];
 
ll dfs(int u) {
    if (dp[u] > 0)
        return dp[u];
    for (int i = 0; i < 26; i++)
        if (sa.states[u].next[i] != -1) 
            dp[u] += dfs(sa.states[u].next[i]);
    return dp[u] += 1;
}
 
 
int main() {
    ios_base::sync_with_stdio(0); cin.tie(0);
 
    cin >> S;
    sa.init(S);
    cout << dfs(0) - 1 << '\n';
}

13. Repeating Substring

Solution

Code

#include <bits/stdc++.h>
using namespace std;
 
vector<int> buildSuffixArray(const string& S) {
    int n = S.size();
    const int alphabet = 256;
    vector<int> p(n), c(n), cnt(max(alphabet, n), 0);
    for (int i = 0; i < n; i++)
        cnt[S[i]]++;
    for (int i = 1; i < alphabet; i++)
        cnt[i] += cnt[i-1];
    for (int i = 0; i < n; i++)
        p[--cnt[S[i]]] = i;
    c[p[0]] = 0;
    int classes = 1;
    for (int i = 1; i < n; i++) {
        if (S[p[i]] != S[p[i-1]])
            classes++;
        c[p[i]] = classes - 1;
    }
    vector<int> pn(n), cn(n);
    for (int h = 0; (1 << h) < n; ++h) {
        for (int i = 0; i < n; i++) {
            pn[i] = p[i] - (1 << h);
            if (pn[i] < 0)
                pn[i] += n;
        }
        fill(cnt.begin(), cnt.begin() + classes, 0);
        for (int i = 0; i < n; i++)
            cnt[c[pn[i]]]++;
        for (int i = 1; i < classes; i++)
            cnt[i] += cnt[i-1];
        for (int i = n-1; i >= 0; i--)
            p[--cnt[c[pn[i]]]] = pn[i];
        cn[p[0]] = 0;
        classes = 1;
        for (int i = 1; i < n; i++) {
            pair<int, int> cur = {c[p[i]], c[(p[i] + (1 << h)) % n]};
            pair<int, int> prev = {c[p[i-1]], c[(p[i-1] + (1 << h)) % n]};
            if (cur != prev)
                ++classes;
            cn[p[i]] = classes - 1;
        }
        c.swap(cn);
    }
    return p;
}
 
vector<int> buildLCP(const string& S, const vector<int>& P) {
    int N = S.size();
    vector<int> R (N, 0);
    for (int i = 0; i < N; i++)
        R[P[i]] = i;
    
    int k = 0;
    vector<int> lcp(N - 1, 0);
    for (int i = 0; i < N; i++) {
        if (R[i] == N - 1) {
            k = 0;
            continue;
        }
        int j = P[R[i] + 1];
        while (i + k < N && j + k < N && S[i + k] == S[j + k])
            k++;
        lcp[R[i]] = k;
        if (k) k--;
    }
    return lcp;
}
 
string S;
 
int main() {
    ios_base::sync_with_stdio(0); cin.tie(0);
 
    cin >> S;
    vector<int> suf = buildSuffixArray(S), lcp = buildLCP(S, suf);
    int ans = 0;
    for (int i = 0; i < S.size() - 1; i++)
        ans = max(ans, lcp[i]);
    if (ans == 0) {
        cout << -1 << '\n';
        return 0;
    }
    for (int i = 0; i < S.size() - 1; i++)
        if (ans == lcp[i]) {
            cout << S.substr(suf[i], lcp[i]) << '\n';
            return 0;
        }
}

14. String Functions

Solution

Code

#include <bits/stdc++.h>
using namespace std;

namespace str {
    vector<int> pi(const string &s) {
        int n = (int)s.length();
        vector<int> _pi(n);
        for (int i = 1, j; i < n; i++) {
            for (j = _pi[i - 1]; j > 0 && s[j] != s[i]; j = _pi[j - 1]);
            if (s[i] == s[j])
                j++;
            _pi[i] = j;
        }
        return _pi;
    }
 
    vector<int> z(const string &s) {
        int n = (int)s.size();
        vector<int> _z(n);
        for (int i = 1, l = 0, r = 0; i < n; i++) {
            if (i <= r)
                _z[i] = min(_z[i - l], r - i + 1);
            while (i + _z[i] < n && s[_z[i]] == s[i + _z[i]])
                _z[i]++;
            if (i + _z[i] - 1 > r)
                l = i, r = i + _z[i] - 1;
        }
        return _z;
    }
}

string S;
 
int main() { 
    cin >> S;
    vector<int> z = str::z(S);
    for (int x : z)
        cout << x << ' ';
    cout << '\n';
    vector<int> pi = str::pi(S);
    for (int x : pi)
        cout << x << ' ';
    cout << '\n';
}

15. Substring Order I

Solution

Code

#include <bits/stdc++.h>
using namespace std;

struct SuffixAuto {
    struct State {
        int len, link;
        int next[26];
        State(int _len = 0, int _link = -1) : len(_len), link(_link) {
            memset(next, -1, sizeof(next));
        }
    };
 
    vector<State> states;
    int last;
 
    SuffixAuto() {}
 
    SuffixAuto(const string &S) {
        init(S);
    }
 
    inline int state(int len = 0, int link = -1) {
        states.emplace_back(len, link);
        return states.size() - 1;
    }
 
    void init(const string &S) {
        states.reserve(2 * S.size());
        last = state();
        for (char c : S)
            extend(c);
    }
 
    void extend(char _c) {
        int c = _c - 'a'; // change for different alphabet
        int cur = state(states[last].len + 1), P = last; 
        while (P != -1 && states[P].next[c] == -1) {
            states[P].next[c] = cur;
            P = states[P].link;
        }
        if (P == -1)
            states[cur].link = 0;
        else {
            int Q = states[P].next[c];
            if (states[P].len + 1 == states[Q].len)
                states[cur].link = Q;
            else {
                int C = state(states[P].len + 1, states[Q].link);
                copy(states[Q].next, states[Q].next + 26, states[C].next);
                while (P != -1 && states[P].next[c] == Q) {
                    states[P].next[c] = C;
                    P = states[P].link;
                }
                states[Q].link = states[cur].link = C;
            }
        }
        last = cur;
    }
};
 
string S;
SuffixAuto sa;
ll N, dp[200005];
 
ll dfs_dp(int u = 0) {
    if (dp[u] > 0)
        return dp[u];
    for (int i = 0; i < 26; i++)
        if (sa.states[u].next[i] != -1) 
            dp[u] += dfs_dp(sa.states[u].next[i]);
    return dp[u] += 1;
}
 
string p = "";
 
void dfs(int u = 0) {
    if (N == 0) 
        return;
    else 
        N--;
    for (int i = 0; i < 26; i++) {
        int v = sa.states[u].next[i];
        if (v != -1) {
            if (N >= dp[v]) {
                N -= dp[v];
            }
            else {
                p += (char)(i + 'a');
                return dfs(v);
            }
        }
    }
}
 
int main() {
    ios_base::sync_with_stdio(0); cin.tie(0);
 
    cin >> S >> N;
    sa.init(S);
    dfs_dp();
    dfs();
    cout << p << '\n';
}

16. Substring Order II

Solution

Code

#include <bits/stdc++.h>
using namespace std;

string S;
ll N, cnt[200005], dp[200005];
 
struct SuffixAuto {
    struct State {
        int len, link;
        int next[26];
        State(int _len = 0, int _link = -1) : len(_len), link(_link) {
            memset(next, -1, sizeof(next));
        }
    };
 
    vector<State> states;
    int last;
 
    SuffixAuto() {}
 
    SuffixAuto(const string &S) {
        init(S);
    }
 
    inline int state(int len = 0, int link = -1) {
        states.emplace_back(len, link);
        return states.size() - 1;
    }
 
    void init(const string &S) {
        states.reserve(2 * S.size());
        last = state();
        for (char c : S)
            extend(c);
    }
 
    void extend(char _c) {
        int c = _c - 'a'; // change for different alphabet
        int cur = state(states[last].len + 1), P = last; 
        cnt[cur] = 1;
        while (P != -1 && states[P].next[c] == -1) {
            states[P].next[c] = cur;
            P = states[P].link;
        }
        if (P == -1)
            states[cur].link = 0;
        else {
            int Q = states[P].next[c];
            if (states[P].len + 1 == states[Q].len)
                states[cur].link = Q;
            else {
                int C = state(states[P].len + 1, states[Q].link);
                cnt[C] = 0;
                copy(states[Q].next, states[Q].next + 26, states[C].next);
                while (P != -1 && states[P].next[c] == Q) {
                    states[P].next[c] = C;
                    P = states[P].link;
                }
                states[Q].link = states[cur].link = C;
            }
        }
        last = cur;
    }
 
    void debug(int u = 0, string s = "") {
        cout << "state " << u << " = " << s << " ->";
        for (int i = 0; i < 26; i++) 
            if (states[u].next[i] != -1)
                cout << ' ' << states[u].next[i];
        cout << '\n';
        for (int i = 0; i < 26; i++) 
            if (states[u].next[i] != -1)
                debug(states[u].next[i], s + (char)(i + 'a'));
    }
} sa;
 
ll dfs_dp(int u = 0) {
    if (dp[u] != 0)
        return dp[u];
    for (int i = 0; i < 26; i++) {
        int v = sa.states[u].next[i];
        if (v != -1) 
            dp[u] += dfs_dp(v);
    }
    return dp[u] += cnt[u];
}
 
string p = "";
 
void dfs(int u = 0) {
    if (N < cnt[u])
        return;
    else 
        N -= cnt[u];
    for (int i = 0; i < 26; i++) {
        int v = sa.states[u].next[i];
        if (v != -1) {
            if (N >= dp[v])
                N -= dp[v];
            else {
                p += (char)(i + 'a');
                return dfs(v);
            }
        }
    }
}
 
int main() {
    cin >> S >> N;
    sa.init(S);
 
    vector<pii> v(sa.states.size());
    for (int i = 0; i < sa.states.size(); i++)
        v[i] = {sa.states[i].len, i};
    sort(v.begin(), v.end(), greater<pii>());
    for (auto [len, id] : v)
        if (sa.states[id].link != -1)
            cnt[sa.states[id].link] += cnt[id];
    cnt[0] = 1;
 
    dfs_dp();
 
    dfs();
 
    cout << p << '\n';
}

17. Substring Distribution

Solution

Code

#include <bits/stdc++.h>
using namespace std;

struct SuffixAuto {
    struct State {
        int len, link;
        int next[26];
        State(int _len = 0, int _link = -1) : len(_len), link(_link) {
            memset(next, -1, sizeof(next));
        }
    };
 
    vector<State> states;
    int last;
 
    SuffixAuto() {}
 
    SuffixAuto(const string &S) {
        init(S);
    }
 
    inline int state(int len = 0, int link = -1) {
        states.emplace_back(len, link);
        return states.size() - 1;
    }
 
    void init(const string &S) {
        states.reserve(2 * S.size());
        last = state();
        for (char c : S)
            extend(c);
    }
 
    void extend(char _c) {
        int c = _c - 'a'; // change for different alphabet
        int cur = state(states[last].len + 1), P = last; 
        while (P != -1 && states[P].next[c] == -1) {
            states[P].next[c] = cur;
            P = states[P].link;
        }
        if (P == -1)
            states[cur].link = 0;
        else {
            int Q = states[P].next[c];
            if (states[P].len + 1 == states[Q].len)
                states[cur].link = Q;
            else {
                int C = state(states[P].len + 1, states[Q].link);
                copy(states[Q].next, states[Q].next + 26, states[C].next);
                while (P != -1 && states[P].next[c] == Q) {
                    states[P].next[c] = C;
                    P = states[P].link;
                }
                states[Q].link = states[cur].link = C;
            }
        }
        last = cur;
    }
 
    void debug(int u = 0, string s = "") {
        cout << "state " << u << " = " << s << " ->";
        for (int i = 0; i < 26; i++) 
            if (states[u].next[i] != -1)
                cout << ' ' << states[u].next[i];
        cout << '\n';
        for (int i = 0; i < 26; i++) 
            if (states[u].next[i] != -1)
                debug(states[u].next[i], s + (char)(i + 'a'));
    }
};
 
string S;
SuffixAuto sa;
int lb[200005], ans[100005];
 
int main() {
    cin >> S;
    sa.init(S);
    
    queue<int> q;
    memset(lb, -1, sizeof(lb));
 
    q.push(0);
    lb[0] = 0;
    while (!q.empty()) {
        int u = q.front(); q.pop();
        int d = lb[u];
        for (int i = 0; i < 26; i++) {
            int v = sa.states[u].next[i];
            if (v != -1 && lb[v] == -1) {
                lb[v] = d + 1;
                q.push(v);
            }
        }
    }
 
    for (int i = 1; i < sa.states.size(); i++) 
        ans[lb[i]]++, ans[sa.states[i].len + 1]--;
    
    ans[0] = 0;
    for (int i = 1; i <= S.size(); i++) {
        ans[i] += ans[i - 1];
        cout << ans[i] << ' ';
    }
    cout << '\n';
}

Comments (28)

Show archived | Write comment?

willy108

3 years ago, # |

← Rev. 2 →

+16

I thought I ought to mention that Substring Order 1, Substring Order 2, and Substring Distribution all have a suffix array + lcp solution as well.

Substring Order 1

Consider the solution for counting all the unique substrings, you take the sum of all lcp[i]. In the same manner you can loop over all the unique substrings in lexicographical order. Consider the suffix array, if we look at all substrings that start at the index of the start of the suffix (or sa[i]), the ones that have already be counted are the substrings in the longest common prefix of the current suffix and the suffix before it. This is by the definition of lcp[i]. So you can do pie and get that the amount of unique substrings sa[i] contributes is n — sa[i] — lcp[i] (with some off by 1's of course). A scuffed implementation can be found here.

Substring Order 2

This solution revolves around the fact that the sum of lcp[i] is linear. There are two cases to consider: 1) the answer substring appears more than once in the string, 2) the answer substring does not. For case 2, we can refer to the solution for Substring Order 1, but for case 1, its a little harder. So for some sa[i] we have all the substrings that start at sa[i] and have length <= lcp[i]. Let's try to think about how many times exactly each of these strings appear. Since we have lcp(i, h) >= lcp(i, j) for any i < h < j (and lcp returns the length of the longest common prefix of the two suffixes), we can see that binary search can be done on how many times this substring appears. We start at index sa[i] and binary search on the last time lcp(sa[i], sa[j]) >= the length of the substring we are looking at. What is the complexity of such an algorithm? It is actually still O(nlogn). To avoid double counting, we should only count the contribution of a string if and only if lcp[i] >= lcp[i-1] (since that means its the first time the substring is lexicographically smallest). For the rest of the substrings, we can use the n — sa[i] — lcp[i] trick from Substring Order 1. I am not even going to try and say this implementation is good, but here it is.

Substring Distribution

This problem again relies on the fact that the contribution of unique strings from some suffix sa[i] is n — sa[i] — lcp[i]. So for some given suffix it will contain unique substrings of lengths lcp[i]+1 to sa[i]. Using the psum[l]++, psum[r]-- trick, we have an algorithm with linear complexity with the suffix array construction aside. A scuffed implementation can be found here.

→ Reply

nitin12384

2 years ago, # ^ |

+13

My implementation of your solution of Substring Order II : https://cses.fi/paste/1a1adab9babc8ef84a7c56/
I guess it might be more readable and easy to understand. (I wasn't able to understand your approach, until I did a dry run on your code. So I am putting my implementation which might be easier to understand)

Tspectre

7 months ago, # ^ |

I did a similar thing but I got TLE for the last case. Can anyone help me optimize this code more? (Substring Order II)


#include<bits/stdc++.h>
using namespace std;
#define ll long long
 
void radix_sort(vector<int> &p, vector<int> &c) {
  int n = p.size();
 
  vector<int> cnt(n);
  for (auto x : c) {
    cnt[x]++;
  }
 
  vector<int> p_new(n);
  vector<int> pos(n);
  pos[0] = 0;
  for (int i = 1; i < n; i++) {
    pos[i] = pos[i - 1] + cnt[i - 1];
  }
 
  for (auto x : p) {
    int i = c[x];
    p_new[pos[i]] = x;
    pos[i]++;
  }
  p = p_new;
}
struct ST {
    // 0-base indexing
  int n, logN;
  vector<vector<int>> st;
  vector<int> lg;
 
  void init(const vector<int>& array) {
    n = array.size();
    logN = ceil(log2(n));
    st.resize(logN, vector<int>(n));
    lg.resize(n + 1);
 
    lg[1] = 0;
    for (int i = 2; i <= n; i++)
      lg[i] = lg[i / 2] + 1;
 
    copy(array.begin(), array.end(), st[0].begin());
 
    for (int i = 1; i < logN; i++) {
      for (int j = 0; j + (1 << i) <= n; j++) {
        st[i][j] = min(st[i - 1][j], st[i - 1][j + (1 << (i - 1))]);
      }
    }
  }
  int query(int L, int R) {
    int i = lg[R - L + 1];
    return min(st[i][L], st[i][R - (1 << i) + 1]);
  }
} ST;
 
void SA() {
  string s;
  cin >> s;
  ll k_th;
  cin >> k_th;
  s += "$";
  int n = s.size();
  vector<int> p(n), c(n);
 
    // k = 0
  vector<pair<char, int>> a(n);
  for (int i = 0; i < n; i++) a[i] = {s[i], i};
    sort(a.begin(), a.end());
  for (int i = 0; i < n; i++) p[i] = a[i].second;
    c[p[0]] = 0;
  for (int i = 1; i < n; i++) {
    if (a[i].first == a[i - 1].first) {
      c[p[i]] = c[p[i - 1]];
    } else {
      c[p[i]] = c[p[i - 1]] + 1;
    }
  }
 
  int k = 0;
  while ((1 << k) < n) {
        // k -> k + 1
    for (int i = 0; i < n; i++) {
      p[i] = (p[i] - (1 << k) + n) % n;
    }
    radix_sort(p, c);
    vector<int> c_new(n);
    c_new[p[0]] = 0;
    for (int i = 1; i < n; i++) {
      pair<int, int> prev = {c[p[i-1]], c[(p[i-1] + (1 << k)) % n]};
      pair<int, int> now = {c[p[i]], c[(p[i] + (1 << k)) % n]};
      if (prev == now) {
        c_new[p[i]] = c_new[p[i - 1]];
      } else {
        c_new[p[i]] = c_new[p[i - 1]] + 1;
      }
    }
    c = c_new;
    k++;
  }
  vector<int> lcp(n);
  k = 0;
  for(int i = 0; i < n-1; i++) {
    int pi = c[i];
    int j = p[pi-1];
    while(s[i+k] == s[j+k])k++;
    lcp[pi] = k;
    k = max(k-1, 0);
  }
  ST.init(lcp);
  ll ans = 0;
 
  for(int i = 1; i < n; i++) {
    int st = lcp[i]+1, en = -1;
    if(i < n-1)en = lcp[i+1];
    for(int j = st; j <= en; j++) {
        int lo = i+1, hi = n-1;
        ll t = -1;
        while(lo <= hi) {
            int mid = lo + (hi-lo)/2;
            if(ST.query(i+1, mid) >= j) {
                t = mid;
                lo = mid + 1;
            }else hi = mid - 1;
        }
        ll count = t - i + 1;
        if(ans + count >= k_th) {
            cout << s.substr(p[i], j) << "\n";
            return ;
        }else ans+=count;
    }
    ll used = max(en - st + 1, 0);
    if(ans+n-p[i]-1-lcp[i] - used >= k_th) {
      ll rem = k_th - ans;
      rem+=lcp[i]+used;
      cout << s.substr(p[i], rem) << "\n";
      break;
    }else ans+=(n-p[i]-1-lcp[i]-used);
  }
}
int main() {
  ios::sync_with_stdio(false);
  cin.tie(0);
  int tt;
  tt = 1;
    // cin >> tt;
  while(tt--) {
    SA();
  }
  return 0;
}

Well, I got accepted using a different technique. It is similar to Substring Order I, but instead of traversing the suffix array from the first, now it will traverse the suffix array from the last and search for the (Total substring — k-th + 1)-th smallest from the last.

Prerequisite:

Range Updates and Sums


// https://cses.fi/problemset/task/1735/
#include<bits/stdc++.h>
using namespace std;
#define ll long long
 
struct Segtree {
    // 0 base indexing
    int n;
    vector<pair<ll, ll>> tree;
    vector<bool> mark;
 
    void push(int node) {
        tree[(node << 1)+1].first = tree[(node << 1)+2].first = tree[node].first/2;
        tree[(node << 1)+1].second = tree[(node << 1)+2].second = 0;
        mark[(node << 1)+1] = mark[(node << 1)+2] = true;
        mark[node] = false;
    }
    ll assign(int i, int j, ll value, int node, int l, int r) {
        if(l > j || r < i)return tree[node].first + tree[node].second*(r-l+1);
        if(l >= i && r <= j) {
            tree[node].first = (r - l + 1)*value;
            tree[node].second = 0;
            mark[node] = true;
            return tree[node].first;
        }
        if(mark[node])push(node);
 
        tree[(node << 1)+1].second += tree[node].second, tree[(node << 1)+2].second += tree[node].second;
        tree[node].second = 0;
 
        int mid = l + (r - l)/2;
        return tree[node].first = assign(i, j, value, (node << 1)+1, l, mid) + assign(i, j, value, (node << 1)+2, mid+1, r);
    }
    void assign(int i, int j, ll value) {
        assign(i, j, value, 0, 0, n-1);
    }
    void update(int i, int j, ll value, int node, int l, int r) {
        if(r < i || l > j)return;
        if(l >= i && r <= j) {
            tree[node].second += value;
            return;
        }
        if(mark[node])push(node);
 
        int len = min(r, j) - max(l, i) + 1;
        tree[node].first += len*value;
 
        int mid = l + (r - l)/2;
        update(i, j, value, (node << 1)+1, l, mid), update(i, j, value, (node << 1)+2, mid+1, r);
    }
    void update(int i, int j, ll value) {
        update(i, j, value, 0, 0, n-1);
    }
    ll query(int i, int j, int node, int l, int r) {
        if(r < i || l > j)return 0;
        if(l >= i && r <= j) {
            return tree[node].first + tree[node].second*(r - l + 1);
        }
        if(mark[node])push(node);
 
        tree[node].first+=tree[node].second*(r - l + 1);
        tree[(node << 1)+1].second += tree[node].second;
        tree[(node << 1)+2].second += tree[node].second;
        tree[node].second = 0;
 
        int mid = l + (r - l)/2;
        return query(i, j, (node << 1)+1, l, mid) + query(i, j, (node << 1)+2, mid+1, r);
    }
    ll query(int i, int j) {
        return query(i, j, 0, 0, n-1);
    }
    void init(vector<ll> &a, int _n) {
        n = _n;
        int size = 1;
        while(size < n) size = size << 1;
        tree.assign((size << 1)-1, {0, 0});
        mark.assign((size << 1)-1, false);
        for(int i = 0; i < (int)a.size(); i++) {
            update(i, i, a[i]);
        }
    }
} st;
 
int main() {
    ios::sync_with_stdio(false);
    cin.tie(0);
    int tt;
    tt = 1;
    while(tt--) {
        int n, q;
        cin >> n >> q;
        vector<ll> a(n);
        for(int i = 0; i < n; i++) {
            cin >> a[i];
        }
        n = (1 << (int)ceil(log2(n)));
        st.init(a, n);
        while(q--) {
            int type, a, b;
            ll x;
            cin >> type;
            if(type == 1) {
                cin >> a >> b >> x;
                st.update(--a, --b, x);
            }else if(type == 2) {
                cin >> a >> b >> x;
                st.assign(--a, --b, x);
            }else {
                cin >> a >> b;
                cout << st.query(--a, --b) << "\n";
            }
        }
    }
    return 0;
}

Solution:

Substring Order II


// https://cses.fi/problemset/task/2109/
#include<bits/stdc++.h>
using namespace std;
#define ll long long
 
struct Segtree {
    // 0-base indexing
    int n;
    vector<pair<ll, ll>> tree;
    vector<bool> mark;
 
    void push(int node) {
        tree[(node << 1) + 1].first = tree[(node << 1) + 2].first = tree[node].first / 2;
        tree[(node << 1) + 1].second = tree[(node << 1) + 2].second = 0;
        mark[(node << 1) + 1] = mark[(node << 1) + 2] = true;
        mark[node] = false;
    }
 
    ll assign(int i, int j, ll value, int node, int l, int r) {
        if (l > j || r < i) return tree[node].first + tree[node].second * (r - l + 1);
        if (l >= i && r <= j) {
            tree[node].first = (r - l + 1) * value;
            tree[node].second = 0;
            mark[node] = true;
            return tree[node].first;
        }
        if (mark[node]) push(node);
 
        tree[(node << 1) + 1].second += tree[node].second;
        tree[(node << 1) + 2].second += tree[node].second;
        tree[node].second = 0;
 
        int mid = l + (r - l) / 2;
        return tree[node].first = assign(i, j, value, (node << 1) + 1, l, mid) + assign(i, j, value, (node << 1) + 2, mid + 1, r);
    }
 
    void assign(int i, int j, ll value) {
        assign(i, j, value, 0, 0, n - 1);
    }
 
    void update(int i, int j, ll value, int node, int l, int r) {
        if (r < i || l > j) return;
        if (l >= i && r <= j) {
            tree[node].second += value;
            return;
        }
        if (mark[node]) push(node);
 
        int len = min(r, j) - max(l, i) + 1;
        tree[node].first += len * value;
 
        int mid = l + (r - l) / 2;
        update(i, j, value, (node << 1) + 1, l, mid);
        update(i, j, value, (node << 1) + 2, mid + 1, r);
    }
 
    void update(int i, int j, ll value) {
        update(i, j, value, 0, 0, n - 1);
    }
 
    ll query(int i, int j, int node, int l, int r) {
        if (r < i || l > j) return 0;
        if (l >= i && r <= j) {
            return tree[node].first + tree[node].second * (r - l + 1);
        }
        if (mark[node]) push(node);
 
        tree[node].first += tree[node].second * (r - l + 1);
        tree[(node << 1) + 1].second += tree[node].second;
        tree[(node << 1) + 2].second += tree[node].second;
        tree[node].second = 0;
 
        int mid = l + (r - l) / 2;
        return query(i, j, (node << 1) + 1, l, mid) + query(i, j, (node << 1) + 2, mid + 1, r);
    }
 
    ll query(int i, int j) {
        return query(i, j, 0, 0, n - 1);
    }
 
    void init(vector<ll> &a, int _n) {
        n = _n;
        int size = 1;
        while (size < n) size = size << 1;
        tree.assign((size << 1) - 1, {0, 0});
        mark.assign((size << 1) - 1, false);
        for (int i = 0; i < (int)a.size(); i++) {
            update(i, i, a[i]);
        }
    }
} St;
 
void radix_sort(vector<int> &sa, vector<int> &c) {
    int n = sa.size();
 
    vector<int> cnt(n);
    for (auto x : c) {
        cnt[x]++;
    }
    vector<int> sa_new(n);
    vector<int> pos(n);
    pos[0] = 0;
    for (int i = 1; i < n; i++) {
        pos[i] = pos[i - 1] + cnt[i - 1];
    }
    for (auto x : sa) {
        int i = c[x];
        sa_new[pos[i]] = x;
        pos[i]++;
    }
    sa = sa_new;
}
 
void SA() {
    string s;
    cin >> s;
    ll k_th;
    cin >> k_th;
    s += "$";
    int n = s.size();
    vector<int> sa(n), c(n);
 
    // k = 0
    vector<pair<char, int>> a(n);
    for (int i = 0; i < n; i++) a[i] = {s[i], i};
    sort(a.begin(), a.end());
    for (int i = 0; i < n; i++) sa[i] = a[i].second;
    c[sa[0]] = 0;
    for (int i = 1; i < n; i++) {
        if (a[i].first == a[i - 1].first) {
            c[sa[i]] = c[sa[i - 1]];
        } else {
            c[sa[i]] = c[sa[i - 1]] + 1;
        }
    }
 
    int k = 0;
    while ((1 << k) < n) {
        // k -> k + 1
        for (int i = 0; i < n; i++) {
            sa[i] = (sa[i] - (1 << k) + n) % n;
        }
        radix_sort(sa, c);
        vector<int> c_new(n);
        c_new[sa[0]] = 0;
        for (int i = 1; i < n; i++) {
            pair<int, int> prev = {c[sa[i - 1]], c[(sa[i - 1] + (1 << k)) % n]};
            pair<int, int> now = {c[sa[i]], c[(sa[i] + (1 << k)) % n]};
            if (prev == now) {
                c_new[sa[i]] = c_new[sa[i - 1]];
            } else {
                c_new[sa[i]] = c_new[sa[i - 1]] + 1;
            }
        }
        c = c_new;
        k++;
    }
    vector<int> lcp(n);
    k = 0;
    for (int i = 0; i < n - 1; i++) {
        int pi = c[i];
        int j = sa[pi - 1];
        while (s[i + k] == s[j + k]) k++;
        lcp[pi] = k;
        k = max(k - 1, 0);
    }
    vector<ll> sum(n, 0);
    St.init(sum, n);
    ll count = 0;
    k_th = n * 1LL * (n - 1) / 2 - k_th + 1;
 
    for (int i = n - 1; i >= 0; i--) {
        int lo = lcp[i] + 1, hi = n - sa[i] - 1;
        St.update(1, hi, 1);
        int ans = -1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            if (count + St.query(mid, n - sa[i] - 1) >= k_th) {
                ans = mid;
                lo = mid + 1;
            } else hi = mid - 1;
        }
        if (ans != -1) {
            cout << s.substr(sa[i], ans) << "\n";
            return;
        }
        count += St.query(lcp[i] + 1, n - sa[i] - 1);
        St.assign(lcp[i] + 1, n - sa[i] - 1, 0);
    }
}
 
int main() {
    ios::sync_with_stdio(false);
    cin.tie(0);
    int tt;
    tt = 1;
    // cin >> tt;
    while (tt--) {
        SA();
    }
    return 0;
}

Ber

The idea that the sum max(lcp[i] — lcp[i-1], 0) is linear that you present in the substring order II is wrong. Kasai algorithm for building the lcp array is linear, but it acts on suffixes in decreasing order of size while your algorithm acts on suffixes in lexicographical order. I coded the solution that you described and got AC on CSES, but than I hacked both our solutions with a random string concatenated with itself, which achieves the worst time complexity of O(n²log(n)) (n² different substrings that appear at least twice and we do a binary search on the lcp sparse table for each one).

dutin

Auto comment: topic has been updated by dutin (previous revision, new revision, compare).

11_Nguyen_Thai_Son

I need solution problem special string. Can you help me?

misir

3 years ago, # ^ |

https://github.com/Jonathan-Uy/CSES-Solutions

Thanks you so much. But it only have code, you can give me tutorial for this problem

This problem can be seen as an extension of prefix sums. Run a prefix sum on the string, keeping track of an array of length 26 which is the number of occurrences of each letter. If the current prefix contains all letters in the string and each letter occurs an equal number of times, it is a valid substring. Because each substring can be represented as a differences of prefixes, we find the number of previous prefixes where the excess elements are in excess the exact same amount.

For instance, at the current prefix, we have: a: 4 b: 3 c: 2 d: 2 e: 2 ...

We have to find the number of previous prefixes where a is +2 and b is +1. a: +2 b: +1 c: 0 d: 0 e: 0 ...

This is where having a map comes in handy. The mentioned solution uses a map of vectors, although an array<int, 26> would probably work better.

Jonathan_Uy

toniskrijelj

7. Required Substring:

Code:

Thanks for the solution! Do you mind if I attribute this to you and move this up to the main post?

You can post it, of course :)

too_rusty

All of those are solvable using hashing, except the first one, which has a simpler solution using Trie.

Short

A simpler solution to the first problem.

#include <bits/stdc++.h>

using namespace std;

const int N = (int) 1e6 + 9;
const int mod = (int) 1e9 + 7;

int T[N][26], id = 1, dp[N];
bitset<N> endT;

int main() {
	string s;
	int k;
	cin >> s >> k;
	int n = s.length();
	for (int i = 0; i < k; i++) {
		string t;
		cin >> t;
		int cr = 0;
		for (char& c : t) {
			c -= 'a';
			if (!T[cr][c]) T[cr][c] = id++;
			cr = T[cr][c];
		}
		endT[cr] = 1;
	}
	dp[n] = 1;
	for (int i = n - 1; i >= 0; i--) {
		s[i] -= 'a';
		int cr = 0;
		for (int j = i; j < n && T[cr][s[j]]; j++) {
			cr = T[cr][s[j]];
			if (endT[cr]) {
				dp[i] = (dp[i] + dp[j + 1]) % mod;
			}
		}
	}
	cout << dp[0] << '\n';
	return 0;
}

P.S. Maybe except the 7the one also, I'm not sure.

catsareliquid

How to solve Finding Periods with hashing?

+11

Using polynomial hashes we can get the hash of any substring in $$$\mathcal{O}(1)$$$ time. We can naively check each prefix, and a prefix of length $$$k$$$ takes time $$$\frac n k$$$ time to check. Thus, the problem can be solved in $$$\mathcal{O}(n \ln n)$$$ time.

I didn't know we could get the hashes in $$$O(1)$$$, thank you.

electric_boogaloo

20 months ago, # ^ |

Is the 17. Substring Distribution solvable by hashing? Naively checking all substrings of lengths from 1 to n — 1 is going to result in $$$O(n ^ 2)$$$ time complexity, unless there is some other way?

My code

aditya3435

Problem 1. Word combinations is also solvable using hashing. But it requires a bit of optimization in the inner loop while calculating DP. I also used pragmas to speed up my code.

Since max length can go upto 5000 We don't necessarily need to run inner loop 5000 times. Instead run at max(dictonary[i].size()).

Here's my hashing solution. It took 0.96 seconds to pass.

#pragma GCC optimize("O3,unroll-loops")
#pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
#include <bits/stdc++.h>
#include <atcoder/modint>
using namespace std;

using mint = atcoder::modint1000000007;

const int prime = 31;

int main() {
    clock_t start = clock();

    #ifndef ONLINE_JUDGE
    freopen("input.txt", "r" , stdin);
    freopen("output.txt", "w" , stdout);
    freopen("error.txt", "w" , stderr);
    #endif
    ios_base::sync_with_stdio(false);
    cin.tie(nullptr);

    string S;
    cin >> S;
    int N = S.size();
    int K;
    cin >> K;

    vector<mint> pw(N+1);
    pw[0] = 1;
    for(int i=1; i<=N; i++) pw[i] = prime * pw[i-1];

    auto getHash = [&](string& s){
        mint hsh = 0;
        for(int i=0; i<s.size(); i++) {
            hsh = hsh + mint(s[i] - 'a' + 1) * pw[i];
        }
        return hsh.val();
    };
    vector<mint> hsh(N+1);
    for(int i=0; i<N; i++) {
        hsh[i+1] = hsh[i] + mint(S[i] - 'a' + 1) * pw[i];
    }
    vector<set<int>> st(5001);
    int mx = 0;
    for(int i=0; i<K; i++) {
        string x;
        cin >> x;
        int hsh = getHash(x);
        st[(int)x.size()].insert(hsh);
        mx = max((int)x.size(), mx);
    }
    vector<mint> dp(N+1);
    dp[0] = 1; 
    for(int i=0; i<N; i++) {
        for(int j=1; j <= mx; j++) {
            if(i-j+1 >= 0) {
                mint h = (hsh[i+1] - hsh[i-j+1])/pw[i-j+1];
                if(st[j].find(h.val()) != st[j].end()) {
                    dp[i+1] += dp[i-j+1];
                }
            }
        }
    }
    cout << dp[N].val() << '\n';

    clock_t end = clock();
    double elapsed = (double)(end - start)/CLOCKS_PER_SEC;
    cerr << setprecision(3) << fixed <<  "[Finished in: " << elapsed << "s]";
    return 0;
}

CSES acceped link

TheQuestionMark

(Sorry for necroposting, but this post was already in the recent list and the CSES problemset is fairly popular.)

It's possible to speed this up further. Let $$$M$$$ be the sum of the lengths of the strings in the dictionary. Then, there can be at most $$$O(\sqrt{M})$$$ distinct lengths in the dictionary. We can maintain a map that maps each distinct length $$$l$$$ to the set of hashes of the strings in the dictionary of length $$$l$$$.

To compute the transitions, we can check for each possible length $$$l$$$ if the hash of the suffix of length $$$l$$$ is in the corresponding set of hashes. This gives an algorithm with time complexity $$$O(n\sqrt{M}\log k + M + k\log M)$$$. (You can get rid of the stray logs using unordered versions of stl structures.)

My (not particularly pretty) implementation takes a maximum of 0.1s, without any pragmas.

happyCP

20 months ago, # |

Can you explain why using Z-algorithm can solve Finding Periods? I don't know how to guide to the conclusion that checking i + z[i] >= n is sufficient.

If the suffix starting at $$$i$$$ is equal to the prefix of the string up to $$$i$$$, then $$$s[:i-1]$$$ is a period. You can verify that this works on the sample case ($$$s="abcabca"$$$, $$$z=[0,0,0,4,0,0,1]$$$). I should have mentioned that the string itself is always a period and can be hardcoded.

_HUECTRUM_

9 months ago, # |

For Substring Distribution, we don't really need BFS. Each node represents unique strings for lengths in range [length(link(v)) + 1, length(v)]. Therefore, it's enough to do for(i) {++ans[st[st[i].link].len + 1], --ans[st[i].len + 1] } Example code: boop

Surge226

7 months ago, # |

Can somebody tell me how can we solve Finding Periods question with KMP algorithm?

Realverma

5 months ago, # |

How to Solve Finding Periods using String hashing or Rabin karp?

iamalizaidi

3 months ago, # |

For first one

we can solve it using Trie + DP. I have written simple java code

.ELPoP.

7 weeks ago, # |

Thanks

dutin's blog