Exploring a variation of sqrt decompostion

#	User	Rating
1	jiangly	3846
2	tourist	3799
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3590
6	Ormlis	3533
7	Benq	3468
8	Radewoosh	3463
9	ecnerwala	3451
9	Um_nik	3451

#	User	Contrib.
1	cry	165
2	-is-this-fft-	161
3	Qingyu	160
4	atcoder_official	157
5	Dominater069	156
6	adamant	154
7	Um_nik	151
7	djm03178	151
9	luogu_official	149
10	awoo	147

Hello Codeforces,

In this blog, I want to discuss a technique I devised while solving the problem ABC392F Insert. Although I couldn't solve it during the contest due to time constraints, I later implemented a solution with the help of ChatGPT and it received AC. (I do feel the test cases are weak and a solution like this might not pass on tougher tests.) The editorial mentioned PBDS as a faster alternative.

Anyhow, I'm curious if you have seen this idea before, please share the relevant links and acknowledge the original source. Your feedback and insights are very welcome!

Problem Statement

Given n integers, insert them at their respective positions in the array built so far.

For i = 1, 2, …, n, perform the following operation in order: - Insert the number i into the array A so that it becomes the P[i]-th element from the beginning.
In other words, replace A with the concatenation of: 1. The first P[i] — 1 elements of A 2. The number i 3. The remaining elements of A starting from the P[i]-th element

Constraints: - 1 ≤ n ≤ 5×10^5 - 1 ≤ P[i] ≤ i - All input values are integers.

Approach: √n Decomposition

The main idea is to maintain the evolving array A as a list of blocks (vectors) where each block has roughly O(√n) elements. This allows for efficient arbitrary insertions:

Insertion:
For each number, determine the block and the exact position within that block by subtracting block sizes until the position is found. This takes O(√n) time per insertion.
Balancing:
After each insertion, check if a block's size reaches 2 × √n. If so, split the block into two to maintain the block sizes around O(√n), ensuring subsequent insertions remain efficient.
Overall Complexity:
With n insertions each taking O(√n), the total time complexity is O(n√n).

Code

Submission link, AC

void solve() {
    int n;
    cin >> n;

    vector<vector<int>> arr;
    const int blk = max(1, (int)sqrt(n));
    for (int i = 0; i < n; i++) {
        int x;
        cin >> x;
        x--;  // Convert to 0-indexed

        int b = 0;
        if (i == x) {
            if (arr.empty()) {
                arr.push_back({ i + 1 });
            } else {
                b = arr.size() - 1;
                arr[b].push_back(x + 1);
            }
        } else {
            while (x > arr[b].size()) {
                x -= arr[b].size();
                b++;
            }
            arr[b].insert(arr[b].begin() + x, i + 1);
        }

        if (arr[b].size() == 2 * blk) {
            vector<int> t(arr[b].begin() + blk, arr[b].end());
            arr[b].resize(blk);
            arr.insert(arr.begin() + b + 1, t);
        }
    }
    for (auto& block : arr)
        for (int num : block)
            cout << num << " ";
    cout << "\n";
}

Comments (8)

Write comment?

yoshi_likes_e5

4 weeks ago, # |

Some other ways to solve the problem:

use deques of size $$$O(\sqrt{n})$$$.
use a binary search tree, e.g. splay or treap.

There's a general idea of rebuilding the data structure every $$$O(\sqrt{n})$$$ queries, I think that this idea is simmilar in a way that it makes the block size being bounded to $$$O(\sqrt{n})$$$, but more efficient because it only splits a single block.

→ Reply

risingStark

4 weeks ago, # ^ |

The dequeue solution I think is a little harder to implement. Now, to insert I am using the in-built insert method but for dequeue I will have to pop and push those elements myself.

arvindf232

this is well-known in china. However, this is not as well known in CF because the amount of sqrt problems is far less on CF.

it is hard to link to a blog because you can see this trick just about everywhere in any DS problem intending sqrt solutions that involve inserting elements.

mathzz

could you recommend some chinese learning resources?

19.30

During the contest, I tried using the GNU PBDS rope data structure, but it resulted in TLE. After the contest, I solved it using a Treap as well as a Splay Tree. But this implementation is far better than those two data structures in both time and memory complexity.

The reason I got TLE during the contest might be that the rope data structure is not a self-balancing tree. Balancing the tree costs O(N) time. Then I wondered, how does it claim O(log N) insertion, erasure, and random access?

stdfloat

I think 455D - Serega and Fun solvable with that.

bashkort

I learned this technique about 4 years ago, but I never really implemented it since it wasn't useful in any problem I encountered :)

risingStark's blog

Problem Statement

Approach: √n Decomposition

Code