Offline Range MEX queries in O(log n)

#	User	Rating
1	tourist	4009
2	jiangly	3823
3	Benq	3738
4	Radewoosh	3633
5	jqdai0815	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	ksun48	3390
10	gamegame	3386

#	User	Contrib.
1	cry	167
2	Um_nik	163
2	maomao90	163
4	atcoder_official	161
5	adamant	159
6	-is-this-fft-	158
7	awoo	157
8	TheScrasse	154
9	nor	153
9	Dominater069	153

Consider the following problem :

Your are given an array $$$a$$$ of size $$$n$$$ and $$$q$$$ queries of form $$$(l_1, r_1), (l_2, r_2), ..., (l_q, r_q)$$$. For each query you have to find the MEX of the array $$$a_l, a_{l+1}, ..., a_r$$$. The queries are offline.

$$$0 \le a_1, a_2, ..., a_n \le n$$$

$$$1 \le l_i \le r_i \le n$$$, for all $$$1 \le i \le q$$$

The approach to the problem is discussed in the comment here.

First let's consider a special case of the queries where $$$r = n$$$. In this case we want to find the smallest integer that does not occur in the subarray $$$a_l, ..., a_r$$$. That is, the smallest integer that does not occur at index $$$l$$$ or after it.

For all $$$0 \le x \le n$$$, let $$$b_x$$$ be the greatest index $$$i$$$ such that $$$a_i = x$$$, and $$$0$$$ if no such $$$a_i$$$ exists.

Now the MEX say $$$m$$$ of subarray $$$a_l, ..., a_n$$$ either does not occur in $$$a$$$, in which case we have $$$b_m = 0$$$, or the last occurence of $$$m$$$ in $$$a$$$ is before $$$l$$$, thus $$$b_m < l$$$. We can say that $$$m$$$ is the minimum of all such numbers, thus $$$m$$$ = $$$min$$$ { $$$i$$$ | $$$b_i < l$$$ }.

Now since the queries are offline, we can sort the queries by $$$r$$$ and process them such that we can ignore the subarray $$$a_{r + 1}, ..., a_n$$$. We iterate from from $$$r = 1, 2, ..., n$$$, and for each $$$r$$$ we process all queries $$$(l_i, r_i)$$$ where $$$r_i = r$$$.

Now how do we efficiently calculate the answer for queries $$$(l, n)$$$ ?

We maintain a minimum segment tree on the array $$$b$$$ and update all values for $$$i = 1, 2, ..., r$$$. Now since there are $$$n$$$ updates but $$$n + 1$$$ leaf nodes in the segment tree, at least one of them is zero. So value of root node is $$$0$$$. Now we try to find the minimum index leaf node with value < $$$l$$$. We start at root node and travel down to this node. At each node we see if the value of it's left child is < $$$l$$$. If it is we go to left child. Otherwise we go to the right child. Either of left or right child will always have value < $$$l$$$. Finally we reach a leaf node, the index of this node(after adjusting for offset) is the mex of array $$$a_l, a_{l + 1}, ..., a_r$$$.

Consider the array $$$a = [6, 1, 0]$$$.

Here $$$b = [3, 2, 0, 0, 0, 0, 1, 0]$$$.

Consider evaluation for $$$l = 2, r = 3$$$. The subarray is $$$[a_2, a_3] = [1, 0]$$$.

We start at root node.

We see that the left child has value < $$$l$$$, so we go to the left child.

Now for left node $$$2 \ge l$$$. So we go to right node.

Now for the left child value < $$$l$$$. So we go to left child.

We have reached a leaf node. The corresponding index is 2 and thus the MEX of the subarray $$$[a_2, a_3]$$$ is 2.

Calculating MEX of each query takes $$$O(log n)$$$ time, total $$$O(q log n)$$$. Updating the tree for $$$b$$$ takes $$$O(n log n)$$$ time in total. Sorting $$$q$$$ queries takes $$$O(q log q)$$$ time.

Overall time complexity — $$$O(n log n + q(log n + log q))$$$

#include<bits/stdc++.h>
using namespace std;
#define ll long long
struct SegmentTree{
    int n;
    vector<int> tree;
    SegmentTree(int sz){
        n = 1;
        while(n < sz){
            n <<= 1;
        }
        tree.assign(2 * n, 0);
    }
    void set(int ind, int val){
        ind += n;
        tree[ind] = val;
        ind >>= 1;
        while(ind > 0){
            tree[ind] = min(tree[2 * ind], tree[2 * ind + 1]);
            ind >>= 1;
        }
    }
    int get(int x){
        // return the first index i, such that s[i] < x
        int node = 1;
        while(node < n){
            int left = (node << 1);
            int right = (node << 1) + 1;
            if(tree[left] < x){
                node = left;
            }else{
                node = right;
            }
        }
        return (node - n);
    }
};
int main(){
    int n;
    cin >> n;
    vector<int> a(n + 1);
    for(int i = 1; i <= n; ++i){
        cin >> a[i];
    }
    int q;
    cin >> q;
    vector<vector<pair<int, int>>> queries(n + 1);
    for(int i = 0; i < q; ++i){
        int l, r;
        cin >> l >> r;
        queries[r].push_back({l, i});
    }

    vector<int> res(q);

    SegmentTree s(n + 1);
    for(int i = 1; i <= n; ++i){
        // set the last occurence of a[i] to i
        s.set(a[i], i);

        for(auto [l, ind] : queries[i]){
            // find the smallest x, such that last occurence of x < l
            res[ind] = s.get(l);
        }
    }
    for(int elem : res){
        cout << elem << '\n';
    }
    cout << '\n';

    return 0;
}

Comments (8)

Write comment?

one_autum_leaf

17 months ago, # |

Auto comment: topic has been updated by one_autum_leaf (previous revision, new revision, compare).

→ Reply

igzou

5 months ago, # |

Practice question

https://leetcode.com/problems/smallest-missing-genetic-value-in-each-subtree/description/

hydra_cody

4 months ago, # |

← Rev. 2 →

How to solve this

I am thinking of making dfs tree array and work on this array. But how do we solve range MEX queries with update on array?

Hamza_Hassanain

4 months ago, # ^ |

Did you think of square root decomposition?

How will you combine results from two square roots? We can't work with frequency for each square root.

lemelisk

But how do we solve range MEX queries with update on array?

Can be done easily in $$$\mathcal{O}(n^{5/3}+q\sqrt{n})$$$ via 3D Mo and a bit harder in $$$\mathcal{O}(n \log{n}+q\log^2{n})$$$. For the second approach, if the value of $$$x$$$ is at positions $$$p_1, p_2 \dots p_k$$$, consider segments $$$[1; p_1 - 1]$$$, $$$[p_1+1, p_2-1]$$$ ... $$$[p_{k-1}+1;p_k - 1]$$$, $$$[p_k+1, n]$$$ with the value of $$$x$$$ associated with these segments. There will be a total of $$$\mathcal{O}(n)$$$ such segments, and each update changes only $$$\mathcal{O}(1)$$$ of them. And for a range MEX query, we need to find the segment with the smallest $$$x$$$ for all the segments in which the query segment is nested. This can be done for $$$\mathcal{O}(\log^2{n})$$$ in various ways using data structures.

abd-alrzaq

practice https://codeforces.net/gym/105242/problem/A

saturina

Why is this blog flop?

one_autum_leaf's blog