Hard range queries - Codeforces

How to solve this?

You have queries of the following three types:

$$$add(i, v)$$$: add an element with value $$$v$$$ at index $$$i$$$.
$$$query(l, r)$$$: count the number of distinct values of elements on the range from $$$l$$$ to $$$r$$$ (inclusive).
$$$remove(i, v)$$$: remove one of the elements with value $$$v$$$ from the index $$$i$$$.

Notes:

Let $$$Q$$$ be the number of queries of the first and third types and $$$Q'$$$ the number of queries of the second type: $$$1 \le Q \le 10^5$$$, $$$1 \le Q' \le Q \log Q$$$.
$$$1 \le v \le 10^5$$$.
$$$0 \le i, l, r \le 10^8$$$ ($$$l \le r$$$).
With $$$add$$$ queries, an index may hold multiple elements at the same time, and some of them may share the same value.
For $$$remove$$$ queries, it is guaranteed that at least one element at index $$$i$$$ has value $$$v$$$.
$$$query$$$ queries must be answered online, but the other two types can be preprocessed if needed.
Note the unusual constraints on $$$l$$$, $$$r$$$ and $$$i$$$.

Is there any way to do this in $$$O(\log n)$$$ time for queries of the second type and $$$O(\log^2 n)$$$ or faster for the first and third types?
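To pin down what the three operations mean, here is a brute-force reference I would test any faster solution against. It is only a sketch: the struct name `NaiveRangeDistinct` is made up for illustration, and each query walks every occupied index in the range, so it is far too slow for the stated limits.

```cpp
#include <cassert>
#include <map>
#include <set>

// Brute-force reference (hypothetical helper, not an intended solution).
// at[i][v] = number of copies of value v currently stored at index i.
struct NaiveRangeDistinct {
    std::map<int, std::map<int, int>> at;

    void add(int i, int v) { ++at[i][v]; }

    // Assumes at least one copy of v is present at index i,
    // as the problem statement guarantees.
    void remove(int i, int v) {
        auto pos = at.find(i);
        auto val = pos->second.find(v);
        if (--val->second == 0) pos->second.erase(val);
        if (pos->second.empty()) at.erase(pos);
    }

    // Number of distinct values stored at indices in [l, r].
    int query(int l, int r) const {
        std::set<int> seen;
        for (auto it = at.lower_bound(l); it != at.end() && it->first <= r; ++it)
            for (const auto& kv : it->second) seen.insert(kv.first);
        return static_cast<int>(seen.size());
    }
};
```

Because indices are kept in a `std::map`, the large coordinate range up to $$$10^8$$$ costs nothing here; only occupied indices are stored.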

Comments (9)

tdpencil

4 years ago

What is the limit on a[i]?

If a[i] is in the hundreds or so, you can iterate over every value from 1 to that limit and binary search whether it occurs within the range.
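A minimal sketch of that small-value idea, assuming one sorted container of positions per value (the name `SmallValueCounter` and its interface are mine, not from the problem). A query costs one binary search per possible value, so it only makes sense when the value limit is small.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper: pos[v] holds every index that currently stores value v.
struct SmallValueCounter {
    std::vector<std::multiset<int>> pos;

    explicit SmallValueCounter(int maxValue) : pos(maxValue + 1) {}

    void add(int i, int v) { pos[v].insert(i); }

    // Removes a single copy of v at index i, if one exists.
    void remove(int i, int v) {
        auto it = pos[v].find(i);
        if (it != pos[v].end()) pos[v].erase(it);
    }

    // For every value, binary search for its first index >= l
    // and check that this index does not exceed r.
    int query(int l, int r) const {
        int distinct = 0;
        for (const auto& s : pos) {
            auto it = s.lower_bound(l);
            if (it != s.end() && *it <= r) ++distinct;
        }
        return distinct;
    }
};
```

With values up to $$$10^5$$$, as in the notes of the post, every query would touch $$$10^5$$$ sets, which is why this only works for a small value limit.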

If that is too slow, my next thought would be a segment tree. A data structure called a "merge-sort tree" (similar to a segment tree) stores at each node the sorted elements of that node's subarray, which takes n log n memory. Queries can then be answered with a binary search at each visited node, or with fractional cascading as an alternative (see Fractional Cascading).

Remove queries are much harder, since you would have to remove that number from log n subarrays. It might be better to use a boolean array to mark whether that number is still in use.

These are just some ideas, nothing definitive. Do you know what the time limit for this problem is? Normally, the size of the array would not be so large.
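For reference, a static merge-sort tree might look like the sketch below. It answers "how many elements of a[l..r] are greater than k", which is the kind of query a distinct-count reduction usually needs. The struct and method names are assumptions for illustration, and the sketch supports no updates.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Static merge-sort tree (illustrative sketch): node[x] is the sorted
// list of all elements in the leaves below x. Memory is O(n log n).
struct MergeSortTree {
    int n;
    std::vector<std::vector<int>> node;

    explicit MergeSortTree(const std::vector<int>& a)
        : n(static_cast<int>(a.size())), node(2 * a.size()) {
        for (int i = 0; i < n; ++i) node[n + i] = {a[i]};
        for (int x = n - 1; x >= 1; --x)
            std::merge(node[2 * x].begin(), node[2 * x].end(),
                       node[2 * x + 1].begin(), node[2 * x + 1].end(),
                       std::back_inserter(node[x]));
    }

    // Number of elements in one sorted node that are strictly greater than k.
    static int greaterIn(const std::vector<int>& v, int k) {
        return static_cast<int>(v.end() - std::upper_bound(v.begin(), v.end(), k));
    }

    // Number of positions p in [l, r] (0-indexed, inclusive) with a[p] > k.
    // O(log n) nodes, one binary search each, so O(log^2 n) per query.
    int countGreater(int l, int r, int k) const {
        int res = 0;
        for (int lo = l + n, hi = r + n + 1; lo < hi; lo >>= 1, hi >>= 1) {
            if (lo & 1) res += greaterIn(node[lo++], k);
            if (hi & 1) res += greaterIn(node[--hi], k);
        }
        return res;
    }
};
```

Fractional cascading would replace the per-node binary searches with pointer following, but I have left that out to keep the sketch short.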

too_rusty

4 years ago

All the constraints are listed in the notes.

The time limit is about 2 seconds.

Nots0fast

Can you post a link to the problem?

It is not a problem in itself.

This is what I was trying to solve: Problem A.

mango_lassi

The best algorithm I can come up with is $$$\mathcal{O}(\log^2 n)$$$ per query (which should still pass).

First, compress indices so that we have exactly one add and exactly one remove operation at every index. Say the value that is added at position $$$i$$$ is $$$v_i$$$.

Call a value active if it has been added but not deleted yet. Maintain an array where position $$$i$$$ holds $$$-1$$$ if it is not active, and otherwise the next position $$$j > i$$$ that is active and has $$$v_i = v_j$$$ (or $$$\infty$$$ if no such position exists). You can maintain these values in $$$\mathcal{O}(\log n)$$$ per query.
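One way to maintain this array is to keep, for every value, an ordered set of its active positions. The sketch below assumes the indices are already compressed, so position $$$i$$$ holds the single value `value[i]`, and it uses $$$n$$$ as the "no later active position" sentinel. The names `NextPointers`, `activate` and `deactivate` are hypothetical.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <set>
#include <vector>

// Illustrative sketch: nxt[i] = -1 if position i is inactive, otherwise the
// next active position with the same value, or n if there is none (assumption).
struct NextPointers {
    int n;
    std::vector<int> value, nxt;
    std::map<int, std::set<int>> active;  // value -> its active positions

    explicit NextPointers(const std::vector<int>& v)
        : n(static_cast<int>(v.size())), value(v), nxt(v.size(), -1) {}

    // Called when the add operation for position i happens.
    void activate(int i) {
        std::set<int>& s = active[value[i]];
        auto it = s.insert(i).first;
        auto after = std::next(it);
        nxt[i] = (after == s.end()) ? n : *after;
        if (it != s.begin()) nxt[*std::prev(it)] = i;  // predecessor now points at i
    }

    // Called when the remove operation for position i happens.
    void deactivate(int i) {
        std::set<int>& s = active[value[i]];
        auto it = s.find(i);
        if (it == s.end()) return;  // position was not active
        auto after = std::next(it);
        int succ = (after == s.end()) ? n : *after;
        if (it != s.begin()) nxt[*std::prev(it)] = succ;  // predecessor skips over i
        s.erase(it);
        nxt[i] = -1;
    }
};
```

Each operation changes at most two entries of `nxt`, so whatever structure answers the counting queries only needs two point updates per add or remove.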

Now, you can count the number of distinct active values $$$v_i$$$ in range $$$[a, b]$$$ by counting the number of values in $$$[a, b]$$$ that are greater than $$$b$$$.
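The reason this works: within $$$[a, b]$$$, every active value has exactly one last occurrence, and that is the only occurrence whose next pointer leaves the range. A small brute-force sketch of the identity follows, assuming that "no later occurrence" is stored as $$$n$$$ (my convention) and using hypothetical helper names.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Builds the next-pointer array from scratch (illustrative helper).
// nxt[i] = -1 for inactive i, otherwise the next active position with the
// same value, or n when there is none (assumed sentinel).
std::vector<int> buildNext(const std::vector<int>& value,
                           const std::vector<bool>& active) {
    int n = static_cast<int>(value.size());
    std::vector<int> nxt(n, -1);
    std::map<int, int> nearest;  // value -> nearest active position to the right
    for (int i = n - 1; i >= 0; --i) {
        if (!active[i]) continue;
        auto it = nearest.find(value[i]);
        nxt[i] = (it == nearest.end()) ? n : it->second;
        nearest[value[i]] = i;
    }
    return nxt;
}

// Distinct active values in [a, b] = positions in [a, b] whose pointer exceeds b.
// Linear scan on purpose; a real solution replaces it with a data structure.
int countDistinctViaNext(const std::vector<int>& nxt, int a, int b) {
    int res = 0;
    for (int i = a; i <= b; ++i)
        if (nxt[i] > b) ++res;
    return res;
}
```

For example, with values 7 3 7 7 3 and position 2 inactive, the array is 3 4 -1 5 5, and the range $$$[0, 3]$$$ counts positions 1 and 3, i.e. the two distinct values 3 and 7.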

To do this, maintain a segment tree, and at every segment tree node an indexed set of the current values in that node's range. Counting the values in $$$[a, b]$$$ greater than $$$b$$$ then takes $$$\log n$$$ queries of the form "how many values in this set are greater than $$$b$$$?", each answered in $$$\log n$$$ time, for $$$\mathcal{O}(\log^2 n)$$$ time per query.
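A possible shape for that structure, as a sketch only. It assumes GCC's policy-based data structures (`__gnu_pbds::tree`) as the indexed set, stores pairs (next pointer, position) so that equal pointers stay distinct, and stores nothing for inactive positions. `NextCountTree`, `assign` and `countGreater` are names I made up.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

// GCC-specific order-statistics tree used as the "indexed set".
using IndexedSet = __gnu_pbds::tree<
    std::pair<int, int>, __gnu_pbds::null_type, std::less<std::pair<int, int>>,
    __gnu_pbds::rb_tree_tag, __gnu_pbds::tree_order_statistics_node_update>;

struct NextCountTree {
    int n;
    std::vector<int> cur;          // cur[i] = current next pointer, -1 if inactive
    std::vector<IndexedSet> node;  // node[x] = pairs (pointer, position) below x

    explicit NextCountTree(int size) : n(size), cur(size, -1), node(2 * size) {}

    // Point update: set the next pointer of position i (use -1 to deactivate).
    // Touches O(log n) nodes at O(log n) each.
    void assign(int i, int newNext) {
        for (int x = i + n; x >= 1; x >>= 1) {
            if (cur[i] != -1) node[x].erase(std::make_pair(cur[i], i));
            if (newNext != -1) node[x].insert(std::make_pair(newNext, i));
        }
        cur[i] = newNext;
    }

    // Number of stored pairs in s whose pointer is strictly greater than b.
    static int greaterThan(const IndexedSet& s, int b) {
        return static_cast<int>(s.size() - s.order_of_key(std::make_pair(b + 1, -1)));
    }

    // Number of positions in [a, b] whose next pointer exceeds b.
    int countGreater(int a, int b) const {
        int res = 0;
        for (int lo = a + n, hi = b + n + 1; lo < hi; lo >>= 1, hi >>= 1) {
            if (lo & 1) res += greaterThan(node[lo++], b);
            if (hi & 1) res += greaterThan(node[--hi], b);
        }
        return res;
    }
};
```

Each add or remove changes at most two next pointers, so it becomes two `assign` calls, and a distinct-count query on compressed $$$[a, b]$$$ is one `countGreater(a, b)` call.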

Well, I thought of something similar, but the problem with this (and the reason I asked for an $$$O(\log n)$$$ solution) is that I run the queries of the second type inside a binary search, which makes it $$$O(\log^3 n)$$$ (I reworked the constraints so that this is clearer now).

I also couldn't find a way to compress the indices so that exactly one add and one remove operation happens at each index, even if the $$$query$$$ queries were offline, because we would have to map them to the correct new indices (i.e. extend or shrink the query ranges), which was hard for me. Can you describe a way to do this compression?

And can we replace the indexed set in each segment tree node with another data structure that answers the same queries (how many values are greater than $$$b$$$) faster and still supports the same updates (even if the updates become slower, like $$$O(\log^2 n)$$$)?

Here's code for compressing the indices. It's basically just a bunch of binary searching. The second value in comp records the state of that position: 0 if nothing has been added there yet, 1 after the add, and 2 after the delete.

code

#include <bits/stdc++.h>
using namespace std;
using ll = long long;

// returns number of elements strictly smaller than v in vec
template<class T>
int bins(const vector<T>& vec, T v) {
	int low = 0;
	int high = vec.size();
	while(low != high) {
		int mid = (low + high) >> 1;
		if (vec[mid] < v) low = mid + 1;
		else high = mid;
	}
	return low;
}

int main() {
	ios_base::sync_with_stdio(false);
	cin.tie(0);

	int n, q13, q2;
	cin >> n >> q13 >> q2;
	int q = q13 + q2;

	vector<pair<int, pair<int, int>>> qs;
	vector<pair<pair<int, int>, int>> comp;
	for (int j = 0; j < q; ++j) {
		int t;
		cin >> t;
		
		if (t == 1 || t == 3) {
			int i, v;
			cin >> i >> v;
			--i;

			qs.emplace_back(t, make_pair(i, v));
			if (t == 1) comp.emplace_back(qs.back().second, 0);
		} else {
			int a, b;
			cin >> a >> b;
			--a; --b;

			qs.emplace_back(2, make_pair(a, b));
		}
	}

	// Compress indices: after this, every position has exactly one add, and at most one delete.
	sort(comp.begin(), comp.end());
	for (int j = 0; j < q; ++j) { // iterate over all q queries stored in qs
		int t = qs[j].first;
		if (t == 1 || t == 3) {
			int c;
			if (t == 1) c = bins(comp, make_pair(qs[j].second, 1)) - 1;
			else c = bins(comp, make_pair(qs[j].second, 2)) - 1;
			++comp[c].second;
			qs[j].second.first = c;
		} else {
			int a, b;
			tie(a, b) = qs[j].second;

			a = bins(comp, make_pair(make_pair(a, 0), 0));
			b = bins(comp, make_pair(make_pair(b + 1, -1), -1)) - 1;
			qs[j].second = {a, b};
		}
	}
	n = comp.size();

	// Output compressed input
	cout << n << ' ' << q13 << ' ' << q2 << '\n';
	for (int j = 0; j < q; ++j) {
		cout << qs[j].first << ' ' << qs[j].second.first << ' ' << qs[j].second.second << '\n';
	}
}

I didn't think about it this way at all!! With this I can also do the compression for the queries online, right?

I can store only the $$$add$$$ and $$$remove$$$ queries in comp and binary search for where $$$l$$$ and $$$r$$$ should land at each query, right?

Yes, you only need to know where the add operations will be.
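A sketch of that online mapping, assuming the original indices of all add operations have been collected and sorted beforehand (the function name `compressRange` and the vector `addIndex` are hypothetical). Each query endpoint then needs a single binary search.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// addIndex: original indices of all add operations, sorted ascending
// (duplicates allowed; each occurrence becomes its own compressed position).
// Returns the compressed inclusive range {a, b} covering original indices
// in [l, r]; a > b means no add operation falls inside the range.
std::pair<int, int> compressRange(const std::vector<int>& addIndex, int l, int r) {
    int a = static_cast<int>(
        std::lower_bound(addIndex.begin(), addIndex.end(), l) - addIndex.begin());
    int b = static_cast<int>(
        std::upper_bound(addIndex.begin(), addIndex.end(), r) - addIndex.begin()) - 1;
    return {a, b};
}
```

For instance, with adds at original indices 2, 2, 5, 9, 9, 9, the query $$$[2, 5]$$$ maps to compressed $$$[0, 2]$$$ and $$$[3, 4]$$$ maps to an empty range.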

too_rusty's blog