[Tutorial] Finding N bits using O(N / log(N)) sums

№	Пользователь	Рейтинг
1	tourist	3985
2	jiangly	3814
3	jqdai0815	3682
4	Benq	3529
5	orzdevinwang	3526
6	ksun48	3517
7	Radewoosh	3410
8	hos.lyric	3399
9	ecnerwala	3392
9	Um_nik	3392

№	Пользователь	Вклад
1	cry	169
2	maomao90	162
2	Um_nik	162
4	atcoder_official	161
5	djm03178	158
6	-is-this-fft-	157
7	adamant	155
8	awoo	154
8	Dominater069	154
10	luogu_official	150

Introduction

Hi everyone, because we noticed there was no good tutorial in English explaining this idea, KrowSavcik and I have decided to write a tutorial blog on it!

This idea finds the values of each element from a binary array using $$$O(N / log(N))$$$ sum queries, more specifically it solves problems of the type:

Given a binary array $$$b$$$ (an array consisting only of zeroes and ones) of $$$n$$$ elements, you can ask queries of the following format:

What is the sum of the subsequence of the values on positions: $$$p_1, p_2, ..., p_k$$$. In other words you will give the subsequence $$$p_1, p_2, ..., p_k$$$ to the interaction and it will return you the value of ($$$b_{p_1} + b_{p_2} + ... + b_{p_k}$$$).

And after asking $$$O(N / log(N))$$$ queries we should be able to tell the value of each index.

The technique was mentioned here too but it was not as detailed and not easy to find.

Concept

The idea is to use a divide and conquer-like approach. Because it's not an intuitive concept, firstly I will have to add some simple notations, so it will be easier to understand it.

$$$x_i$$$ — refers to the position $$$i$$$
$$$A$$$ — any capital letter (except S) refers to a set of points $$$x_i$$$
$$$v_{A} = \sum b_{x_i}$$$ for $$$x_i \in A$$$
$$$k$$$ — the layer we are currently considering
$$$S_i$$$ — Set of queries

Also you should keep in mind that indices are enumerated from 0.

Also keep in mind that even though it's said the queries are used, we actually only query at the end, and we build up our set of queries first for multiple layers first.

As it is a divide and conquer-like approach we will work with layers. Let's say that for layer $$$k$$$ we need $$$2^k$$$ queries and that from it we can get the value of $$$f_k$$$ elements. Then $$$f_{k+1} = 2 \cdot f_k + 2^k - 1$$$.

First of all, let's set our base case, which is $$$k = 0$$$. So, for $$$k = 0, f_0 = 1$$$ and the query set will only be {$$$0$$$}, so we know a single element using a single query.

The block $$$f_{k+1}$$$ is formed from $$$2$$$ blocks of size $$$f_k$$$ and $$$2^k -1$$$ additional elements in this order. Let's say $$$k_1$$$ represents the first block of $$$f_k$$$ elements and the $$$k_2$$$ the second such block.

The first query is used to get the sum on $$$[f_k, 2 \cdot f_k)$$$ — the sum of the second block. Then we add two new queries for each non last query in $$$S_{k_1}$$$ and $$$S_{k_2}$$$. First one is $$$S_{k_1}[i] \cup S_{k_2}[i]$$$. Second one is $$$S_{k_1}[i] \cup ([f_k, 2 \cdot f_k) / S_{k_2}[i]) \cup x_{(2 \cdot f_k + i)}$$$.

The last query is for entire range $$$[0, f_{k+1})$$$.

I think it's easy to see why we now have $$$2^{k+1}$$$ queries. The harder part is to understand why we don't lose any value in this process, and how we can solve it recursively. In fact, having answered all the $$$S_{k+1}$$$ queries, we can calculate all the $$$v_{S_{k_1}[i]}$$$ and $$$v_{S_{k_2}[i]}$$$.

Now, when we reach a $$$k$$$ with a value of $$$f_k >= n$$$ we can stop there. Let's assume $$$n = f_k$$$ since it will be easier to work with it (when $$$n$$$ is smaller than $$$f_k$$$ we can just think of it as appending $$$f_k - n$$$ zeroes at the end since they won't influence the sum at all). There is a more elegant way to form $$$n$$$ from blocks of different sizes, but it doesn't enter the scope of this blog and it's not that hard to implement using offsets. Using the set of queries responsible for the $$$k$$$-th layer we can in fact now reconstruct the whole sequence, now recursively going from the $$$k$$$-th layer to the ($$$k - 1$$$)-th one (but consider each layer can have multiple blocks).

First of all, the only things we care about when we are in the $$$k$$$-th layer are:

The query set for the corresponding block of the corresponding layer.
The block we are currently at (can be dealt with using an offset value in the recursion).

So we will store them when going recursively.

Firstly, let's again set our base case: $$$k = 0$$$. We are now sure that only one element is responsible for this block from this layer, so we can just set the value of the $$$b_x$$$-th bit (where $$$x$$$ is some offset value we use to keep track of the block) to $$$v_{S_0}$$$ (since that's the sum for a single element which we've seen at the build-up).

Now, since the base case is already dealt with, here's how we will go to the ($$$k - 1$$$)-th layer:

We will need to reconstruct the previous query sets for the first block and the second block of size $$$f_{k - 1}$$$, and set the values for the other $$$2^{(k - 1)} - 1$$$ values respectively (since they aren't part of any of the blocks they shouldn't be part of the recursion either).

Let's denote the numbers of $$$1$$$-s in $$$[f_k, 2 \cdot f_k)$$$ with $$$c$$$. It's obvious that $$$c = v_{S[0]}$$$. We will now go through every pair of queries, starting from 1. That means we will be analyzing queries $$$S_1[i] \cup S_2[i]$$$ and $$$S_1[i] \cup ([f_{k-1}, 2 \cdot f_{k-1}) / S_2[i]) \cup x_{(2 \cdot f_{k-1} + i)}$$$.

$$$v_{S[2 \cdot i + 1]} = v_{S_1[i]} + v_{S_2[i]}$$$
$$$v_{S[2 \cdot i + 2]} = v_{S_1[i]} + c - v_{S_2[i]} + b_{2 \cdot f_{k-1} + i}$$$

In this case we have to calculate 3 values: $$$v_{S_1[i]} , v_{S_2[i]} , b_{2 \cdot f_{k-1} + i}$$$.

$$$v_{S_1[i]} = \lfloor\frac{v_{S[2 \cdot i + 1]} + v_{S[2 \cdot i + 2]} - c}{2}\rfloor$$$
$$$v_{S_2[i]} = \lceil\frac{v_{S[2 \cdot i + 1]} - v_{S[2 \cdot i + 2]} + c}{2}\rceil$$$
$$$b_{2 \cdot f_{k-1} + i} = (v_{S[2 \cdot i + 1]} + v_{S[2 \cdot i + 2]} - c) \land 1$$$

The only remaining queries to answer are $$$v_{S_1[2^{k-1}]}$$$ and $$$v_{S_2[2^{k-1}]}$$$ as we didn't add them in $$$S$$$.

$$$v_{S_2[2^{k-1}]} = c = v_{S[0]}$$$
$$$v_{S_1[2^{k-1}]} = v{S[2^k]} - c - \sum_{i = 0}^{2^{k-1}-1} b_{2 \cdot f_{k-1} + i} $$$

So, with all this calculated we are ready to go down to the previous layer's blocks.

Visual representation

Here's a visual representation of the queries we use for going up to the next layer:

The green squares represent the queries responsible for the first block of the previous layer.
The orange squares represent the queries responsible for the second block of the previous layer.
The blue squares represent the queries that query the whole second block excluding the elements from the query of the second block.
The red squares represent the last $$$2^k - 1$$$ bits.

Count of queries (a bit optimized)

Here's a visual representation of how the count of queries grows as $$$n$$$ increases, we notice that the count of queries is around $$$2.67 \cdot n / log2(n)$$$.

Some Code

Keep in mind this code is pretty inefficient and is just used to explain the solution better (it's still functional though)

Firstly, let's set some helper functions to work easier with our query sets:

Helper Functions

vector<int> range(int l, int r) {  
    vector<int> v;
    for(int i = l; i <= r; ++i) v.push_back(i);
    return v;
}
vector<int> lift(vector<int> v, int extra) {
    vector<int> ans;
    for(auto x: v) ans.push_back(x + extra);
    return ans;
}
vector<int> add(vector<int> A, vector<int> B) {
    vector<int> C = A;
    for(auto x: B) C.push_back(x);
    return C;
}
vector<int> erz(vector<int> A, vector<int> B) {
    vector<int> v;
    for(auto x: A) {
        auto it = lower_bound(B.begin(), B.end(), x);
        if(it != B.end() && *it == x) continue;
        v.push_back(x);
    }
    return v;
}
int query(vector<int>& v, int limit) {
    if(v.empty() || v[0] > limit) return 0;
    for(auto x: v) if(x <= limit) cout << x << " ";
    cout << endl;
    int x; cin >> x; return x;
}

The range(l, r) function creates a vector with the whole range of elements from l to r.
The lift function adds an offset to the previous vector.
The add function concatenates 2 vectors, so add(a, b) = a + b.
The erz function makes the reunion of 2 query sets (makes sure no element appears twice).
The query function queries a subsequence, we have to make sure every element in the subsequence is <= n since when constructing the query sets for an $$$n$$$ we could use indices on larger positions (because $$$f_k \geq n$$$).

Now, here's how we do the build-up part:

Build-up

int kk = -1 //kk will be the first k value such that f[kk] >= n
s[0] = {{0}}, f[0] = 1; //assigning the base case
for(int k = 0; k + 1 < K; ++k) {
   //Going from k to k + 1
   f[k + 1] = 2 * f[k] + (1 << k) - 1; //the size of the range it solves now
   vector<int> c = range(f[k], 2 * f[k] - 1);
   s[k + 1].push_back(c);
   vector<int> q;
   for(int i = 0; i < (int)s[k].size() - 1; ++i) {
       //the first query we do for each previous query
       q = add(s[k][i], lift(s[k][i], f[k]));
       s[k + 1].push_back(q); 
       //the second query we do for each previous query.
       q = add(add(s[k][i], erz(c, lift(s[k][i], f[k]))), {2 * f[k] + i}); 
       s[k + 1].pb(q);
   }
   //the final query
   s[k + 1].push_back(range(0, f[k + 1] - 1));
   if(f[k + 1] >= n) {
      kk = k + 1;
      break;
   }
}
//Now we ask the actual values of the queries.
vector<int> qq;
    for(int i = 0; i < (int)s[kk].size(); ++i)
        qq.push_back(query(s[kk][i], n - 1));

And the final part — recursively going from the $$$k$$$-th layer to the ($$$k - 1$$$)-th till each element is visited.

The build-down

void rec(int k, int x, vector<int>& q, int limit) {
    if(k == 0) {
        //the base case
        if(x <= limit) {
           //make sure the element fits between a limit so we don't assign useless values.
            ans[x] = q[0];
        } 
        return;
    }
    vector<int> ql, qr; //the query sets for the previous layer for the left and right blocks respectively
    int a, b, c, q1, q2, sm = 0;
    c = q[0];
    assert(sz(q) % 2 == 0); //we can be sure the query set is even.
    for(int i = 1; i + 1 < sz(q); i += 2) {
        q1 = q[i], q2 = q[i + 1];
        a = (q1 + q2 - c) / 2;
        b = q1 - a;
        if(2 * f[k - 1] + x + (i / 2) <= limit) {
            //assigning the value of the bit in the third block (in the last 2^k - 1 elements)
            ans[2 * f[k - 1] + x + (i / 2)] = (q1 + q2 - c) & 1;
        }
        sm += (q1 + q2 - c) & 1;
        ql.push_back(a);
        qr.push_back(b);
    }
    a = q[sz(q) - 1] - c - sm;
    ql.pb(a);
    qr.pb(c);
    rec(k - 1, x, ql, limit); //recursively going to the first block of the previous layer
    rec(k - 1, x + f[k - 1], qr, limit); //recursively going to the second block of the previous layer
}

Rev.	Кто	Когда	Δ	Комментарий
en18	SlavicG	2022-07-29 12:41:24	8	Tiny change: '+ 2]} - c)$ & $1$\n\nThe ' -> '+ 2]} - c) \land 1$\n\nThe '
en17	SlavicG	2022-07-29 12:40:12	12	Tiny change: 'that $c = S[0]$. We ' -> 'that $c = v_S[0]$. We '
en16	KrowSavcik	2022-07-24 13:28:54	62	Minor correction
en15	SlavicG	2022-07-24 12:37:17	0	(published)
en14	SlavicG	2022-07-24 12:35:10	46	Tiny change: 'in A$\n* $(A)$ — a set in brackets is a query\n* $k$ &m' -> 'in A$\n* $k$ &m'
en13	SlavicG	2022-07-24 12:34:34	389
en12	KrowSavcik	2022-07-24 12:29:11	253
en11	SlavicG	2022-07-24 11:55:40	21	Tiny change: 's detailed.\n\n### ' -> 's detailed and not easy to find.\n\n### '
en10	SlavicG	2022-07-24 11:54:53	4711	Tiny change: 'nd $1.85 \ cdot n / l' -> 'nd $1.85 \cdot n / l'
en9	KrowSavcik	2022-07-24 11:07:59	258	Added gifs
en8	KrowSavcik	2022-07-24 10:53:28	379	Tiny change: 'if)\n![ ](https://codeforces.net/2fca27/op2.gif)\n![ ](ht' -> 'if)\n![ ]()\n![ ](ht'
en7	KrowSavcik	2022-07-24 08:57:55	1261	Tiny change: ' i)}$.\n\n' -> ' i)}$.\n\n* $v_{S[2 \cdot i + 1]} = v_{S_1[i]} + v_{S_2[i]}$\n\n'
en6	SlavicG	2022-07-23 22:02:47	19
en5	SlavicG	2022-07-23 21:20:21	1540	Tiny change: 'cdot f_k) \cap S_{k_2}[i' -> 'cdot f_k) / S_{k_2}[i'
en4	SlavicG	2022-07-23 20:53:50	766	Tiny change: 'ated from $0$\n\nAs it' -> 'ated from 0\n\nAs it'
en3	KrowSavcik	2022-07-23 20:17:23	783	Tiny change: 's $S_{k_1} \cup S_{k' -> 's $S_{k_1}[i] \cup S_{k'
en2	KrowSavcik	2022-07-23 19:51:40	732	Tiny change: '.\n\n```\nq_i\n```\n\n' -> '.\n\n```\n$q_i$\n```\n\n'
en1	SlavicG	2022-07-23 19:21:02	1001	Initial revision (saved to drafts)

Introduction

Concept

Visual representation

Count of queries (a bit optimized)

Some Code

История