Introduction
Hi everyone, because we noticed there was no good tutorial in English explaining this idea, KrowSavcik and I have decided to write a tutorial blog on it!
This idea finds the values of each element from a binary array using $$$O(N / log(N))$$$ sum queries, more specifically it solves problems of the type:
Given a binary array $$$b$$$ (an array consisting only of zeroes and ones) of $$$n$$$ elements, you can ask queries of the following format:
What is the sum of the subsequence of the values on positions: $$$p_1, p_2, ..., p_k$$$. In other words you will give the subsequence $$$p_1, p_2, ..., p_k$$$ to the interaction and it will return you the value of ($$$b_{p_1} + b_{p_2} + ... + b_{p_k}$$$).
And after asking $$$O(N / log(N))$$$ queries we should be able to tell the value of each index.
The technique was mentioned here too but it was not as detailed.
Concept
The idea is to use a divide and conquer like approach. Because it's not an intuitive concept, firstly I will have to add some simple notations, so it will be easier to understand it.
- $$$x_i$$$ — refers to the position $$$i$$$
- $$$A$$$ — any capital letter (except S) refers to a set of points $$$x_i$$$
- $$$v_{A} = \sum b_{i}$$$ for $$$x_i \in A$$$
- $$$(A)$$$ — a set in brackets is a query
- $$$S_i$$$ — Set of queries
As it is a divide and conquer like approach we will work with layers. Let's say that for layer $$$k$$$ we need $$$2^k$$$ queries and that from it we can get the value of $$$f_k$$$ bits. Than $$$f_{k+1} = 2 f_k + 2^k - 1$$$.
The block $$$f_{k+1}$$$ is formed from $$$2$$$ blocks of size $$$f_k$$$ and $$$2^k -1$$$ elements. The first query is used to get the sum on $$$[f_k;2f_k)$$$ second block. Then for each non last query in $$$S_{k_1}$$$ and $$$S_{k_2}$$$ we add two new queries. First is $$$S_{k_1}[i] \cup S_{k_2}[i]$$$. Second one is $$$S_{k_1}[i] \cup [f_k;2f_k) \cap S_{k_2}[i] \cup x_{2f_k+i}$$$.
The last query is for entire range $$$[0, f_{k+1})$$$.
I think it's easy to see why we now have $$$2^{k+1}$$$ queries. The harder part is to understand why we don't lose any value in this process, and how we can solve it recursively. In fact, having answered all the $$$S_{k+1}$$$ queries, we can calculate all the $$$v_{S_{k_1}[i]}$$$ and $$$v_{S_{k_2}[i]}$$$.
Let's denote the numbers of $$$1$$$s in $$$[f_k;2f_k)$$$ with $$$c$$$. It's obvious that $$$c = S_{k+1}[0]$$$.