FHVirus's blog

By FHVirus, history, 3 years ago, In English

Hello CodeForces! Since my contribution is negative currently (-4) due to a stupid comment I made, I would like to write a little tutorial to earn some contribution stonks.

Anyways, here's the outline of this tutorial:

  1. We're going to solve a simple problem and a not-so-simple problem, then
  2. use the formula derived from the problems to solve "Historic Sum" problem and maintain it on data structures like segment trees.
  3. Use the same technique to solve the "Range Add Range Sum" problem with BIT. Yes, it's possible!

Disclaimer: I did not come up with the method myself. Tutorials on historic sum problem can be found here from jiry_2's blog, and range add range query BIT can also be found on the Internet. I forgot the link to the original post, but the same technique can be found in several different Chinese blogs.

This blog serves as an implementation of Feynman's learning technique, and I hope that it makes some concepts and reasoning behind the cool method much clearer for myself and anyone reading this blog.

It will be a bit long (I want to make everything clear for even a beginner), so maybe you'll want to do some binary jumping and skip ahead for certain parts of the explanation while reading.

Note:

  1. I'm using 1-based index on arrays and closed bounds for intervals.
  2. Section 2 and 3 are independent.
  3. The markdown and math blocks might have some problem, so here's a hackmd version.

1. The essence.

1.1 The simple problem

The problem looks like this:

Given an interger array $$$A$$$ of length $$$n$$$.
We define array $$$B$$$ where $$$B_i = \sum_{j = 1}^{i}{A_j}$$$.
Then we define another array $$$C$$$ where $$$C_i = \sum_{j = 1}^{i}{B_j}$$$.
Then there's $$$q$$$ queries, each of two types:
1. Print $$$C_n$$$.
2. Add $$$x$$$ to $$$A_k, k \in [1, n]$$$.

If there's only queries of type 1, two times of prefix sum will do the job. But what about modifications?

The key observation here is to play with the summations.

$$$ C_i = \sum_{j = 1}^{i}{B_j} = \sum_{j = 1}^{i}{\sum_{k = 1}^{j}{A_k}} \\ = \begin{matrix} (A_1) & + \\ (A_1 & + & A_2) & + \\ (A_1 & + & A_2 & + & A_3) & + \\ \vdots & & \vdots & & \vdots \\ (A_1 & + & A_2 & + & A_3 & + & \cdots & + & A_i) \end{matrix} $$$

When we switch our view from horizontal to vertical, we can se that there's no need to do two times of summation (prefix sum). With a bit of math, we can count the number of appearance of each $$$A_i$$$ in the equation (which is also the height of column $$$i$$$) and thus get a simpler formula:

$$$ C_i = \sum_{j = 1}^{i}{B_j} = \sum_{j = 1}^{i}{\sum_{k = 1}^{j}{A_k}} = \sum_{j = 1}^{i}{(i - j + 1) \cdot A_j} $$$

And for some reason, let's call the $$$(i - j + 1)$$$ term the "contribution multiplier" of $$$A_j$$$ to $$$C_i$$$.

There's two important property of this formula:

  1. It can be done in one pass for loop.
  2. Every element $$$A_i$$$ is independent. That is, for queries of type 2, a modification will be done on single point.

And the solution to the problem looks like this:

  1. Calculate $$$C_n$$$, and store the value sum in a single int or long long.
  2. sum += x * (n - k + 1);

Complexity: $$$O(n + q)$$$ time, $$$O(1)$$$ space.

1.2 The not-simple problem

The problem is pretty much the same, but for type 1 queries, we need to answer $$$C_i$$$ for arbitrary index $$$i$$$ instead of fixed index $$$n$$$. This means that the contribution multiplier is no longer fixed for one element in array $$$A$$$.

And the key is to play with summations again. But this time, since the difficulty stems from the index of queried $$$C_i$$$, let's try to make it independent from that of $$$A_j$$$:

$$$ C_i = \sum_{j = 1}^{i}{(i - j + 1) \cdot A_j} = (i + 1) \cdot \sum_{j = 1}^{i}{A_j} - \sum_{j = 1}^{i}{j \cdot A_j} $$$

Putting the modification aside, all we need to do is maintain the prefix sum $$$B_i = \sum_{j = 1}^{i}{A_j}$$$ and another prefix sum $$$B'_i = \sum_{j = 1}^{i}{j \cdot A_j}$$$, and then $$$C_i$$$ will be $$$(i + 1) \cdot B_i - B'_i$$$.

What about modifications? We simply maintain $$$B$$$ and $$$B'$$$ on two Binary Indexed Trees (or one, it's probably faster and better), and the modification becomes two point add operations.

Complexity: $$$O(n + q \log n)$$$ time, $$$O(n)$$$ space.

Note: It's possible to initialize a BIT in linear time. Read this blog for more information.

The code is very simple, so I would like to be lazy here and omit it.


2. Historic sums & maintaining it on data structures.

The problem goes like this:

Given two identical arrays $$$D$$$ and $$$E$$$ of length $$$n$$$. There are two types of queries:
1. Add $$$x$$$ to $$$D[l, r]$$$. After that, add $$$D_i$$$ to $$$E_i$$$ for all $$$i \in [1, n]$$$.
2. Print the sum of $$$E[l, r]$$$.

It's a bit hard, so let's solve an easier version where $$$n = 1$$$ first. Hint: It's almost the same as the not-so-simple problem.

Solution

There's another inportant property here: if we modified $$$A_p$$$, we will never ask for $$$C_q$$$ where $$$p > q$$$. So all the information we need can be maintained by a single long long.

And now moving on to the original problem with bigger $$$n$$$. It's obvious that we just need to use a segment tree to maintain $$$B_i$$$ and $$$B'_i$$$ for every cell in $$$E$$$, and the modifications will be two ordinary range add queries.

Another method of solving this problem is to maintain a tag of "how many modifications have passed after last pushed down the addition tag." It's more complicated but can also be used in more situations. I'm not going to explain it here because it's irrelevent the main idea of this blog, and if you want to learn about that, please read smax's blog.

For more sophisticated problems about historic sum, refer to jiry_2's blog.


3. Range Add Range Sum BIT

First, let's convert the problem into suffix add prefix sum.

Given an array $$$F$$$ of length $$$n$$$:
1. Add(p, x): Add $$$x$$$ to F_i for all $$$i \in [p, n]$$$.
2. Sum(p): Print $$$\sum_{i = 1}^{p}{F_i}$$$.

Define an array $$$G$$$ where $$$G_i = \sum_{j = 1}^{i}{F_j}$$$. For each query of type 1, it will contribute $$$(i - p + 1) \cdot x$$$ to $$$G_i$$$ for every $$$i >= p$$$. Doesn't it seem similar to the not-so-simple problem?

And voilà! We can do range add range sum with BIT! There's not much to explain, so here's the code.

Code

And that's it for this tutorial!

Special thanks to 8e7 for proofreading.

  • Vote: I like it
  • +203
  • Vote: I do not like it

»
3 years ago, # |
  Vote: I like it +5 Vote: I do not like it

Auto comment: topic has been updated by FHVirus (previous revision, new revision, compare).

»
3 years ago, # |
  Vote: I like it +24 Vote: I do not like it

Nice tutorial, it's well organized and easy to follow!

For the historic sums part, I described 'maintain a tag of "how many modifications have passed after last pushed down the addition tag'" in a blog I wrote a few months ago (shameless plug) along with a few examples if anyone's interested.

»
3 years ago, # |
Rev. 7   Vote: I like it +5 Vote: I do not like it

(sorry for bad English+this might be different with the explanation in this blog)

range sum range query with BIT explanation:

First, change the queries.

=> 1. add x to F_k,F_k+1,...F_N

=> 2. print F_1+F_2+...+F_k

The original queries can be processed with these queries.

Consider how much "sum of elements up to the mth element" increases for each m = 1, 2, ..., N when adding x to each k, k+1, ..., Nth element.

-if m<k, the sum of the elements does not change until the mth.

-otherwise, the sum of the elements up to the mth will increase (m-k+1)x.

this can be seen as a BIT which has a linear function f(m)=xm+(-k+1)x in its kth cell. Since the sum of two linear functions are still a linear function, you can create a BIT that stores the coefficients of the linear term in each cell, and another BIT which stores the constant term in each cell. In the tree created this way, another linear function is obtained by calculating the sum of elements-linear functions- from the first to the kth, and the calculated value by putting k into the variable of the linear function is the answer to the second query.

original blog of this comment(by jh05013)

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by FHVirus (previous revision, new revision, compare).

»
2 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

sorry i just read the second section and found out that i misread the first one and my solution is almost like his,