this blog will talk about prefix sums in general , how to use them , nice tricks , etc ...
this blog was made to summarize HCPC-2025 training for prefix sums in Homs university / Syria.
overview
1. what is prefix sum?
2. get range sum in $$$O(1)$$$
3. find number of sub arrays with sum equal to $$$x$$$ (map trick)
4. even more prefix sums
lets start:
1. what is prefix sum?
prefix sum of an array a of size $$$n$$$ is another array of size $$$n+1$$$ as follows:
a(1) a(2) a(3) ... a(n)
p(0)=0 p(1)=a(1) p(2)=a(1)+a(2) p(3)=a(1)+a(2)+a(3) ... p(n)=a(1)+a(2)+...+a(n)
so in general $$$p(i)$$$ contains the sum of all elements up to $$$i$$$
and we introduce another element at the front representing the empty prefix
we can build prefix sum array in simple one line of code
for(int i=1;i<=n;i++)p[i]=p[i-1]+a[i];
this is the only info we care about (for now).
2.get range sum in $$$O(1)$$$
as we want the sum in range $$$[L,R]$$$
we realize that it is the same as $$$p(R)-p(L-1)$$$
a(L)+a(L+1)+...+a(R)=(a(1)+a(2)+...+a(L-1)+a(L)...+a(R))-(a(1)+a(2)+...+a(L-1)) = p(R)-p(L-1)
3. find the number of sub arrays with sum equal to $$$x$$$
given the array $$$a$$$ of size $$$n$$$
every sub array sum is represented by two prefix sums $$$p(L-1),p(R)$$$
first we initialize a map
map<int,int> cnt;
and loop through the array
lets say we are considering index $$$i$$$
at this point every prefix sum $$$p(j)$$$ such that $$$j<i$$$ is considered in the map
it is necessary to consider the empty prefix at first
cnt[0]=1;
and then at each index $$$i$$$
we do *something to get all sub arrays ending at index $$$i$$$ and have the sum equal to $$$x$$$ count and add it to the final answer
and then do
cnt[p[i]]++;
what to *do each time?
well we want the following equation $$$ p(i)-V=x $$$ when V is the value that we are searching
then $$$V=p(i)-x$$$
so we simply add $$$cnt[p[i]-x]$$$ to the final answer each time
note that $$$V=p(L-1)$$$ that's why we considered the empty prefix at the first place
there is a problem at CSES mentioning this trick
4. even more prefix sums
here i will mention other prefix stuff that might be useful to use later,
1.prefix xor
we can create an array containing the xor of each prefix, this way we can get the xor of any range
for(int i=1;i<=n;i++)p[i]=p[i-1]^a[i];
//xor of a range [L,R] is p[R]^p[L-1]
this is possible because
X^X=0 , A^B=B^A , X^0=X
2.prefix $$$mul$$$,$$$and$$$ , $$$or$$$ , $$$max$$$ , $$$min$$$ , $$$gcd$$$ and rarely $$$lcm$$$
these queries are sadly not usual to get in a range as we will need advanced data structures (aka sparse table) to get them in $$$O(1)$$$ (or $$$log(n)$$$ in case of gcd,lcm)
but you can get them in $$$log(n)$$$ complexity by using segment trees (out of topic but worth mentioning)
if we want $$$mul$$$ in range we wouldd need to make sure a lot of things are satisfied , like there should not be zero (as it will make all prefixes ahead zeros and we cant cancel them),and overflow cannot happen (which could mean a lot of things), in our case we will focus on $$$mul$$$ in range with $$$mod$$$ so we can get inverse mod in $$$O(log(n))$$$,or even $$$O(1)$$$ if the mod is small enough
it is also important to mention that we can get $$$(and,or)$$$ queries in range using simple prefix sum
by applying prefix sum on each bit as follows:
for(int j=0;j<30;j++){
for(int i=1;i<=n;i++){
p[j][i]=p[j][i-1]+((a[i]>>j)&1);
}
}
and then we can get $$$or$$$ or $$$and$$$ queries in a range $$$[L,R]$$$ as follows:
int AND(int L,int R){
int res=0;
for(int j=0;j<30;j++){
if(p[j][R]-p[j][L-1]==R-L+1)res+=1<<j;
}
return res;
}
int OR(int L,int R){
int res=0;
for(int j=0;j<30;j++){
if(p[j][R]-p[j][L-1]>0)res+=1<<j;
}
return res;
}
but in certain problems it is enough to get them in a prefix (or a suffix maybe?)
3. 2D prefix sum
the idea is we want to get sum in a rectangle in $$$O(1)$$$
given the 2D array $$$a$$$ of $$$n$$$ rows and $$$m$$$ columns
a(1,1) a(1,2) ... a(1,m)
a(2,1) a(2,2) ... a(2,m)
...
a(n,1) a(n,2) ... a(n,m)
we do a prefix sum on each row of $$$a$$$ first so that
p(i,j)=a(i,1)+a(i,2)+...+a(i,j)
and then we do another prefix sum on columns of $$$p$$$ so that
s(i,j)=p(1,j)+p(2,j)+...+p(i,j)
actually we don't need two more arrays for that, we can simply apply these operations on the same array twice, or one more array just to keep the original array.
the code is as follows:
for(int i=1;i<=n;i++)
for(int j=1;j<=m;j++)p[i][j]=a[i][j]+p[i][j-1];
for(int j=1;j<=m;j++)
for(int i=1;i<=n;i++)p[i][j]+=p[i-1][j];
we can get sum of elements in a rectangle with upper left corner at $$$(i,j)$$$ and with $$$x$$$ rows and $$$y$$$ columns like this
p[i+x-1][j+y-1]-p[i+x-1][j-1]-p[i-1][j+y-1]+p[i-1][j-1]
using inclusion exclusion principle (which is out of topic right now)
but i can explain why this is true
let the array $$$f$$$ represent how may times we added an element from the array $$$a$$$
first each $$$f(i,j)=0$$$
i will add each component to see how the frequency of elements change
notice that adding $$$p(i,j)$$$ represents increasing the value of $$$f(i1,j1)$$$ $$$i1<=i&&j1<=j$$$ by one
i=2,j=2,x=4,y=3
+p[i+x-1][j+y-1] -p[i+x-1][j-1] -p[i-1][j+y-1] +p[i-1][j-1]
0 0 0 0 0 1 1 1 1 0 0 1 1 1 0 -1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 1 1 1 0 0 1 1 1 0 0 1 1 1 0
0 0 0 0 0 1 1 1 1 0 0 1 1 1 0 0 1 1 1 0 0 1 1 1 0
0 0 0 0 0 1 1 1 1 0 0 1 1 1 0 0 1 1 1 0 0 1 1 1 0
0 0 0 0 0 1 1 1 1 0 0 1 1 1 0 0 1 1 1 0 0 1 1 1 0
4.sweep line
the idea is simple
given an array of zeros $$$a$$$
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
you can add $$$V$$$ to a range $$$[L,R]$$$ by adding $$$V$$$ to $$$a(L)$$$ and subtracting $$$V$$$ from $$$a(R+1)$$$
0 0 0 V 0 0 0 0 0 -V 0 0 0 0 0
and then applying prefix sum on the array
0 0 0 V V V V V V 0 0 0 0 0 0 0
well that is not useful when talking about one query , the thing is you can add multiple values to multiple ranges and then apply prefix sum , this is considered offline approach in multiple cases.
5. sweep line on a 2D array
adding V to rectangle with upper left corner at $$$(i,j)$$$ and with $$$x$$$ rows and $$$y$$$ columns
add V to both $$$(i,j)$$$ $$$(i+x,j+y)$$$
subtract V from both $$$(i+x,j)$$$ $$$(i,j+y)$$$
and then apply prefix sum 2D like we learned
i=2,j=2,x=2,y=3
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 V 0 0 -V 0 V V V 0
0 0 0 0 0 0 0 0 0 0 0 V V V 0
0 0 0 0 0 0 -V 0 0 V 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
you can actually extend this to 3D array (worth mentioning).