Subset convolution interpretation

Revision en1, by adamant, 2021-06-24 14:57:47

Hi everyone!

Recently aryanc403 brought up a topic of subset convolution and some operations related to it.

This inspired me to write this blog entry as existing explanations on how it works seemed unintuitive for me. I believe that having viable interpretations of how things work is of extreme importance as it greatly simplifies understanding and allows us to reproduce some results without lust learning them by heart.

Also this approach allows us to generalize subset convolution to sum over $$$i \cup j = k$$$ and $$$|i \cap j|=l$$$, while in competitive programming we usually only do it for $$$|i \cap j|=0$$$. Enjoy the reading!

Subset convolution $$$a \circ b$$$ is defined as follows, let $$$a=(a_0, ..., a_{2^n-1})$$$ and $$$b=(b_0, ..., b_{2^n-1})$$$. Then

\begin{equation} (a \star b)_k = \sum\limits_{i\cup k=k} a_i b_{k \setminus i} \end{equation}

That is, $$$k$$$-th term of this convolution is a sum over all disjoint partitions of $$$k$$$ into two submasks.

Generic approach to compute such kinds of sums is to make a transform which maps initial arrays into some space where convolution is done via pointwise (Hadamard) product $$$(a \circ b)_k = a_k b_k$$$. That is, to find some invertible transform $$$F$$$ such that

\begin{equation} F(a \star b) = F(a) \circ F(b) \end{equation}

These kinds of convolutions can be interpreted as a product in groupoid ring. Let $$$x$$$ and $$$y$$$ be integers from $$$0$$$ to $$$2^n-1$$$. We define

\begin{equation} x \star y = \begin{cases} x \cup y,&\text{if x $$$\cap$$$ y=0},\newline 0,&\text{otherwise} \end{cases} \end{equation}

Now if we consider formal sums $$$a=\sum\limits_{g=0}^{2^n-1} a_g g$$$ and $$$b=\sum\limits_{g=0}^{2^n-1} b_g g$$$, their product over groupoid ring is defined as

\begin{equation} a \star b = \sum\limits_{i=0}^{2^n-1} \sum\limits_{j=0}^{2^n-1} a_i b_j (i \star j) \end{equation}

What would be extremely nice for us here is to introduce formal variable $$$x$$$ such that $$$x^{i}x^j=x^{i \star j}$$$. Then these convolutions can be interpreted as polynomial products where polynomials are taken over these weird variable. To compute this product, a common trick is to instead compute it in some specific values of $$$x$$$, do the component-wise product and interpolate it back to polynomial to find the answer.

And since we work with a bitwise operation, we will use $$$x^i=x_1^{i_1} x_2^{i_2} \dots x_n^{i_n}$$$, where $$$i_k$$$ is $$$0$$$ or $$$1$$$ depending on the $$$k$$$-th bit of $$$x^i$$$. Then

\begin{equation} x^i x^j = (x_1^{i_1} x_2^{i_2} \dots x_n^{i_n}) (x_1^{j_1} x_2^{j_2} \dots x_n^{j_n}) = x_1^{i_1 \star j_1} x_2^{i_2 \star j_2} \dots x_n^{i_n \star j_n} = x^{i \star j} \end{equation}

Now before working with subset convolution, let's take a step back and recall what we're doing in "or" and "xor" convolutions. For "or" convolution we would need $$$x^0 x^0 = 0$$$, $$$x^0 x^1 = x^1$$$ and $$$x^1 x^1 = x^1$$$. What are the actual numbers we can think of for which such identities hold? There are only two such numbers, they're $$$x=0$$$ and $$$x=1$$$. So, to compute the "or" convolution is equivalent to compute the values of the multivariate polynomial in all the points of {$$$0, 1$$$}$$$^n$$$ boolean cube.

And what about "xor" convolution? For it we need $$$x^0 x^0 = 0$$$, $$$x^0 x^1 = x^1$$$ and $$$x^1 x^1 = x^0$$$. What are numbers for which these identities hold? It is $$$x=-1$$$ and $$$x=1$$$. So, the "xor" convolution can be interpreted through computing values in {$$$-1, 1$$$}$$$^n$$$ cube.

Now comes the "subset" convolution. We need some value such that $$$x^0 x^0 = x^0$$$, $$$x^0 x^1 = x^1$$$ and $$$x^1 x^1 = 0$$$. Woah, there is only one such number, $$$x=0$$$. One value is not enough for us to uniquely recover the polynomial. Now, what do we do if we desperately need an element with specific properties but we do not have one?

Correct, we formally introduce it and pretend that it actually exists! Remember $$$i$$$ which was introduced with $$$i^2 = -1$$$. Now here we formally introduce element $$$\varepsilon$$$ such that $$$\varepsilon^2=0$$$ but $$$\varepsilon \neq 0$$$ and treat all elements of extended ring as $$$a+b\varepsilon$$$ with corresponding addition and multiplication. What was just introduced here is the concept of dual numbers.

It would almost work as it is possible to compute the values of a polynomial in {$$$0,\varepsilon$$$}$$$^n$$$. The problem begins when we need to inverse the transform as it requires division by $$$\varepsilon$$$ which is not doable as it doesn't have an inverse. Here comes another trick. What do we do if we desperately need to compute something modulo prime number $$$p$$$, but in the process it can happen that we may need to divide by $$$p$$$? What if we at the same time certainly know that such division will occur at most once?

We calculate everything modulo $$$p^2$$$ and when the division is needed, we just divide it directly! And if we need to divide at most $$$k$$$ times, we may compute everything modulo $$$p^{k+1}$$$. What we know here is that while computing values in $$$\varepsilon$$$, we will multiply by it every time there is a bit in the number, same goes for dividing by $$$\varepsilon$$$ during the inverse calculations: we store elements of kind $$$a_0 + a_1 \varepsilon + a_2 \varepsilon^2 + \dots + a_k \varepsilon^k$$$ to make sure that when time comes we are able to divide them by $$$\varepsilon$$$ at most $$$k$$$ times (in our case, $$$k$$$ is the number of bits we use, that is $$$n$$$).

That being said, subset convolution also has a nice intuitive interpretation which is nothing more than computing the values of multivariate polynomial in {$$$0, \varepsilon$$$}$$$^n$$$ cube in the ring of polynomials over $$$\varepsilon$$$ as a formal variable.

And, as a sweet bonus, after the inverse convolution we will actually obtain some polynomials over $$$\varepsilon$$$ instead of simply numbers. Their "free" terms will correspond to $$$(a \star b)_k$$$, but what's the meaning of other terms? As it turns out, \begin{equation} [\varepsilon^l](a \star b)_k = \sum\limits_{i \cup j = k,\newline |i\cap j|=l} a_i b_j \end{equation} So after the inverse transform we obtain grouping not only by $$$i \cup j$$$, but also by the size of $$$i \cap j$$$. And it in fact makes much sense, as if $$$x_1, \dots, x_n$$$ are all either $$$0$$$ or $$$\varepsilon$$$ then for the product of two monomials holds:

\begin{equation} (x_1^{i_1} x_2^{i_2} \dots x_n^{i_n}) (x_{1}^{j_1} x_{2}^{j_2} \dots x_{n}^{j_n}) = \varepsilon^{(i_1 \wedge j_1) + \dots + (i_n \wedge j_n)}x_1^{i_1 \vee j_1} x_2^{i_2 \vee j_2} \dots x_n^{i_n \vee j_n} = \varepsilon^{|i \cap j|} x^{i \cup j} \end{equation}

Because if $$$i_k=j_k=1$$$ for $$$x_k=0$$$, the product will be $$$0$$$ and the exponent of $$$\varepsilon$$$ doesn't matter, otherwise it will both go to $$$i \cup j$$$ and $$$i \cap j$$$ which corresponds to $$$\varepsilon^2$$$ which we obtain from multiplying $$$x_k^{i_k} x_k^{j_k}$$$. In this way, as promised earlier, we make a convolution over both $$$i \cup j = k$$$ and $$$|i \cap j| = l$$$ and doing this we somewhat justify the additional $$$n$$$ factor in the complexity.

Tags tutorial, convolution, polynomial interpolation, chinese remainder theo., crt

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en22 English adamant 2021-06-27 20:18:03 262 omg, repetitive
en21 English adamant 2021-06-27 16:00:46 21 clarification
en20 English adamant 2021-06-27 15:58:42 6 subset convolution is star, not circ
en19 English adamant 2021-06-27 15:56:57 774 specific examples
en18 English adamant 2021-06-27 15:46:18 1748 algebraic approach
en17 English adamant 2021-06-25 21:59:04 605 also analogue of WHT
en16 English adamant 2021-06-25 12:09:07 182 update with light out sol
en15 English adamant 2021-06-25 04:46:51 33 non-empty sets here
en14 English adamant 2021-06-25 04:04:38 935 clarification
en13 English adamant 2021-06-25 02:02:55 43 a link to the article about poly exponent
en12 English adamant 2021-06-25 02:00:47 948 adding paragraph about computing functions in values
en11 English adamant 2021-06-24 18:21:32 4 math mode
en10 English adamant 2021-06-24 17:47:28 127 fix
en9 English adamant 2021-06-24 17:39:53 2 typo...
en8 English adamant 2021-06-24 17:25:49 2 x^0 x^0 = x^0, not 0
en7 English adamant 2021-06-24 17:24:53 2 typo
en6 English adamant 2021-06-24 15:37:43 2 update
en5 English adamant 2021-06-24 15:35:41 100 update
en4 English adamant 2021-06-24 15:27:16 9 fix
en3 English adamant 2021-06-24 15:26:47 378 cleanup
en2 English adamant 2021-06-24 14:59:20 23
en1 English adamant 2021-06-24 14:57:47 7157 Initial revision (published)