On today's POI training camp round I have learnt a nice technique that could possibly be useful in some number theory problems. I couldn't find any CF article on it, so I think it's fair enough to share it on my own.
Remark on used notation
In some sums I will use an Iverson notation
Problem: Squarefree Function
Let's define a Squarefree Function $$$f(x)$$$ for any positive integer $$$x$$$ as $$$x$$$ divided by a greatest perfect square, that divides $$$x$$$.
For example: $$$f(1) = 1$$$, $$$f(2) = 2$$$, $$$f(4) = 1$$$, $$$f(6) = 6$$$, $$$f(27) = 3$$$, $$$f(54) = 6$$$, $$$f(800) = 2$$$
Given an array $$$a$$$ of $$$n \leq 10^5$$$ positive integers, where each $$$a_i \leq 10^5$$$ compute sum
\begin{gather} \sum\limits_{1 \leq i,j \leq n}f(a_i\cdot a_j) \mod (10^9 + 7) \end{gather}
Technique: GCD Convolution
You might probably heard about a Sum Convolution. For two arrays $$$b$$$, and $$$c$$$, it is defined as an array $$$d$$$ such that \begin{gather} d_k = \sum\limits_{i + j = k}b_i\cdot c_j \end{gather} If not, it's basically the same thing as a polynomial multiplication. If $$$B(x) = b_0 + b_1x + b_2x^2 + ... + b_nx^n$$$, and $$$C(x) = c_0 + c_1x + c_2x^2 + ... + c_nx^n$$$, then $$$(B \cdot C)(x) = d_0 + d_1x + d_2x^2 + ... + d_{2n}x^{2n}$$$
Let's define GCD Convolution by analogy
Definition
A GCD Convolution of two arrays $$$b$$$, and $$$c$$$, consisting of positive integers, is an array $$$d$$$ such that \begin{gather} d_k = \sum\limits_{gcd(i,j) = k}b_i\cdot c_j \end{gather}
Algorithm to find GCD Convolution
Of course, we can compute it naively by iterating over all pairs of indicies. If $$$b$$$ and $$$c$$$ consists of $$$n$$$ elements then the overall complexity would be $$$O(n^2log(max(b) + max(c)))$$$. But it turns out, that we can do better.
Let's look at the sum of $$$d_k$$$ values, with indicies divisible by some integer $$$g$$$, so that $$$k = gm$$$ is satisfied for some integer m. \begin{gather} \sum\limits_{m=1}^{n/g}d_{gm} = \sum\limits_{m=1}^{n/g}\sum\limits_{gcd(i,j) = gm}b_i\cdot c_j = \sum\limits_{g | gcd(i,j)}b_i\cdot c_j \end{gather}
From the definition of gcd, we know that $$$g | gcd(i,j) \Leftrightarrow g | i \wedge g | j$$$ \begin{gather} \sum\limits_{g | gcd(i,j)}b_i\cdot c_j = \sum\limits_{i,j}b_i\cdot c_j[g \;|\; gcd(i,j)] = \sum\limits_{i,j}b_i\cdot c_j[g \;|\; i][g \;|\; j] = \end{gather} \begin{gather} =\sum\limits_{i,j}\left(b_i[g \;|\; i]\right)\cdot \left(c_j[g \;|\; j]\right) = \left(\sum\limits_{g|i}b_i\right)\left(\sum\limits_{g|j}c_j\right) \end{gather}
We can define $$$B_g = \sum_{m=1}^{n/g}b_{gm}$$$, and $$$C_g = \sum_{m=1}^{n/g}c_{gm}$$$, and $$$D_g = \sum_{m=1}^{n/g}d_{gm}$$$. From above equation one could easily derive $$$D_g = B_g\cdot C_g$$$. Knowing that $$$O(n + \frac{n}{2} + \frac{n}{3} + ...) = O(n\log n)$$$, arrays $$$B$$$ and $$$C$$$ can be computed directly from their definitions in $$$O(n\log n)$$$.
Recovering a $$$d_k$$$ values from D array is simple. All we need is just subtract all the summands of $$$D_i$$$ except for the smallest. So, formally, we have \begin{gather} d_k = D_k - \sum\limits_{m=2}^{n/k}d_{km} \end{gather} Which can be computed using dynamic programming, starting from $$$k = n$$$.
So, the overall complexity of computing a GCD Convolution of two arrays of size $$$n$$$ is $$$O(n\log n)$$$.
Back to original problem
We can see, that \begin{gather} f(a_i\cdot a_j) = \frac{f(a_i)\cdot f(a_j)}{gcd(f(a_i), f(a_j))^2} \end{gather}
So, having an array $$$w_{f(a_i)} = \sum\limits_if(a_i)$$$ all we need is just to compute a GCD Convolution of $$$w$$$ with itself. Let's denote that convolution by $$$d$$$. Then, by definition \begin{gather} \sum\limits_{i,j :\;gcd(f(a_i), f(a_j)) = k} \frac{f(a_i)\cdot f(a_j)}{gcd(f(a_i), f(a_j))^2} = \frac{d_k}{k^2} \end{gather}
So answer to our problem is just a sum \begin{gather} \sum\limits_{k=1}^{max(f(a_i))}\frac{d_k}{k^2} \end{gather}
Assuming that we have computed $$$f(a_i)$$$ values with sieve, if we denote $$$A = max(a_i)$$$, then overall complexity of this solution is $$$O(n + A\log A)$$$
Practice problems
Actually, I don't have any. I will be glad if you share some problems in comments. All I have is just this:
Auto comment: topic has been updated by szaranczuk (previous revision, new revision, compare).
Nice! You can optimize it to $$$O(n \log \log n)$$$ using dp sos.
$$$f(x) \leq x$$$, hence including $$$FlogF$$$ in final complexity is quite redundant.
thanks for comment, I've downvoted accidentally ofc
Auto comment: topic has been updated by szaranczuk (previous revision, new revision, compare).
I also find this technique very interesting and was surprised that it is not well known. Regarding practice problems: you can solve all of the example problems from: https://codeforces.net/blog/entry/53925 and the problem http://poj.org/problem?id=3904 very easily with the gcd-convolution. Also you can submit problem from the blog here: https://codeforces.net/gym/103688/problem/E (only difference is that the sum is over $$$i<j$$$)
I think the convolution share the same idea with FWT(using High-dimensional prefix sum to optimize)
That's you can regard the prime number as the digit in the high-dimensional, and $$$gcd(p^x, p^y)=p^{\min(x,y)}$$$ then it's the prefix sum.Anyway, it's a nice expansion.
This kind of convolution is very closely related to Mobius inversion on posets. More specifically, in this case, you look at the zeta transform of the array under the poset formed by the indices according to the divisor lattice. You can generalize this technique to other lattices as well.
Problem suggestion: https://www.codechef.com/CDUN2022/problems/YETGCD
The intended solution uses ETF and some clever rearrangement of equations, but nor found a more straightforward solution using gcd convolution + sieve.
Here's a sketch of how you can use gcd convolution here:
Let $$$f_i = \sum_{j = 1}^i \gcd(i, j)$$$ for $$$1 \le i \le n$$$. $$$f$$$ is multiplicative, so you can compute it using a sieve fast enough. Let $$$g$$$ be the gcd convolution of $$$f$$$ with itself. Then the answer is $$$\sum_{i = 1}^n i \cdot g_i$$$.
Similar problem: https://atcoder.jp/contests/agc038/tasks/agc038_c
this isn't correct solution, but can solve today's div2D.