Hi everyone!
Recently I've published a blog about how one can multiply and divide sequences with Dirichlet convolution. In this blog, we will learn about a convenient framework to reason about them in a "coordinate-free" notation, similar to how generating functions are used to analyze sequences under the regular convolution.
- Part 1: Fast prefix sum computation
- Part 2: Dirichlet series and prime counting
We will learn how to deal with Dirichlet multiplication and division in the framework of Dirichlet series, and derive a number of well known number theoretic results from this perspective. While doing so, we will learn about Riemann zeta function and will have a glimpse into why it is so important in analyzing behavior of prime numbers.
We will also learn how Dirichlet series framework helps us to, given $$$g(1), \dots, g(n)$$$, to compute $$$f(1), \dots, f(n)$$$ in $$$O(n \log n)$$$ such that $$$g(n)$$$ is the Dirichlet product of $$$f(n)$$$ with itself, repeated $$$k$$$ times. Besides that, we will learn how to count prime numbers below $$$n$$$ in $$$O(n^{2/3})$$$ using logarithms on Dirichlet series.
Convolutions recap
Let $$$a_1, a_2, \dots$$$ and $$$b_1, b_2, \dots$$$ be two sequences. We define their Dirichlet convolution as
Before working with it, let's recall what we did with other convolutions. Assume, more generically, that we want to analyze
What we typically do to work with such convolutions is we introduce a family of functions $$$\rho_i$$$ such that
With these objects in mind, we can rewrite the expression above as
Note that we vaguely described $$$\rho_i$$$ as "functions" without going into detail about their type and/or domain. This is somewhat intentional, as specific applications may imply different specifications. Consider some standard examples (essentially taken from here).
Dirichlet series
Now, what should we use for $$$i \circ j = ij$$$, where $$$ij$$$ means simple multiplication? We already noticed that multiplying two exponents with the same base yields to summation in powers as $$$x^i x^j = x^{i+j}$$$. But what happens if we multiply two exponents with the same power? We get multiplication in bases, that is $$$x^s y^s = (xy)^s$$$. Thus, we define
where $$$s$$$ is a positive integer greater than $$$1$$$. Using such functions, we define the Dirichlet series of a sequence $$$a_1, a_2, \dots$$$ as
This allows us to rewrite Dirichlet convolution as
Note: We use $$$i^{-s}$$$ instead of $$$\rho_i(s) = i^s$$$ by convention, and so that Dirichlet series converges for integer $$$s > 1$$$.
With formal power series, it is common to write $$$[x^k] F(x)$$$ to denote the coefficient near $$$x^k$$$ of the expansion of $$$F(x)$$$. For notational convenience, we will also write $$$[n^{-s}] F(s)$$$ to denote the coefficient near $$$n^{-s}$$$ of the Dirichlet series.
Dirichlet power
Consider the following problem from this blog by jqdai0815:
102471C - Dirichlet $k$-th root. Given $$$g(1), \dots, g(n)$$$ and $$$k$$$, find $$$f(1), \dots, f(n)$$$ such that $$$\overbrace{(f*\dots*f)}^{k}(n) = g(n)$$$.
Solution. When talking about quickly raising functions to arbitrary powers, we generally have two options to proceed.
First option is to represent the power as $$$F^k(s) = \exp [k \log F(s)]$$$. Another option is to take the derivative and represent $$$G = F^k$$$ as
The later is then used to represent the coefficients of $$$Q(s)$$$ as a reasonably small recurrence. For Dirichlet series,
therefore the derivative of the full Dirichlet series rewrites as
Expanding both sides of $$$G' F = k G F'$$$ into Dirichlet product, we get
Since individual terms of Dirichlet convolution involve much less summands than regular convolution (and the number of summands amortizes to $$$O(n \log n)$$$ over all indices), we could use the identity above to find $$$f(n)$$$ from the values of $$$g(n)$$$ and smaller values of $$$f(n)$$$. Except for, we need to print exact answer modulo prime number... What do we do about it?
Instead of $$$\ln n$$$, we need to use some function that behaves reasonably similar to $$$\ln n$$$ modulo prime numbers. But what kind of function could it be? What properties of $$$\ln n$$$ do we actually rely on in this recurrence? Well, first thing to notice is that we don't really need $$$e$$$ to be the base of the logarithm, as we may multiply both sides by a constant to substitute it with some other base.
But that still won't do it. Instead, we should substitute the derivative with some other operation $$$D[\cdot]$$$, such that
for any positive integer number $$$k$$$, and $$$D[F(s)]$$$ is something with a simple form. By induction, any operation $$$D[\cdot]$$$ satisfying Leibniz rule
will do. This is, in fact, how derivatives are defined in abstract algebra (see also closely related arithmetic derivative).
Since standard derivative multiplied $$$f(n)$$$ with $$$-\ln n$$$, it's reasonable to assume that the function we're looking for also applies to Dirichlet series component-wise, multiplying each $$$f(n)$$$ with another function $$$h(n)$$$. Then, the product rule rewrites as
From this it follows that $$$h(n)$$$ could be any function such that $$$h(ab) = h(a) + h(b)$$$ for any positive integer $$$a, b$$$. This is it, this is the most important property of $$$\ln n$$$ in our context that we were looking for! Such functions are known as totally additive functions, and one natural example of it could be the prime omega function $$$\Omega(n)$$$, which is the number of prime divisors of $$$n$$$, counted with multiplicities.
Multiplicative functions
In number theory, first and most natural example of where Dirichlet series pop up are multiplicative functions. Let's see why.
A multiplicative function is a function $$$f : \mathbb Z_+ \mapsto \mathbb C$$$ such that $$$f(ab) = f(a) f(b)$$$ for any $$$a,b$$$ such that $$$\gcd(a,b)=1$$$.
From fundamental theorem of arithmetic it follows that multiplicative functions are fully defined by their values in powers of primes:
Therefore, when working with multiplicative functions, it is often convenient, for each prime number $$$p$$$ to consider the sequence
and sometimes even its ordinary generating function (also known as the Bell series):
as the Dirichlet product over just powers of primes corresponds to the regular product of such generating functions:
This connection re-expresses itself even more when treated through Dirichlet series framework. Let's, similarly to generating functions above, define the $$$p$$$-component of the multiplicative function's Dirichlet series $$$F_p(s)$$$ as
Using the notation of $$$p$$$-components, we can then find out that
In other words, the full Dirichlet series can be represented as an infinite product over all primes (so-called Euler product):
From this, for example, immediately follows that the Dirichlet convolution of multiplicative functions is a multiplicative function, as
This result is quite important, as it allows us to analyze Dirichlet series of multiplicative functions as infinite sums or infinite products of good-looking functions, depending on what properties are of higher interest to us.
Examples and their Dirichlet series
In this section, we consider some well-known examples of multiplicative functions (taken mostly from here) and how their Dirichlet series are computed. As the main purpose of this section is to get a grasp of various perspectives used to work with Dirichlet series, we will consider several possible derivations for each function.
The constant function $$$I(n)=1$$$. Since $$$I(p^k) = 1$$$, its $$$p$$$-components look like
Therefore, its Dirichlet series can be represented as
This particular Dirichlet series is also known as the Riemann zeta function. From the result above it follows that multiplying the Dirichlet series of a multiplicative function $$$f(n)$$$ with the Riemann zeta function, results into a multiplicative function $$$g(n)$$$ such that
The Möbius function. In number theory, one often wants to inverse such transform, that is, to find $$$f(n)$$$ from a known $$$g(n)$$$. As we already noticed above, $$$G(s) = \zeta(s) F(s)$$$, thus we can express $$$F(s)$$$ as $$$F(s) = \zeta^{-1}(s)G(s)$$$. If you're familiar with Möbius inversion, you already know what happens next. We need to find the function $$$\zeta^{-1}(s)$$$ explicitly, and we can do so as
If we fully expand it, we get
In the identity above, we introduced the function $$$\mu(n)$$$ such that
The function $$$\mu(n)$$$ is also known as the Möbius function. It is a multiplicative function, that can be defined in powers of primes as $$$\mu(p^k) = [k=0] - [k=1]$$$, where $$$[\cdot]$$$ is the Iverson bracket, which evaluates to $$$1$$$ if the expression inside is true and to $$$0$$$ otherwise.
From this, we also get the Möbius inversion formula:
The power function $$$f(n) = n^a$$$. Since $$$f(p^k) = p^{ka}$$$, its $$$p$$$-components look like
Thus, its Dirichlet series can be represented as
Of course, the same result can also be obtained directly as
The divisor function $$$\sigma_a(n) = \sum\limits_{d|n} d^a$$$. Since $$$\sigma_a(p^k)=\sum\limits_{i \leq k} p^{ia}$$$, its $$$p$$$-components look like
This, in turn, means that its Dirichlet series can be represented as
Another way to get the same result directly is as follows:
And yet another way to get it is to express $$$\sigma_a(n)$$$ as the Dirichlet convolution of $$$f(n) = 1$$$ and $$$g(n) = n^a$$$. The first one has Dirichlet series $$$\zeta(s)$$$, and the second one has Dirichlet series $$$\zeta(s-a)$$$.
The Euler's totient function $$$\varphi(n)$$$. Since $$$\varphi(p^k) = p^k - p^{k-1}$$$, its $$$p$$$-components look like
Multiplying it over all $$$p$$$, we get
Another way to derive it is to express $$$g(n) = n$$$ as the Dirichlet convolution of $$$\varphi(n)$$$ and $$$f(n)=1$$$:
as $$$\varphi\left(\frac{n}{d}\right)$$$ is the amount of numbers $$$k$$$ from $$$1$$$ to $$$n$$$ such that $$$\gcd(k, n) = d$$$. But the Dirichlet series of $$$g(n)$$$ is known to be $$$\zeta(s-1)$$$ and for $$$f(n)$$$ it's $$$\zeta(s)$$$, so this Dirichlet convolution, indeed, rewrites in terms of Dirichlet series as $$$\zeta(s) F(s) = \zeta(s-1)$$$.
Square-free indicator. Let $$$f(n)$$$ a function equals to $$$1$$$ if $$$n$$$ is square-free and to $$$0$$$ otherwise. Its $$$p$$$-components are
Thus, the Dirichlet series of $$$f(n)$$$ is found as
General power function. If $$$F(s)$$$ is the Dirichlet series of $$$f(n)$$$, then $$$F(s-a)$$$ is the Dirichlet series of $$$f(n) n^a$$$.
Totally multiplicative functions
A totally multiplicative function is a function $$$f : \mathbb Z_+ \to \mathbb C$$$, such that $$$f(a) f(b) = f(ab)$$$ for any $$$a, b$$$.
For a totally multiplicative functions, its Dirichlet series is even more restricted, as the function is fully determined by its values in primes:
and its $$$p$$$-components must look like
thus the Dirichlet series of a totally multiplicative function can be represented as
Noteworthy, expanding the inverse of such function gives us
In other words, the inverse $$$g(n)$$$ of a totally multiplicative function is $$$g(n) = \mu(n) f(n)$$$. In particular, for power function $$$f(n)=n^a$$$, which is totally multiplicative, it means that the Dirichlet series of its inverse is
Another example of a totally multiplicative function, which often occurs in a competitive programming context, is the Legendre symbol:
which can as well be defined explicitly as
It is totally multiplicative in its top argument, as
$$$\exp$$$, $$$\log$$$ and prime-counting
Library Judge — Counting Primes. Count the number of primes no more than $$$n \leq 10^{11}$$$.
This approach was originally shared by ecnerwala in this comment.
We can apply logarithms and exponents to Dirichlet series, as defined by their series representations:
Applying the logarithm series definition to $$$\zeta(s)$$$, we get
In other words,
That being said, if we compute prefix sums of a function with Dirichlet series $$$\log \zeta(s)$$$ up to $$$n$$$, we can then subtract corresponding counts over $$$p^2, p^3, \dots$$$ to get the prime-counting function $$$\pi(n)$$$. But how do we find the prefix sums of the logarithm of $$$\zeta(s)$$$?
We can use another series representation of the logarithm as
and plug in $$$\zeta(s)-1$$$ instead of $$$x$$$. Note that when $$$f(1)=0$$$, the values of $$$F(s)^k$$$ have a quickly vanishing prefix, as
Therefore, the prefix of $$$(\zeta(s)-1)^k$$$ consists entirely of zeros for $$$n < 2^k$$$. Thus, we're only interested in $$$F^k(s)$$$ for $$$k \leq \log_2 n$$$. To compute these powers, we can use the approach shared by ecnerwala in the comment to my previous blog about Dirichlet convolutions.
We can split the space of $$$xyz \leq n$$$ into $$$O(n^{2/3})$$$ rectanges $$$[x_0, x_1] \times [y_0, y_1] \times [z_0, z_1]$$$ and use such splitting to compute prefix sums over Dirichlet series $$$H(s) = F(s) G(s)$$$ or $$$F(s) = H(s) / G(s)$$$, assuming that the prefix sums of the series on the right-hand side are known. Doing it as is would take $$$O(n^{2/3} \log n)$$$, but we can speed up the whole computation. Let $$$t \approx n^{1/6}$$$, then
has an empty prefix of length $$$t$$$, and we will only need first $$$\small \frac{\log n}{\log t}$$$ values of $$$(\zeta_t(s)-1)^k$$$ instead of $$$\log n$$$ to compute the prefix sums of $$$\log \zeta_t(s)$$$ up to $$$n$$$. To find $$$\zeta_t(s)$$$ itself, we should use the fact that multiplying with $$$1-p^{-s}$$$ can be done in $$$O(\sqrt n)$$$, as after such multiplication $$$F(\lfloor n/k \rfloor)$$$ simply gets reduced by $$$F(\lfloor n / (kp) \rfloor)$$$.
After this, we can add the contributions of first $$$t$$$ terms back into logarithm to recover $$$\log \zeta(s)$$$ from $$$\log \zeta_t(s)$$$.
Note 1: As ecnerwala mentioned, this approach could more generally be used to sum up any totally multiplicative function over all primes up to $$$n$$$. Indeed, the logarithm of the Dirichlet series of a totally multiplicative function is
Thus, as we see,
Note 2: The approach above can also be used to compute prefix sums of logarithms and exponents of any Dirichlet series. Generally, we can represent any Dirichlet series as an infinite product over not necessarily prime numbers as
In his comment, ecnerwala calls $$$g(n)$$$ the pseudo-Euler transform of $$$f(n)$$$, in reference to Euler transform of integer sequences. Similarly to what we did with primes, we can find $$$f(1), \dots, f(t)$$$ for $$$t \approx n^{1/6}$$$ and divide $$$G(s)$$$ by corresponding fractions to get
so that we can find prefix sums of $$$G_t(s)$$$ in $$$O(n^{2/3})$$$, and then amend them with logarithms of fractions to get $$$\log G(s)$$$. To find $$$G_t(s)$$$ itself we, again, shout notice that, again, multiplying $$$G(s)$$$ with $$$1-f(t) t^{-s}$$$ can be done in $$$O(\sqrt n)$$$, as after such multiplication, the value $$$F(\lfloor n/k \rfloor)$$$ gets reduced by $$$f(t) F(\lfloor n / (kt) \rfloor)$$$.
Note 3: The logarithm $$$\log \zeta(s)$$$ connects the Riemann zeta function with the prime zeta function
This connection, in turn, allows to express the prime zeta function in terms of Riemann zeta function through generalized Möbius inversion:
See also
- Counting primes in $$$\tilde O(n^{2/3})$$$ — also the comment by ecnerwala on the approach through Dirichlet series.
- Lucy's Algorithm + Fenwick Trees — another approach to count primes up to $$$n$$$ in $$$O(n^{2/3} \log^{1/3} n)$$$.
Waiting for cum00 komment
Touch grass
Thanks to this blog I realized that https://codeforces.net/blog/entry/53925 is just the inverse of the dirichlet convolution with the identity function, thank you.
Thanks for the blog! As a side-note, the name of the zeta function is a reference to the inverse of the Mobius function in the framework of Mobius inversion on posets (whose special case is the number theoretical Mobius function) — the zeta and Mobius functions are inverses with respect to the Dirichlet convolution. It's nice to see that everything fits nicely across contexts. It's also interesting to note that any treatment of Dirichlet series can be transformed into a treatment of multivariate formal series (with countably many variables) — in that sense, it is possible to come up with a theory for posets (or at least lattices) analogous to that of Dirichlet series.
It might seem unrelated (and somewhat meta), but wherever there's a decent looking transformation, it is usually accompanied by structure that allows us to do things resembling Fourier analysis and apply generating functions. Here, the Mobius and zeta transforms are connected to the Mobius and zeta functions via the Dirichlet convolution being the "underlying" convolution that allows you to do Fourier analysis. Similarly, the Mobius function for posets is connected to Fourier analysis in a similar way — for instance, this paper.
Here's one connection between incidence algebras of posets (the underlying structures for things like Mobius inversion on posets) and Dirichlet series (following the treatment in this paper which is incidentally a great paper if you want to understand what generating functions are at their very core):
An incidence algebra is the set of functions mapping intervals of posets to some ring (usually $$$\mathbb{R}$$$ or $$$\mathbb{C}$$$, where addition and scalar multiplication are defined pointwise, and multiplication is defined as a convolution (think of the functions as a matrix with potentially missing entries, and multiplication as matrix multiplication but ignoring the missing elements). For example, the zeta function corresponds to the constant matrix $$$\zeta_{ij} = 1$$$ for $$$i \le j$$$, and the Mobius function corresponds to the inverse of this matrix.
A reduced incidence algebra of an incidence algebra (wrt an equivalence relation) is a subalgebra of the incidence algebra consisting solely of functions which have this special property: if intervals $$$[a, b]$$$ and $$$[c, d]$$$ in the underlying poset are equivalent under the equivalence relation, then the values of the function for these inputs is the same. There is a constraint on the equivalence relation that is needed — that it is compatible with the order (i.e., if $$$f, g$$$ are members of the incidence algebra that take the same values separately on each pair of equivalent intervals, then $$$f * g$$$ satisfies this property too), For instance, the identity, zeta and Mobius functions are all members of any reduced incidence algebra.
The idea is to show that the ring of Dirichlet series is isomorphic to a reduced incidence algebra of the divisibility poset (in fact, the ring of ordinary generating functions is isomorphic to a reduced incidence algebra of the linear poset — the natural numbers, and that of exponential generating functions corresponds to one of the subset poset, and the proof is very similar to the one below).
To do that, consider the following equivalence relation: $$$(a, b) \equiv (c, d)$$$ iff $$$a / b = c / d$$$ (note that $$$b / a$$$ is a positive integer for $$$[a, b]$$$ to be an interval in the divisibility poset). This is compatible with the natural partial order of the divisibility poset. The functions $$$f_k$$$ defined by $$$f_k(a, b) = 1$$$ if $$$b = ka$$$ and $$$0$$$ otherwise form a basis of the reduced incidence algebra arising from $$$\equiv$$$. It's easy to see that $$$f_k f_l = f_{kl}$$$, so considering the linear map that transforms $$$f_k$$$ to $$$1/k^t$$$ (which ends up mapping any member of this reduced incidence algebra to a Dirichlet series) gives us an isomorphism from this reduced incidence algebra to the ring of Dirichlet series.
Here's why this isomorphism is so important: it allows us to reason about convolutions in terms of combinatorial structures that arise naturally in the theory of posets. The incidence coefficients (that is, for equivalence classes $$$c_1, c_2, c_3$$$, the number $$$c_{c_1, c_2, c_3}$$$ of triples $$$(x, y, z)$$$ such that $$$[x, y], [y, z], [x, z]$$$ are all intervals with equivalence classes $$$c_2, c_3, c_1$$$) allow us to collapse the structure of the poset into a reduced incidence algebra in a Fourier-esque manner. In fact, the general claim is the following:
By the way, this is the fastest paper I found for prime-counting using combinatorial techniques; their algorithm takes $$$O(n^{2/3} / \log^2 n)$$$ time and $$$O(n^{1/3} \log^3 n \log \log n)$$$ memory.
I'll summarize a version which uses $$$O(n^{1/2})$$$ memory, phrased in terms of the Euler transform. Given a function $$$F$$$, and let its inverse pseudo-Euler transform be $$$G$$$. Then, given $$$F(\lfloor n / z\rfloor)$$$, this algorithm lets us compute $$$G(N)$$$ in $$$O(n^{2/3} / \log^2 n)$$$ time if $$$F$$$ is multiplicative, and $$$O(n^{2/3})$$$ time otherwise. Unfortunately it does not compute $$$G(\lfloor n / z \rfloor)$$$ for all $$$z$$$.
As usual, we will perform all operations on a "compressed representation" of Dirichlet series, storing $$$\sum_{i=1}^{\lfloor n / z \rfloor)} f(i)$$$ for all distinct values $$$\lfloor n / z \rfloor$$$. We'll use the notation that a lower-case letter (e.g. $$$f(i)$$$) represents the values of the Dirichlet series, while the upper-case letter (e.g. $$$F(i)$$$) represents the prefix sums; convolution is always on the underlying Dirichlet series.
First, we first sieve out all primes up to $$$n^{1/4}$$$ using a Fenwick tree, a la Lucy's algorithm. Concretely, for each $$$2 \le i \le n^{1/4}$$$ in order, multiply by $$$F$$$ by $$$1 - g(i)i^{-s}$$$. Using the square root trick, this takes $$$O(\sqrt{n / i})$$$ range-sum queries and point updates to $$$f$$$. If we maintain $$$F$$$'s compressed representation in a Fenwick tree/segment tree, this requires $$$O(\log n)$$$ time per operation, for a total of $$$O(\log n \sum_{i=2}^{n^{1/4}} \sqrt{n/i}) = O(n^{5/8} \log n)$$$ time. If we know $$$F$$$ is multiplicative, then $$$G$$$ is sparse and we only need to take $$$i$$$ prime (or prime power), so this step takes $$$O(n^{5/8})$$$ time.
Now, after we've sieved out all primes up to $$$n^{1/4}$$$, let the final resulting function be $$$F_3$$$. By definition,
Since all of the $a_j$ are at least $$$n^{1/4}$$$, we know that $$$k \le 3$$$. Thus,
Note that we only need values of $g$ up to $$$n^{3/4}$$$. We don't directly have the values of $$$g$$$ so we approximate $$$g(i) \approx f_3(i)$$$ (this is true for $$$i \le n^{1/2}$$$) and then do some PIE to compensate:
All of these terms can be computed in terms of prefix sums $F_3(\lfloor n / z \rfloor)$. The term with $$$xyz$$$ is the only term that takes more than $$$\omega(n^{1/2})$$$ time to compute, and takes $$$O(n^{2/3})$$$ time to enumerate all possible values of $$$x$$$ and $$$y$$$. If $$$F$$$ is multiplicative, then we only have to enumerate terms with both $$$x$$$ and $$$y$$$ prime, so this takes only $$$O(n^{2/3} / \log^2 n)$$$ time.
Thus this overall algorithm computes $$$G(n)$$$ in $$$O(n^{2/3} + n^{5/8} \log n)$$$ time for general $$$F$$$, and $$$O(n^{2/3} / log^2 n + n^{5/8})$$$ time for $$$F$$$ multiplicative.
Here's a quick list of differences from the direct $$$\log(F)$$$-based pseudo-inverse algorithm (note that the log-based algorithm takes $$$O(n^{2/3})$$$ time for general functions and $$$O(n^{2/3} / \log n)$$$ for multiplicative; it also does not work over general rings, because it requires dividing by small integers):
One follow-up: I'm not sure that using a segment-tree/Fenwick tree is optimal; seems like you should want something balanced a little differently due to this problem's structure. Also, $$$n^{5/8}$$$ is pretty much the same as $$$n^{2/3}$$$, so the constant factors on the Fenwick tree probably do hurt in practice.