Notes on FFT / NTT, and the "ultimate" NTT with modulus >= 9 * 10^18

Revision en5, by Spheniscine, 2020-03-29 09:55:25

Prerequisites: you need to be familiar with both modular arithmetic and Fast Fourier Transform / number theoretic transform. The latter can be a rather advanced topic, but I personally found this article helpful. Note that there are some relatively easy optimizations/improvements that can be done, like precalculating the modular inverse of $$$n$$$, the "bit-reversed-indices" (can be done in $$$O(n)$$$ with DP), as well as the powers of $$$\omega_n$$$. Also useful is modifying it so that the input array length can be any power of two $$$\leq n$$$; some problems require multiplying polynomials of many different lengths, and you'd rather the runtime be $$$O(n \log n)$$$ over the sums of lengths, rather than (number of polynomials * maximum capacity).

Also helpful is knowing how to use NTT to solve problems modulo $$$998244353$$$, like 1096G - Счастливые билеты and 1251F - Красно-белый забор. Note that for some problems it's easier to think of FFT not as multiplying polynomials, but of finding multiset $$$C$$$ as the pairwise sums of multisets $$$A$$$ and $$$B$$$, in the form of arrays $$$A[i] =$$$ number of instances of $$$i$$$ in multiset $$$A$$$. This is equivalent to multiplying polynomials of the form $$$\sum _{i=0} ^n A[i]x^i$$$.

Note that $$$\omega_n$$$ can be easily found via the formula $$$g ^ {(m-1) / n} \ \text{ mod } m$$$, provided that:

  1. $$$m$$$ is prime
  2. $$$g$$$ is any primitive root modulo $$$m$$$. It is easiest to find this before hand and then hardcode it in the submission. You can either implement the algorithm yourself or use Wolfram Alpha to find it via the query PrimitiveRoot[m]. (Spoiler alert, $$$g = 3$$$ works well for $$$998244353$$$)
  3. $$$n$$$ divides $$$m-1$$$ evenly. As $$$n$$$ is typically rounded up to the nearest power of 2 for basic NTT implementations, this is easiest when $$$m$$$ is of the form $$${a \cdot 2^{k} + 1}$$$ where $$$2^k \geq n$$$. This is why $$$998244353$$$ is a commonly-appearing modulus; it's $$$119 \cdot 2^{23} + 1$$$. Note that this modulus also appears in many problems that don't require FFT/NTT; this is a deliberate "crying-wolf" strategy employed by puzzle writers, so that you can't recognize immediately that a problem requires FFT/NTT via the given modulus.

Now onto the main topic, the "ultimate" NTT.

Rationale: There are a few problems like 993E - Никита и порядковая статистика that require FFT, however the results aren't output with any modulus, and indeed may exceed the range of a 32-bit integer type. There are several usual solutions for solving this:

  1. Do NTT with two moduli and restoring the result via Chinese Remainder Theorem. This has several prominent disadvantages:
    1. Slow, as the NTT routine has to be done twice.
    2. Complicated setup, as several suitable moduli has to be found, and their primitive roots calculated
    3. Restoring the result with CRT requires either brute force or multiplications modulo $$$pq$$$, which may overflow even 64-bit integer types.
  2. Do FFT with complex numbers and floating point types. Disadvantages are:
    1. Could be slow due to heavy floating-point arithmetic. Additionally, JVM-based languages (Java, Kotlin, Scala) suffer complications here, as representing complex numbers with object-based tuples adds a significant overhead.
    2. Limited precision due to rounding error. Typically the problems are constructed such that it won't be a problem if care is taken in the implementation, but won't it be nice to just not to have to worry about it?

To solve these problems, I propose the "ultimate" NTT solution — just use one huge modulus. The one I use is $$$9223372036737335297 = 54975513881 \cdot 2^{24} + 1$$$. This is just over a hundred million less than $$$2^{63} - 1$$$, the maximum value of a signed 64-bit integer.

However, this obviously raises the issue of how to safely do modular arithmetic with such huge integers.

Tags fft, ntt, modular arithmetic

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en29 English Spheniscine 2022-10-02 02:28:33 1 Tiny change: '7 = 54975513881 \cdo' -> '7 = 549755813881 \cdo'
en28 English Spheniscine 2021-12-24 05:54:46 36
en27 English Spheniscine 2021-12-24 05:52:05 2 Tiny change: 'se (I use ($2^{16}$) in these ' -> 'se (I use $2^{16}$ in these '
en26 English Spheniscine 2021-12-24 05:51:30 2420 Tiny change: ' is a $192$nd root of u' -> ' is a $192\text{nd}$ root of u'
en25 English Spheniscine 2020-05-27 11:41:09 77
en24 English Spheniscine 2020-03-31 10:16:10 4 Tiny change: 'oiler>\n\nUpdate: I have fu' -> 'oiler>\n\n**Update:** I have fu'
en23 English Spheniscine 2020-03-31 09:25:52 254
en22 English Spheniscine 2020-03-29 17:07:51 1 Tiny change: 'at $2^k > m$. $k = 63' -> 'at $2^k > n$. $k = 63'
en21 English Spheniscine 2020-03-29 17:05:27 4
en20 English Spheniscine 2020-03-29 13:29:25 4 Tiny change: 'for basic NTT implemen' -> 'for basic FFT implemen'
en19 English Spheniscine 2020-03-29 12:55:03 74
en18 English Spheniscine 2020-03-29 12:50:51 615 Tiny change: 'deforces?) $a^2n$ in' -> 'deforces?). $a^2n$ in'
en17 English Spheniscine 2020-03-29 12:28:13 237
en16 English Spheniscine 2020-03-29 11:43:47 4 Tiny change: 'and restoring the resul' -> 'and restore the resul'
en15 English Spheniscine 2020-03-29 11:42:44 27 Tiny change: 'tions for solving this:\n\n1. D' -> 'tions for these types of problems:\n\n1. D'
en14 English Spheniscine 2020-03-29 11:39:56 1
en13 English Spheniscine 2020-03-29 11:25:24 1 Tiny change: 'm >>> 62))) * MOD -' -> 'm >>> 62)) * MOD -'
en12 English Spheniscine 2020-03-29 11:23:06 6 Tiny change: ', b: Long) {\n xh' -> ', b: Long): Long {\n xh'
en11 English Spheniscine 2020-03-29 11:19:34 6 Tiny change: 'le moduli has to be found,' -> 'le moduli must be found,'
en10 English Spheniscine 2020-03-29 11:17:17 586 Tiny change: 'em:993E]\n[submiss' -> 'em:993E]\n\n[submiss' (published)
en9 English Spheniscine 2020-03-29 11:07:23 1751 Tiny change: 'rac{xr}{4^63} \right' -> 'rac{xr}{4^{63} \right'
en8 English Spheniscine 2020-03-29 10:40:31 1482 Tiny change: 'age:\n public' -> 'age:\n public'
en7 English Spheniscine 2020-03-29 10:17:38 962
en6 English Spheniscine 2020-03-29 10:02:51 351 Tiny change: '2^{24} + 1$. This is' -> '2^{24} + 1, g = 3$. This is'
en5 English Spheniscine 2020-03-29 09:55:25 382 Tiny change: 'e problems I propose' -> 'e problems, I propose'
en4 English Spheniscine 2020-03-29 09:48:48 795 Tiny change: 'he result requires ' -> 'he result with CRT requires '
en3 English Spheniscine 2020-03-29 09:38:21 493 Tiny change: 'ating the "bit-' -> 'ating the modular inverse of $n$, the "bit-'
en2 English Spheniscine 2020-03-29 09:27:37 17 Tiny change: '(m-1) / n}$, provide' -> '(m-1) / n} \text{ mod } m$, provide'
en1 English Spheniscine 2020-03-29 09:26:30 2329 Initial revision (saved to drafts)