Hi everyone!
As I continue working through Project Euler, I want to write a blog about another piece of general mathematical knowledge that is both interesting on its own and might be useful in some problems. Consider the following Diophantine equation:
We assume that we're given a specific number $$$z \in \mathbb Z$$$, and we need to check if there are $$$x, y \in \mathbb Z$$$ for which the identity above holds. Then, if such numbers exist we should find them and report. Example of a problem that might need it:
Timus — 1593. Square Country. Version 2. You're given an integer $$$n$$$. Find minimum $$$k$$$ such that $$$n = a_1^2+\dots+a_k^2$$$.
Tl;dr.
Let $$$z = 2^{k_0} p_1^{k_1} \dots p_n^{k_n} p_{n+1}^{k_{n+1}} \dots p_m^{k_m}$$$, where $$$p_1, \dots, p_n$$$ are different prime numbers with remainder $$$3$$$ modulo $$$4$$$, and $$$p_{n+1}, \dots, p_m$$$ are different prime numbers with remainder $$$1$$$ modulo $$$4$$$. Then there are two cases. If any of $$$k_{1}, \dots, k_n$$$ is odd, there are no solutions. Otherwise there is always a solution $$$z = x^2 + y^2$$$ that looks like
where $$$i^2=-1$$$ and $$$x_k^2+y_k^2 = p_k^2$$$ for $$$k$$$ from $$$n+1$$$ to $$$m$$$. For each $$$p_k$$$, to find such $$$x_k, y_k$$$ we need to find an integer $$$i$$$ such that $$$i^2 \equiv -1 \pmod{p}$$$, then find a minimum $$$x_k = i y_k \bmod p_k$$$ for $$$1 \leq y_k < \sqrt {p_k}$$$. This is doable in $$$O(\log p_k)$$$.
And if we want to count solutions, their number is given by Jakobi's two-square theorem: The number of ways of representing $$$z$$$ as the sum of two squares is $$$4(d_1(z) - d_3(z))$$$, where $$$d_k(z)$$$ is the number of divisors of $$$z$$$ that have remainder $$$k$$$ modulo $$$4$$$.
From exact equation to $$$\bmod z$$$
First of all, let's solve a bit easier problem. One obvious thing we can do is to take remainder modulo $$$z$$$ on both parts:
This relaxed version is equivalent to finding $$$x, y, k \in \mathbb Z$$$ such that
Hmmm... Remainders modulo arbitrary number $$$z$$$ is not the most pleasant thing to work on directly. But remainders modulo prime number $$$p$$$ are usually nice. On the other hand, if $$$p$$$ is some prime factor of $$$z$$$ and there is a solution for $$$z$$$, it means that there will as well be a solution for $$$p$$$ with $$$k=\frac{z}{p}$$$. So, let's assume $$$z$$$ to be prime, for now.
From arbitrary $$$z$$$ to prime $$$p$$$
Now, we have another equation
where $$$p$$$ is a prime number. What's good about prime numbers is that remainders modulo prime numbers form a field (i.e. they work very similarly to rationals, and we can expect similar results to hold). For $$$p=2$$$, there is a non-trivial solution $$$x=y=1$$$. What about odd numbers $$$p$$$? There are two cases to consider, as the remainder of $$$p$$$ modulo $$$4$$$ is either $$$1$$$ or $$$-1$$$.
Fermat's theorem on sums of two squares tells us that for an odd prime $$$p$$$, the solution exists if and only if $$$p$$$ has a remainder $$$1$$$ modulo $$$4$$$. Moreover, the sum of two squares theorem tells us that the number $$$z$$$ is expressible as a sum of two squares if and only if its prime decomposition does not have a term $$$p^k$$$, where $$$p \equiv -1 \pmod 4$$$, and $$$k$$$ is odd. Let's find out why.
$$$p \equiv 1 \pmod 4$$$
Of course, it's not yet clear why these two cases are important. Let's assume that there is an integer $$$i$$$ such that
that is there is a remainder modulo $$$p$$$ which behaves very similarly to imaginary unit from complex numbers. Then
This reduces the initial equation to a union of linear equations
For each $$$y$$$, except $$$y=0$$$, there are $$$2$$$ possible values of $$$x = \pm iy$$$, so there are a total of $$$2p+1$$$ solutions. Noteworthy, it is always possible to find a pair of solutions $$$(x,y)$$$ such that $$$1 \leq x, y < \sqrt p$$$, which means that $$$x^2 + y^2 = p$$$ is satisfied exactly.
How to find it? Find $$$i$$$, and consider the minimum value of $$$ik\bmod p$$$ among $$$1 \leq k < \sqrt p$$$. Due to pigeonhole principle, there will be $$$k_1 \neq k_2$$$ such that $$$i (k_1 - k_2) \bmod p \leq \sqrt p$$$. This is actually very similar to 102354I - From Modular to Rational!
Now, when does such $$$i$$$ exist and how to find it? It is known that remainders modulo $$$p$$$ have a primitive root $$$g$$$ such that its powers from $$$0$$$ to $$$p-2$$$ run through all possible remainders modulo $$$p$$$. Note that for odd $$$p$$$ it always holds that
Then, if such $$$i$$$ exists we should be able to find it from
Well, technically $$$-g^{\frac{p-1}{4}}$$$ also can be used as $$$i$$$, but it's not that important. What's important is that it is possible to do as above only when $$$p-1$$$ is divisible by $$$4$$$. In other words, when $$$p \equiv 1 \pmod 4$$$.
$$$p \equiv -1 \pmod 4$$$
Now, let's think about the other case. If there is no such $$$i$$$, we can introduce it! Now, we can formally consider numbers that look like $$$x+iy$$$, where $$$i$$$ is not a remainder modulo $$$p$$$. Numbers of this kind, if treated formally, also form a field. If you're familiar with field theory, I should mention that it is isomorphic to the Galois field $$$GF(p^2)$$$. If you're not familiar with it, ignore what I just wrote.
The thing now is that we can try to find all solutions in this new, extended field. And it reduces to the same union of equations
so for every $$$y$$$, the only possible solutions are $$$x = \pm iy$$$. The problem is, this time such $$$x$$$ would not be a remainder modulo $$$p$$$, unless $$$y=0$$$. Instead, it will be an "imaginary" solution. So, the only "real" solution is $$$x \equiv y \equiv 0 \pmod p$$$. It means that all solutions to
look like $$$x = px'$$$ and $$$y=py'$$$. Thus,
So, if $$$k$$$ is not divisible by $$$p$$$, there are no solutions. Otherwise $$$k=pk'$$$ reduces it to
after which similar argument could be applied. So, if $$$k'$$$ is divisible by an odd power of such $$$p$$$, there are no solutions. We're only one step away from solving the whole $$$x^2+y^2=z$$$ problem now, assuming that we know the factorization of $$$z$$$.
Back to arbitrary $$$z$$$
Now we need to use one more fact from complex numbers. There, we can introduce a norm
Its crucial property for this task is that it is multiplicative, that is
This gives the Brahmagupta–Fibonacci identity
from which it follows that if we can represent $$$z$$$ as a product of several several numbers that are expressible as a sum of two squares, we can use the identity above to also express $$$z$$$ as a sum of two squares. In complex number terms, it means that we will find a complex number $$$x+iy$$$ such that $$$\|x+iy\| = z$$$.
Repeating what's written in tl'dr section, let $$$z = 2^{k_0} p_1^{k_1} \dots p_n^{k_n} p_{n+1}^{k_{n+1}} \dots p_m^{k_m}$$$, where $$$p_1, \dots, p_n$$$ are different prime numbers with remainder $$$3$$$ modulo $$$4$$$, and $$$p_{n+1}, \dots, p_m$$$ are different prime numbers with remainder $$$1$$$ modulo $$$4$$$. Then there are two cases. If any of $$$k_{1}, \dots, k_n$$$ is odd, there are no solutions. Otherwise there is always a solution $$$z = x^2 + y^2$$$ that looks like
where $$$i^2=-1$$$ and $$$x_k^2+y_k^2 = p_k^2$$$ for $$$k$$$ from $$$n+1$$$ to $$$m$$$. This result is tightly connected to the following ones:
Classification of Gaussian primes. A Gaussian integer $$$a+ib$$$ is prime if and only if either of the following is true:
- $$$a=0$$$ or $$$b=0$$$ and the non-zero number is prime with remainder $$$3$$$ modulo $$$4$$$,
- $$$a^2 + b^2$$$ is $$$2$$$ or a prime number with remainder $$$1$$$ modulo $$$4$$$.
The result also allows to factor Gaussian integers by representing their norm as a sum of two squares in the way suggested above.
Jakobi's two-square theorem. The number of ways of representing $$$z$$$ as the sum of two squares is $$$4(d_1(z) - d_3(z))$$$, where $$$d_k(z)$$$ is the number of divisors of $$$z$$$ that have remainder $$$k$$$ modulo $$$4$$$.
If there is a prime divisor with remainder $$$3$$$ mod $$$4$$$, it's $$$0$$$. Otherwise, it is $$$4$$$ times the number of divisors of $$$p_{n+1}^{k_{n+1}} \dots p_m^{k_m}$$$. We may interpret it that the divisor decides how much of multipliers in the product would correspond to $$$x_{k} + iy_k$$$, and how much to $$$y_k + i x_k$$$, after which $$$4$$$ accounts for all the possible ways to multiply $$$x+iy$$$ with $$$1$$$, $$$-1$$$, $$$i$$$ or $$$-i$$$, which accounts for all possible combinations of $$$(\pm x, \pm y)$$$.
And, of course, there are similar results for $$$3$$$ squares and $$$4$$$ squares:
what is i % p ?
Any solution to $$$i^2 \equiv -1 \pmod p$$$. It can be found as $$$i \equiv g^{\frac{p-1}{4}} \pmod p$$$, where $$$g$$$ is a primitive root of $$$p$$$, if $$$p = 4n+1$$$. Otherwise, $$$i$$$ does not exist among remainders modulo $$$p$$$.
I proposed a trivial version of this problem on codechef once.
I got stuck on problem 66 "Diophantine equation" for a very long time two years ago, because I thought you could solve it without googling for anything, as almost all of the first project euler problems. Definitely couldn't :/. Good topic.
I think another blog is more appropriate for problem 66 :)
For those interested:
Task related to two/three/four-square theorems: BOJ 17633
Harder Task: BOJ 17646
That's very interesting! I wonder what's the construction for 3 and 4 squares is. Would it be sufficient to take one of squares at random until you get a number that factorizes into smaller number of squares?
Construction for 1 square is trivial I will go over rest.
Minimum number of squares needed is 4 : As we know a number cannot be represented as a sum of 3 squares if $$$4^a\left(8b+7\right) = n$$$ for any non-negative integer $$$a,b$$$ so a number that can be expressed as $$$8b+6$$$ has a way to express it as a sum of three powers. Assuming we can find a solution to $$$8b+6=x^2+y^2+z^2$$$ our answer will be $$$\left(x\cdot 2^a\right)^2+\left(y\cdot 2^a\right)^2+\left(z\cdot 2^a\right)^2+\left(2^a\right)^2$$$. Our next step will be finding the solution for 3 squares.
Minimum number of squares needed is 3 : We can just use randomization like you mentioned above. In my testing generating a random number less than $$$10^{18}$$$ and than generating random squares until we find a number such that we can represent our original number minus the square as sum of two squares it took less than 100 random squares generated for $$$10^{5}$$$ tests I ran.
I am very curious if there is a solution without randomization.
I and Vipin were discussing about Pythagorean triplets yesterday. I remembered a problem where we had to find the number of integer solutions of $$$a + b^2 = c^2$$$ till $$$10^5$$$. Does anyone remember where this problem has appeared?
Not exactly $$$a + b^2 = c^2$$$, but $$$a^2 + b^2 = c^2$$$ appeared HERE.
Similar problem: https://codeforces.net/contest/1487/problem/D
Bro how is your math so good? Did you prepare for some math Olympiad? I am really intimidated by your blogs.
Please don't be :)
I don't have that much of Math preparation, really. I just studied some basics in university, and I enjoy reading and/or writing about some stuff I find interesting from time to time.
Thanks for another beautiful article!
I have a question regarding implementation part: in tl;dr you wrote
For each $$$p_k$$$, to find such $$$x_k, y_k$$$ we need to find an integer $$$i$$$ such that $$$i^2 \equiv -1 \pmod{p_k}$$$, then find a minimum $$$x_k = i y_k \pmod{p_k}$$$ for $$$1 \le y_k < \sqrt{p_k}$$$. This is doable in $$$\mathcal{O}(\log p_k)$$$
How do you suggest to implement it in the given complexity?
I think I have a way to do it, but it is rather different from yours, so I am very interested, what did you mean.
1) First of all, I would struggle to find a primitive root. According to wikipedia, picking random number and verifying, is it a primitive root or not, takes $$$\mathcal{O}(\log \log p \cdot \log p)$$$ time: $$$\frac{p}{\phi(p - 1)} = \mathcal{O}(\log \log p)$$$.
On the other side, being primitive root is too strong property: for $$$p = 4k + 1$$$ consider any element $$$x \in \mathbb{Z}_p^*$$$. If $$$4 | ord(x)$$$, then $$$x^{\frac{ord(x)}{4}} = \pm i$$$ and $$$\mathbb{Z}_p^*$$$ contains at least $$$\frac{p - 1}{2} = \mathcal{O}(p)$$$ elements of that type (any odd power of a fixed primitive root fits). To avoid direct calculating of $$$ord(x)$$$ one can choose $$$i = x^{\frac{p - 1}{2^k}}$$$ for some $$$k$$$. Therefore, we can find $$$i$$$ with non-deterministic $$$\mathcal{O}(\log p)$$$.
2) However, I am far more interested in this, second, point. Any way of finding primitive root seems to be fast enough in practice.
How do you find such $$$y_k$$$ that minimizes $$$x_k$$$?
I can propose a different way: let $$$a \in \mathbb{Z}$$$ be an element from previous item: $$$a^2 \equiv -1 \pmod{p}$$$. Then one can find $$$gcd(a + i, p)$$$ in gaussian integers $$$\mathbb{Z}[i]$$$. It can be shown that $$$\parallel gcd(a + i, p)\parallel= p$$$:
and you can find $gcd$ in gaussian integers, since it's a euclidian domain. Moreover, each time you replace $$$(a, b) \rightsquigarrow (b, a \% b)$$$ the norm of second term at least halves, so $$$gcd$$$ workd in $$$\mathcal{O}(\log p)$$$.
For 1), I'd do it in standard way to find primitive root. But I like that your idea avoids computing $$$\operatorname{ord}(x)$$$, which requires knowing the factorization of $$$p-1$$$. On the other hand, raising random numbers to $$$\frac{p-1}{4}$$$ is fine too, as they're likely to be primitive roots.
For 2), I expected kind of same solution as to Library Judge — Min of Mod of Linear. There are several ways to do it, you can e.g. find minimum element with a Euclidean-like process, or with continued fractions (see the solution to 102354I - From Modular to Rational there).
But I like your solution more, it is an unexpected result for me that $$$\|\gcd(a+i, p)\| = p$$$.
Thank you, definitely would read the tutorials you provided.