TheScrasse's blog

By TheScrasse, history, 21 month(s) ago, In English

Hello everyone,

yesterday I hacked the following submission: 195599340. It wasn't trivial for me (I found it more difficult than solving the problem itself), but it doesn't require any weird mathematical knowledge. So, if you've never hacked before, you may want to try to hack that submission by yourself. Anyway, I wanted to share the hack as a tutorial (since I have not found similar blogs on Codeforces). Here is the "solution":

Solution

Let's read the submission. It calculates two hashes (stored in val[x]). We would like to generate a collision (= two non-isomorphic subtrees that have the same hashes). Let's notice some details:

  • The hash is deterministic (it doesn't rely on random variables), so it's potentially vulnerable.
  • The hash of small trees looks "manageable" (= it doesn't grow very fast), so it may be possible to find a collision just using small trees.

So, let's start printing the hashes of small trees. Actually it doesn't work (all those hashes are distinct). However, we can try merging those trees. In particular, if we attach a subtree with hash val[x] to node r, val[r] becomes {val[r].first * val[x].first * treeSize[x], val[r].second + (val[x].second * (treeSize[x] + 1))} (in modulo).

Now we want to attach subtrees to two nodes r, s, in such a way that their hashes become the same. So, we are interested in products of $$$a_i =$$$ val[x].first * treeSize[x] and sums of $$$b_i =$$$ val[x].second * (treeSize[x] + 1). Let's print their values for small trees $$$T_i$$$:

  • $$$T_1 = \{(1, 2)\}$$$, $$$a_1 = 6$$$, $$$b_1 = 4$$$
  • $$$T_2 = \{(1, 2), (1, 3)\}$$$, $$$a_2 = 16$$$, $$$b_2 = 9$$$
  • $$$T_3 = \{(1, 2), (2, 3)\}$$$, $$$a_3 = 24$$$, $$$b_3 = 15$$$
  • $$$T_4 = \{(1, 2), (1, 3), (1, 4)\}$$$, $$$a_4 = 40$$$, $$$b_4 = 16$$$
  • $$$T_5 = \{(1, 2), (1, 3), (2, 4)\}$$$, $$$a_5 = 60$$$, $$$b_5 = 24$$$
  • $$$T_6 = \{(1, 2), (2, 3), (2, 4)\}$$$, $$$a_6 = 80$$$, $$$b_6 = 40$$$
  • $$$T_7 = \{(1, 2), (2, 3), (3, 4)\}$$$, $$$a_7 = 120$$$, $$$b_7 = 64$$$

Let's find two distinct multisets of trees with equal $$$\prod a_i$$$ and $$$\sum b_i$$$. That's equivalent to finding non-zero coefficients $$$e_i$$$ such that $$$\prod a_i^{e_i} = 1$$$ and $$$\sum b_ie_i = 0$$$ (if $$$e_i$$$ is positive, it contributes to a multiset; if it's negative, it contributes to the other multiset).

$$$\prod a_i^{e_i} = 1$$$ if the multiplicity of each prime factor is $$$0$$$. For example, we can get rid of factors $$$\neq 2$$$ by enforcing $$$e_3 = -e_1$$$, $$$e_6 = -e_4$$$, $$$e_7 = -e_5$$$ (so, we get $$$a_1^{e_1} \cdot a_3^{e_3} = (a_3/a_1)^{e_3} = 2^{2e_3}$$$, etc.).

Now we have only $$$2$$$ equations (one for the multiplicity of $$$2$$$, one for the sum), and more than $$$2$$$ unknowns, so we can find a non-zero solution. One example is $$$[16, 0, -16, -69, 37, 69, -37]$$$. The total number of nodes is smaller than $$$2 \cdot 10^5$$$, so we have found a hack.

Conclusions

In the next div3 / educational round, it's up to you to hack such submissions!

Bonus

Can you hack 195587728? It's still deterministic, but the hash function seems stronger. Maybe we can generate random trees with $$$> 10^9$$$ nodes in total and hope that a collision happens (birthday paradox), but I have not tried.

  • Vote: I like it
  • +116
  • Vote: I do not like it

| Write comment?
»
21 month(s) ago, # |
  Vote: I like it -15 Vote: I do not like it

please dont hack hashes and delete this post

»
21 month(s) ago, # |
Rev. 2   Vote: I like it +5 Vote: I do not like it

How to hack a hash with random seed depends on the system timestamp?

Spoiler
»
21 month(s) ago, # |
  Vote: I like it 0 Vote: I do not like it

This reminds me of a contest:

  • During the contest:$$$mod 2^{64}$$$ string hash got WA and I didn't know what was wrong.
  • $$$5$$$ minutes after the contest:Changed mod to $$$998244353$$$ and got AC immediately :) .