As many people may have noticed, I've done quite a number of TL hacks on 2000G - Call During the Journey. The hacked submissions even include many masters' and grandmasters' solutions, and most of them don't even have suspicious running time on the original tests. I predict many more solutions will fail on system testing because I've checked only some of C++20 and C++17 submissions and haven't tried running all the hack cases. So, how did they get time limit exceeded?
They're actually caused by a few different mistakes, and we need different strategies to hack each of them.
For convenience I'll assume that $$$\mathcal{O}(n)$$$ = the number of nodes = the number of edges, because whether the factor becomes $$$n$$$ or $$$m$$$ can vary depending on the implementation, and the number of edges is likely to be at least the number of nodes. Also for convenience (and because this problem is like that) I'll assume the graph is bidirectional, but everything works the same for directed graphs (they're just a bit harder to slow down due to smaller constants).
For reference, you can also read my old article in Korean, but the content may be quite different.
1. Revisiting the same node
This is the most common mistake in Dijkstra implementations. By revisiting, I mean that the algorithm checks all the adjacent edges of the same node multiple times. There are a few well-known ways to prevent this:
- Check that the distance popped from the `priority_queue` matches the value currently stored in the distance array.
- Keep a visited array and mark each node when it is taken from the queue for the first time.
- Use a `std::set` instead of a `std::priority_queue`, and don't allow duplicated nodes in the set by removing the old element before inserting the updated one. (Personally I don't recommend this because... `std::set` is slow...)
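As a sketch, the first method looks something like this (the function and variable names here are illustrative, not taken from any submission):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Sketch of the stale-entry check: adj[u] holds (neighbor, weight) pairs.
vector<long long> dijkstra(int n, const vector<vector<pair<int, long long>>>& adj, int src) {
    const long long INF = numeric_limits<long long>::max() / 4;
    vector<long long> dist(n, INF);
    // min-heap of (distance, node) pairs
    priority_queue<pair<long long, int>,
                   vector<pair<long long, int>>, greater<>> pq;
    dist[src] = 0;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d != dist[u]) continue;  // stale entry: skip, so u's edges are scanned once
        for (auto [v, w] : adj[u]) {
            if (dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});
            }
        }
    }
    return dist;
}
```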
But if for some reason you did none of them, the problem occurs. When we use a `priority_queue` for the Dijkstra algorithm, we usually push the same node into the queue whenever its distance is updated. This means that in the worst case we can have $$$\mathcal{O}(n)$$$ entries for the same node in the queue. This by itself is okay, but what happens if you check all the edges attached to that node every time? It can have $$$\mathcal{O}(n)$$$ edges, so the algorithm will check a total of $$$\mathcal{O}(n^2)$$$ edges.
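To see the blow-up concretely, here is a hypothetical helper (not anyone's submission) with the stale-entry check deliberately missing, instrumented to count how many edges get scanned in total:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Hypothetical helper: Dijkstra WITHOUT any stale-entry or visited check,
// counting total edge scans. The answers it computes are still correct;
// only the running time blows up on adversarial inputs.
long long countEdgeScans(int n, const vector<vector<pair<int, long long>>>& adj, int src) {
    vector<long long> dist(n, numeric_limits<long long>::max() / 4);
    priority_queue<pair<long long, int>,
                   vector<pair<long long, int>>, greater<>> pq;
    dist[src] = 0;
    pq.push({0, src});
    long long scans = 0;
    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        (void)d;  // bug preserved on purpose: no `if (d != dist[u]) continue;`
        for (auto [v, w] : adj[u]) {
            scans++;
            if (dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});
            }
        }
    }
    return scans;
}
```

On a graph where one high-degree node is re-queued $$$k$$$ times, the scan count is roughly $$$k^2$$$ even though the graph only has $$$\mathcal{O}(k)$$$ edges.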
So the way to hack it is to force the algorithm to update the same node $$$\mathcal{O}(n)$$$ times, while the updated node also has $$$\mathcal{O}(n)$$$ edges. Here is a simple way to do it:
- $$$i$$$-weighted edges: $$$1$$$ <-> $$$2$$$, $$$1$$$ <-> $$$3$$$, $$$1$$$ <-> $$$4$$$, ..., $$$1$$$ <-> $$$n-2$$$
- $$$inf - 2i$$$-weighted edges: $$$2$$$ <-> $$$n-1$$$, $$$3$$$ <-> $$$n-1$$$, $$$4$$$ <-> $$$n-1$$$, ..., $$$n-2$$$ <-> $$$n-1$$$
where $$$i$$$ is the number of the node (other than nodes $$$1$$$ and $$$n-1$$$). This makes nodes $$$2$$$ through $$$n-2$$$ get visited in order, while each of them updates node $$$n-1$$$ with a slightly smaller distance every time. Therefore, node $$$n-1$$$ will be dequeued $$$\mathcal{O}(n)$$$ times, and each time it will check its $$$\mathcal{O}(n)$$$ edges (although none of them will actually update any node's distance), resulting in $$$\mathcal{O}(n^2)$$$ in total.
The edge to node $$$n$$$ can be anywhere, with weight $$$inf$$$. It just needs to be visited at the very end.
Hacks: https://codeforces.net/contest/2000/hacks/1067650 and https://codeforces.net/contest/2000/hacks/1068253 . Two versions were required because some people started the Dijkstra algorithm from node $$$1$$$ while others started from $$$n$$$.
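The construction above can be sketched as a generator. This returns an edge list only; the `inf` parameter and the 1-based labels follow the description above, everything else is illustrative:

```cpp
#include <bits/stdc++.h>
using namespace std;

struct Edge { int u, v; long long w; };

// Sketch of the anti-"revisit" construction for n nodes (1-based).
// `inf` is illustrative; pick it to match the problem's weight limits.
vector<Edge> genRevisitHack(int n, long long inf) {
    vector<Edge> edges;
    // star: edge 1 <-> i with weight i, for i = 2..n-2
    for (int i = 2; i <= n - 2; i++)
        edges.push_back({1, i, i});
    // back edges: i <-> n-1 with weight inf - 2i, so visiting node i
    // improves dist[n-1] to inf - i, slightly smaller every time
    for (int i = 2; i <= n - 2; i++)
        edges.push_back({i, n - 1, inf - 2LL * i});
    // node n is reached last through one inf-weight edge
    edges.push_back({n - 1, n, inf});
    return edges;
}
```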
2. Visiting furthest nodes first
It's also a common mistake to make the priority queue keep the furthest node at the top instead of the closest one. Although it looks ridiculous, random tests aren't very good at hacking these either. You can see how my intended solution 276237524 ran in 890 ms while the reversed-operator version 276320086 ran in 577 ms on the original tests.
To hack this, we want to make a long chain of nodes and make that chain get updated many times. Let's say we have a chain of nodes $$$A$$$ — $$$B$$$ — $$$C$$$ — $$$D$$$ — $$$E$$$ connected with edges of weight $$$1$$$. We also add edges from $$$1$$$ to each of them, in increasing order of weight with a weight difference of $$$2$$$. Then, with the reversed operator, $$$E$$$ will be visited first; but when $$$D$$$ is visited from $$$1$$$ it will also revisit $$$E$$$; when $$$C$$$ is visited it will revisit both $$$D$$$ and $$$E$$$; visiting $$$B$$$ revisits all of $$$C$$$, $$$D$$$, $$$E$$$; and visiting $$$A$$$ revisits the whole $$$ABCDE$$$ chain. Generalized, we get a chain of $$$\mathcal{O}(n)$$$ nodes, each of them visited $$$\mathcal{O}(n)$$$ times, taking $$$\mathcal{O}(n^2)$$$ time in total.
Hack: https://codeforces.net/contest/2000/hacks/1069602
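This chain construction can also be sketched as a generator (edge list only; the 1-based labeling and the step-of-2 weights match the description above, the rest is illustrative):

```cpp
#include <bits/stdc++.h>
using namespace std;

struct ChainEdge { int u, v; long long w; };

// Sketch of the anti-"reversed comparator" chain for n nodes (1-based):
// nodes 2..n form the weight-1 chain, and node 1 connects to each chain
// node with weights increasing in steps of 2, so the direct distance of
// every chain node can still be improved by 1 through the chain.
vector<ChainEdge> genChainHack(int n) {
    vector<ChainEdge> edges;
    for (int i = 2; i < n; i++)
        edges.push_back({i, i + 1, 1});          // the chain
    for (int i = 2; i <= n; i++)
        edges.push_back({1, i, 2LL * (i - 1)});  // direct edges, step 2
    return edges;
}
```

Sanity check: node $$$i$$$ has direct distance $$$2(i-1)$$$, while going through node $$$i-1$$$ gives $$$2(i-2)+1 = 2(i-1)-1$$$, exactly one smaller.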
3. Visiting in order of node number
This also happens a lot, because many people put a `std::pair` of $$$(node, distance)$$$ into the `priority_queue`. Because the comparator of `std::pair` compares `first` first, by default the node with the larger number is popped first. If one used `greater<pair<int, int>>`, then the smaller node number comes first instead. In either case, it's still hard to hack with just random tests. See 276321227 and 276321937.
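A tiny illustration of why the stored order of the pair matters (both helpers are hypothetical, made up for this example):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Pairs stored as (node, distance): the default max-heap pops the
// largest node number first, regardless of distance.
int firstPopped_wrongOrder(vector<pair<int, long long>> nodeDist) {
    priority_queue<pair<int, long long>> pq;
    for (auto& p : nodeDist) pq.push(p);
    return pq.top().first;  // a node number, chosen by node number(!)
}

// Pairs stored as (distance, node) with greater<>: a min-heap by
// distance, which is what Dijkstra actually needs.
int firstPopped_rightOrder(vector<pair<long long, int>> distNode) {
    priority_queue<pair<long long, int>,
                   vector<pair<long long, int>>, greater<>> pq;
    for (auto& p : distNode) pq.push(p);
    return pq.top().second;  // the node with the smallest distance
}
```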
The basic idea for hacking these is the same as in section 2, except that the visit order is now driven by node number instead of distance, while we control the intended visit order so that the other nodes' distances keep getting updated.
Hacks: https://codeforces.net/contest/2000/hacks/1069643 and https://codeforces.net/contest/2000/hacks/1069602 . I used a slightly different pattern here, but I guess it's weaker than the other one.
4. Not large enough infinity value
It's also common that the value used to represent infinity isn't large enough. Although this was well prevented in this problem, in many cases it's easy to miss. For instance, many graph problems have a much tighter constraint on edge weights than on the possible answer, and if the tests are purely random, the largest answer will very likely remain tiny compared to the limit. Therefore, the tests need at least one linear (path) graph where all edges have the maximum weight, to check whether the solution's $$$inf$$$ value is large enough.
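A sketch of the arithmetic, with illustrative limits (say up to $$$2 \cdot 10^5$$$ edges of weight up to $$$10^9$$$, which are assumptions for this example, not this problem's constraints):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Illustrative worst case: a path of `edges` edges, each of weight `maxW`,
// gives the largest possible shortest-path distance.
long long worstPathLen(long long edges, long long maxW) {
    return edges * maxW;
}

// A safe 64-bit infinity: comfortably above any real distance, yet
// SAFE_INF + SAFE_INF still cannot overflow a signed 64-bit integer.
const long long SAFE_INF = numeric_limits<long long>::max() / 4;
```

With these limits, the worst distance is $$$2 \cdot 10^{14}$$$, which overflows 32-bit `int` and dwarfs the popular `int` infinity `0x3f3f3f3f` (about $$$1.06 \cdot 10^9$$$).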
To problem writers who are planning to make Dijkstra problems...
Dijkstra algorithm is a basic but very interesting subject, and many people would love to create wonderful problems based on it. However, as shown in this blog, making strong tests against wrong implementations of it requires deep consideration of what can actually be mistaken, and relying on large $$$n$$$ and random tests will most likely fail to prevent them from getting AC. I'd like to urge those who want to make Dijkstra problems to take some time to write various wrong implementations and see if your tests can actually make them fail.
bruh <(")
Thanks for your useful blog. (bruh this blog describes me on div2)
Great blog!
you're legend
This guy hacks a lot! He once hacked me too, and now I've added him as a friend :))
This guy hacks a lot! He never hacked me nor tried to, and now I pray he won't target me :))
Thank you for this blog, I got hacked by you lol
Mother**cker (** = ha, not what you thought, dummy)
hi. how to hack unordered maps/sets? thanks for the blog!
I have a plan to write a separate blog investigating hacks against `unordered` structures, but for now you can refer to https://codeforces.net/blog/entry/62393 .
thanks. you are a legend!
Thanks, this was very helpful as it is really difficult to make tests against bad Dijkstra implementation.
If it revisits the same node, it might be SPFA, but SPFA is $$$O(n^2)$$$ anyway...
Check if the distance information took from the priority_queue matches the information written in the distance array.
Can someone explain how this helps in not revisiting the same node?
Normally in a Dijkstra implementation, a node is only re-inserted when the algorithm finds a shorter distance for it, so no two queue entries for the same node carry the same distance.
Also, due to how the priority queue works, the first time you pop that node, its queued distance is exactly the best value (the one stored in the array).
If it doesn't match, it means you are looking at an "older" update (one made obsolete by information you have already processed), so it's safe to ignore it.