A summary of what passes for an Indian ICPC Contest, apparently

By shiven, 2024-11-27

The online round for India’s ICPC was held on the 23rd. And also a week before that. The first time around, the organisers tried debuting a self-hosted DOMjudge instance. That didn’t go too well, so a recontest was scheduled on CodeChef.

If you’re not too familiar with the Indian ICPC system: this online prelims selects the teams that get to participate in the offline regionals. From a single university, usually around two teams (for the top universities) or one (for almost every university outside the top 15) qualify.

As you could guess, it’s quite easy for good teams to miss a spot. Looking at the first contest’s rankings, teams from rank 2 to 100+ all had 4 solves. And owing to severe system issues in that contest, the time penalties weren’t reliable differentiators either, which is what led to the recontest this Saturday.

In hindsight, we forgot to be grateful for the problemset and reliable testcases the first time. I’m sure everyone agrees that the basic requirements (ignoring everything else, like problem quality, balance, …) for an online contest would be:

  1. A functioning judge
  2. Correct, validated testcases
  3. Testcases strong enough to weed out obviously wrong solutions. (Note the obviously: as we’ll soon see, the official solution shared by the contest admin isn’t even polynomial time. Of course, it passes all cases with $$$N=3000$$$.)
  4. Problems that can’t be easily Googled or insta-GPTed.

The first contest lacked (1), while the re-contest failed at (2), (3), and (4).

However, the organisers seem to believe that the re-contest was a fair metric just because CodeChef didn’t break, so here’s a compilation of what went wrong. Maybe they’ll re-evaluate their decisions; maybe they’ll realise that we deserve better. Regardless, even as part of a team that qualifies for both regionals from either version of the prelims, the experience was too poor to pass as a standard for ICPC.

Disclaimer: This is just a list of issues that I sourced by talking to the handful of teams at my university. Since the official Telegram group was blocked by the organisers following the re-contest (with a final message requesting to ‘trust the process’), there wasn’t an avenue to discuss the details. Hopefully this blog becomes one, because I’m sure that I’ve just discovered a small subset of issues. Additionally, all submissions are still hidden so we can’t figure out any more issues by looking at accepted code.

The re-contest

It had 5 problems, available here.

4, for all practical purposes, since the first was a cakewalk (and insta-GPTable). It was also easily Googleable, being the exact same as another CodeChef problem. And, as will be a recurring theme here, the official solution is easily hacked. Just to point out the obvious: whenever this happens, it means the in-contest testcases don’t account for it, because the solution written by the contest admin would agree with (or be) the model solution.

Now for the remaining problems.

Small Indices

Here’s a summary of (a subset of) what ACs for this problem. A reminder that $$$N=3000$$$.

  • GPT writes an obvious $$$O(N^3)$$$ DP. Works.
  • $$$O(N)$$$ greedy. Wrong, ofc. But works.
  • $$$O(N)$$$ DP. Wrong, ofc. Also works.
  • $$$O(2^{\log^2 N}) \approx 2^{121}$$$ brute force, written by the contest admin and finalised as the official solution.

Let’s talk more about this last one.

This video shows the original problem had $$$N=10^5$$$, presumably with some incorrect model solution. The admin then ACed with a recursive brute force of the above non-polynomial complexity. They probably discussed this, realised that the earlier faster solution was wrong, and also (wrongly) agreed that the brute force runs in $$$O(N^2)$$$. So they reduced the constraints, intending $$$O(N^2)$$$ solutions to pass.

Somehow, the end result is that neither does the model brute-force solution run in time, nor do the testcases enforce any time complexity at all (given that the model solution was itself guilty of being the slowest).

To add to that, the solution is presented in a way that pretends that the complexity is obviously correct.

The claim basically is: ‘if the recursion depth is $$$N$$$ and any single chain branches off at most $$$K$$$ times, the tree has at most $$$N \cdot 2^K$$$ nodes’. There is no attempt to justify this; in fact, it is not even stated explicitly anywhere. Just assumed. The video goes: ‘maybe there are not too many states, let’s try’. It passes their own bad testcases (remember, still with $$$N=10^5$$$), and the problem was thus finalised. I’m surprised that this was not a huge red flag for the setting team.
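To put numbers on it (my own arithmetic, taking the stated bound at face value): the $$$2^{\log^2 N}$$$ figure presumably comes from $$$K \approx \lfloor \log_2 N \rfloor^2$$$ branchings, and with $$$N = 3000$$$ that is $$$11^2 = 121$$$, so the bound is $$$N \cdot 2^{121} \approx 10^{40}$$$ nodes. Even the bare $$$2^{121} \approx 2.7 \times 10^{36}$$$ sits some 28 orders of magnitude beyond the $$$\sim 10^8$$$ operations a time limit allows, let alone the intended $$$O(N^2) \approx 10^7$$$.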

We quite easily hacked it, and Igen reported it in detail. The response? A band-aid of caching on top, and a prompt ‘can you now find a counter-case?’ Come on :\ Can we now get an acknowledgement of how much was messed up in this problem? Can the setters now admit that a contest needs a provably correct solution? (On a side note, our team ACed with a DP on $$$N^2 \log N$$$ states. It seems correct to us, but given that the testcases apparently let anything slide, our word is as good as anyone’s. I can detail our solution later if needed.)

Yet Another GCD Problem

Constraints mention $$$K \geq 1$$$. Testcases don’t agree with this :(

No clarification regarding this was made during the contest, and in a 2.5h ICPC-style penalty contest… you get the idea of all that can go wrong because of it. In every contest I’ve heard of or been a part of, the organising team monitors for such obvious validator mess-ups that lead to failing submissions, announces the issue ASAP, and then fixes it or calls it off depending on the severity. That didn’t happen here. It was only addressed after participants pointed it out in the CF comments on the official editorial.
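Incidentally, this is exactly the kind of mess-up that defensive asserts surface. A minimal sketch, assuming a typical multi-test input format (the real format may differ, since statements and submissions are hidden); the point is that a violated constraint then shows up as a runtime error (SIGABRT on most judges) instead of a mysterious WA:

```cpp
#include <bits/stdc++.h>

int main() {
    int t;
    std::cin >> t;                 // hypothetical input format, for illustration only
    while (t--) {
        long long n, k;
        std::cin >> n >> k;
        assert(k >= 1);            // the stated constraint; broken tests would trip this
        // ... actual solution goes here ...
    }
    return 0;
}
```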

Equations

After the contest, we pasted the statement into ChatGPT. Without any prompting, it quickly wrote a DSU that maintains a bi-colouring, along with the other required info. Great.

This is not surprising at all. Our immediate reaction on reading this problem: why does this library-checker problem exist? It’s just a thin veil of ‘observation’ (itself very standard: build a graph when you have pairwise constraints on sums, XORs, or whatever). As is well known, LLMs excel at these.
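For reference, the building block in question is below. The actual statement and accepted code are hidden, so this is only a generic sketch of a parity DSU for constraints of the form $$$c_u \oplus c_v = d$$$, not the model solution:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Standard parity DSU: each node stores its colour relative to its parent,
// so merging two nodes under a constraint "c_u xor c_v = d" either succeeds
// or exposes a contradiction.
struct ParityDSU {
    vector<int> parent, parity;   // parity[v] = colour of v relative to parent[v]
    ParityDSU(int n) : parent(n), parity(n, 0) {
        iota(parent.begin(), parent.end(), 0);
    }
    // Returns {root of v, colour of v relative to that root}, compressing paths.
    pair<int, int> find(int v) {
        if (parent[v] == v) return {v, 0};
        auto [r, p] = find(parent[v]);
        parent[v] = r;
        parity[v] ^= p;
        return {r, parity[v]};
    }
    // Add constraint c_u xor c_v = d; returns false iff it contradicts earlier ones.
    bool unite(int u, int v, int d) {
        auto [ru, pu] = find(u);
        auto [rv, pv] = find(v);
        if (ru == rv) return (pu ^ pv) == d;
        parent[ru] = rv;
        parity[ru] = pu ^ pv ^ d;
        return true;
    }
};
```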

Okay, maybe I should have been more careful when calling the ‘observations’ lacking. One of the only novelties in the problem was handling the $$$w=0$$$ case, and that alone was apparently enough to mess up the testcases. The official solution gets hacked... again.


As a sidenote, I appreciate the commitment from the organisers to hold a re-contest after the judge broke. Unfortunately, even after the re-contest, it doesn’t look like we’re in a better situation. You don’t expect the most awaited annual competition [with a guesstimated 3M INR+ collected just as registration fees, along with sponsorships and such] to be far worse than the online contests held every week, throughout the year, on every OJ.
