Hello, CodeForces.
As I can't find any discussion thread in AtCoder, and the announcements of AtCoder contests are also posted here, I decided to post this blog in CodeForces.
Last time (ABC355), I reported a suspicious participant toyuzuko who solved ABCD in 51 seconds. They did a similar thing in ABC354.
This time (ABC356), there are many more suspicious participants.
The first-AC times (in seconds) of ABC347~ABC353 are shown in the following table:
Contest | A | B | C | D | E | F | G |
---|---|---|---|---|---|---|---|
ABC347 | 21 | 45 | 93 | 244 | 177 | 300 | 824 |
ABC348 | 15 | 37 | 72 | 302 | 236 | 495 | 430 |
ABC349 | 23 | 67 | 54 | 116 | 598 | 139 | 634 |
ABC350 | 26 | 47 | 102 | 74 | 240 | 489 | 335 |
ABC351 | 22 | 23 | 106 | 352 | 199 | 122 | 902 |
ABC352 | 27 | 43 | 49 | 90 | 196 | 989 | 188 |
ABC353 | 24 | 79 | 172 | 88 | 206 | 1161 | 313 |
(minimum) | 15 | 23 | 49 | 74 | 177 | 122 | 188 |
(prefix sum) | 15 | 38 | 87 | 161 | 338 | 460 | 648 |
Note that getting the first AC on a problem doesn't require you to solve all the problems before it: you can skip straight to the problem you want and solve it immediately. So the minimum plausible time to solve a subset of problems is at least the sum of the first-AC times of those problems.
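As a sanity check, the "(prefix sum)" row above can be recomputed from the "(minimum)" row with a few lines of Python:

```python
# Per-problem minimum first-AC times (seconds) over ABC347~ABC353,
# taken from the "(minimum)" row of the table above.
minimums = [15, 23, 49, 74, 177, 122, 188]  # problems A..G

# Running total: the least plausible time to first-AC all of problems A..X.
prefix = []
total = 0
for t in minimums:
    total += t
    prefix.append(total)

print(prefix)  # [15, 38, 87, 161, 338, 460, 648] -- matches the table
```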
To reduce false positives, I will call a participant "suspicious" if they solved some subset of problems faster than the sum of the corresponding minimum first-AC times.
Now, let's take a look at (some of) the suspicious participants who solved at least 4 problems by the end of the contest:
- kyon2326 solved ABCD in 25 seconds.
- toyuzuko solved ABCD in 28 seconds.
- Harui solved ABD in 24 seconds.
- udon1206 solved ABC in 40 seconds.
Some other participants seem suspicious too: they solved a subset of problems only about 10 seconds slower than the sum of the minimum first-AC times in the table above. Since it's unlikely that a single round contains the easiest version of every problem, those times are still remarkable, but I didn't list them here to keep the number of false positives low.
I guess they probably used some AI to generate the code for them, after seeing what happened in ABC355. If my guess is correct, I'm impressed by the ability of AI. I'm also worried that more and more people will use AI to generate code and solve the easy problems during contests. If everyone can solve ABCD by simply copy-pasting AI's code, what's the point of having AtCoder Beginner Contests? Why not just delete ABCD and add some new problems to make an AtCoder Regular Contest?
You can leave your opinion in the comments.
I feel scared that AI will be stronger than many people.
Peter Thiel seems to be right
Easy problems in ABC are designed to require no algorithms, only programming-language skills.
AI has mastered programming languages very well and also knows a little about algorithms.
It's normal for AI to solve these problems; otherwise AI would be too weak.
You're right.
But in today's contest he submitted A, B, and C within a span of 5 seconds. Even copy-pasting and assembling a solution should take more time than that: he/she submitted A at the 16th second, B at the 19th, and C at the 21st, which is suspicious.
Maybe they created some extension.
Maybe multiple people are using the same account.
If anyone has access to GPT-4, they can feed the tasks to the model and submit the code on AtCoder to check whether it gets AC or not.
Isn't GPT-4 now free for everyone?
It's actually confirmed that they used AI: if you look at their submissions, such as https://atcoder.jp/contests/abc355/submissions/54073783 and https://atcoder.jp/contests/abc355/submissions/54073778, they say "Generated by gpt4-o" at the top.
I guess they didn't have time to remove that line, because they were in a hurry to submit as fast as possible.
I don't see why they should remove it either; it's not like it's disallowed.
This validates my comment above. It's not good news, tbh. It may be the start of the end of ABC contests. I wonder how GPT-4 performs on Codeforces Div. 2s and Div. 1s.
Well, ABCs are mostly for beginners anyway, and it only solves very standard problems, which usually don't appear in Div. 2s (the problems in Div. 2 may be standard for GM+, but they are still novel from the perspective of an AI, which can only solve them if the exact same problem has appeared before).
But yes, this may be the start of the end of ABC contests, at least for $$$r \in [0, 800]$$$.
Honestly I think we're probably pretty close to having a freely available AI that is able to solve most 1600-1800 problems, since even those aren't usually too involved.
I think there might be classes of problems that are simple for humans but unusually hard for AI, like ones which require some kind of intuition or visualization that AI is not great at.
AI achieves 1800 CF rating
AI gets Gold at IOI 2024 but performs poorly on relatively easier Day 2B
Nice prediction : )
"AI gets gold at IOI" is not accurate. It only got gold when allowed 10k submissions.
True, it still flops at Day 2B with 10k submits...
The 10k submissions can largely be avoided by local stress testing. The AI needs to be able to generate brute-force solutions, which seems reasonable. There could also be strategies like submitting the brute force and making sure it gets TLE instead of WA to confirm correctness, etc. Also, I think at IOI you don't have a submission limit, so the AI actually played by the rules.
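The stress-testing strategy described here is simple to sketch. In the snippet below, the maximum-subarray problem and all the function names are hypothetical placeholders standing in for a real contest problem, its trusted brute force, and the AI's candidate solution:

```python
import random

def gen_case(rng):
    """Hypothetical random test generator: a small list of integers."""
    n = rng.randint(1, 6)
    return [rng.randint(-10, 10) for _ in range(n)]

def brute_max_subarray(a):
    """Trusted O(n^2) brute force: maximum sum over all nonempty subarrays."""
    return max(sum(a[i:j]) for i in range(len(a))
               for j in range(i + 1, len(a) + 1))

def fast_max_subarray(a):
    """Candidate O(n) solution (Kadane's algorithm) to be validated."""
    best = cur = a[0]
    for x in a[1:]:
        cur = max(x, cur + x)
        best = max(best, cur)
    return best

def stress(candidate, brute, rounds=1000, seed=0):
    """Run both solutions on random cases; return the first case where
    they disagree, or None if all rounds agree."""
    rng = random.Random(seed)
    for _ in range(rounds):
        case = gen_case(rng)
        if candidate(case) != brute(case):
            return case  # counterexample found
    return None

print(stress(fast_max_subarray, brute_max_subarray))  # None -> 1000 cases agree
```

Only a candidate that survives the stress loop gets submitted, so thousands of "attempts" happen locally rather than on the judge.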
At IOI you're limited to 50 submissions per task actually. They mention the (lower) score of the model when respecting this constraint.
They mention the model used testcases it created itself in order to develop the solutions. I'm not sure if it was allowed to run actual code against a brute force or if it just checks its logic "by hand".
But yeah, I do think it might be a matter of time before we're absolutely destroyed by bots lol. If they didn't test the model allowing it to run code, I'm afraid it already is much better than we think.
I think there might be limitations that AI cannot overcome while humans can. A big part of CP problems has a low level of originality, but I expect AI to struggle with original ad-hoc problems. Maybe this is wishful thinking and AI will destroy us in every aspect, but I don't think so.
I just learned here that the AI generated thousands of candidate solutions but only submitted 50, which is within the rules. It did use a stress-testing strategy.
Actually, some of them use ChatGPT to get AC on the easiest problems.
Currently, there are no punishment rules for using AI, so getting through the first 2 or 3 problems in a few seconds is kind of a "good" strategy. (Background: AI is well suited to solving ABC problems because some of them have very simple settings and solutions.)
Recently (especially after the launch of GPT-4o) there has been much debate in the Japanese community about allowing or banning the use of ChatGPT, and the rules regarding AI may change as a result of this discussion. Personally, I'd like to know the opinion of CF community members on this issue.
I think allowing AI is not a problem. Firstly, anyone can plug a given problem into ChatGPT, so it's not giving an unfair advantage to particular users. Secondly, AI can't solve any problem that requires even somewhat challenging algorithmic thinking, so it really doesn't impact any serious competitors.
Thanks for reply!
Exactly. Because it's almost a non-issue for ARC and above, serious competitors mostly don't care about this situation in the harder rounds (and I also don't want a hard ban on AI).
But in ABC, the situation is different. In ABC356, the fastest ABCD got about 1400 performance, so every green-or-below coder "should" use this method. From the Japanese community's perspective, it seems this strategy problem, and "losing to ChatGPT" itself, demotivated them, and some community members think this isn't healthy.
In addition, the fact that ABC focuses on education and guidance rather than competition makes this problem more complex, so making ABC more competitive (for example, making the problems more ad hoc) is not a solution...
I think that anybody can very quickly become better at problem solving than AI in its current state, so I'm not too worried. It should not take more than a month or two of serious practice.
I have seen someone with the same ID as yours take part in some Luogu contests. However, I think you should pay more attention to Luogu's punishments for AI use, because AI is really useful and powerful. Using AI is deeply unfair to every other participant. Generally speaking, I support AtCoder strengthening its punishments for AI use (such as blocking the account).
If there were prize money, there would be wayyy more cheaters.
Actually, some ABCs have prizes for Japanese participants, so a sub-1-minute ABCD gives an advantage toward earning them (though, to earn a prize, in most cases contestants must solve F or G, and at that difficulty ChatGPT isn't strong enough).
But please note that, under the current ruleset, they are not cheaters.
Should we be scared of AI?
The rules of ABC have been changed. Please see this post.
I think this is great. I hope Codeforces does this too; many people in Div. 3 and Div. 2 are passing solutions using GPT models these days.
How are they even gonna check if a participant does it?
Actually, it only works for very easy A- and B-type problems; after that it fails, so we needn't fear. I know cheating is a concern, but cheaters can never learn and grow if they keep cheating.
Finally, this problem has been solved. You're our hero.
But there is another problem: the "cooperation among multiple people" mentioned in the announcement. The announcement states that it's "impossible" to detect, but we can do our best to prevent it, such as by improving the anti-cheating system. In fact, there are more cases where multiple participants take part in a contest and share code with each other than cases where multiple individuals share one account! I've seen many incidents of cheating on social media (QQ, Luogu private messages, etc.), like this one (fortunately he didn't succeed):
Translation: "I want the solution of problem D! I cannot debug it, I'm crazy!!! And I also want solutions for problem E and F!!!"
Some people's tests indicate that AtCoder's anti-cheating system isn't perfect: some extremely similar submissions go undetected (at least this kind of thing is highly unlikely to happen on CF; I can't disclose more details because I'm concerned some people might imitate these cheating behaviors). Maybe AtCoder needs to fix this.