In the last contest, both tourist and ko_osaga copied the author's solution from the problem Matrix from Petrozavodsk Winter Camp 2013 Warsaw U contest: 123723552 and 123727448 respectively. You can easily verify that they didn't write this code themselves by looking at the comments in these solutions (written in Polish).
This is against the Codeforces rules. According to this post, we have the following rules:
Solutions and test generators can only use source code completely written by you, with the following two exceptions:
- the code was written and published/distributed before the start of the round,
- the code is generated using tools that were written and published/distributed before the start of the round.
and
Currently, the only reliable proof is the presence of code on the Internet and the presence of the used edition in the cache of well-known search engines.
For example, this rule accepts the use of the code from the website http://e-maxx.ru/ if the code was written and published/distributed before the start of the round. With the help of search engine caches, it can be easily shown that such code doesn't violate the rules. Similarly, it is permissible to use the code from a book/article that was published before the contest. On the other hand, using team reference code (for example, prepared for ACM-ICPC World Finals) is not allowed if there is no reliable and objective way to prove that the code was written before the contest.
This code is not publicly available on the Internet and, as far as I am aware, was never published anywhere. There are two ways to get access to this code: either ask the author or have admin access to one of the contests where this problem is used. In my opinion, this falls into the second category (same as team reference code), as only a small number of people have access to it. Moreover, imagine a similar issue involving some unknown green or grey contestants: their solutions would surely be skipped. Please stop this elitism and ratism and treat all contestants equally.
I am not saying that we should ban them or skip their solutions, but we should make sure that the rules are clear and fair and that there are no exceptions. Maybe if you are using third-party code, you need to add a comment mentioning the original source of the problem, and this solution doesn't have to be widely available, but can be verified upon the request of someone investigating a potential cheating case? If that's the rule that we currently use, please add this to the aforementioned blog.
I wonder what happens if you use code from a book/article that is not published on the internet.
I think it's more about having some reliable and easy-to-verify proof that the code really existed before the contest, rather than being shared by one contestant with another during the contest itself.
The "team reference code" example is a somewhat special case, because the whole team is suspected of violating the rules and their testimony isn't reliable.
The whole problem can be easily resolved by something like https://en.wikipedia.org/wiki/Trusted_timestamping
And here is a more concrete example of trusted timestamping. Based on a question from your comment:
If the book exists in a digital format, then you can calculate sha256 hash of this book and post it in some comment on the codeforces platform before the contest. If Mike suspects you of plagiarism, then you send him a copy of this book and he can verify its sha256 hash.
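A minimal sketch of the single-file case described above, assuming Python's standard `hashlib`. The names `sha256_hex`, `verify`, `book` and `commitment` are made up for illustration, and the byte string stands in for the real file contents:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the sha256 hash of data as 64 hex characters."""
    return hashlib.sha256(data).hexdigest()


def verify(disclosed: bytes, committed_hash: str) -> bool:
    """Check that the disclosed file matches the hash posted earlier."""
    return sha256_hex(disclosed) == committed_hash


# Hypothetical stand-in for the bytes of the digital book.
book = b"hypothetical contents of the book file"

# Before the contest: only this hash is posted, not the book itself.
commitment = sha256_hex(book)

# During a dispute: the book is sent over and checked against the hash.
print(verify(book, commitment))            # True
print(verify(book + b"edit", commitment))  # False
```

The hash alone proves nothing about dates. The timestamp comes from the fact that the comment containing it was posted on the platform before the contest.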
And in a similar way, if tourist has a large collection of code templates, then he can calculate sha256 hash of each file. Then calculate sha256 hash of the list of hashes. And post the final hash in some comment before the contest. If Mike suspects tourist of plagiarism, then tourist can provide the list of hashes and that particular source file which was used. Mike can confirm that this file really existed before the contest.
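The "hash of the list of hashes" idea could look roughly like this. It is only a sketch: the file names, the sorting by name and the newline-joined list format are assumptions made here, not anything Codeforces has specified.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def commit(files):
    """Return (per-file hashes, final hash of the list of hashes).

    files maps a file name to its contents as bytes. Sorting by name
    and joining with newlines is an arbitrary convention chosen here.
    """
    hash_list = [sha256_hex(files[name]) for name in sorted(files)]
    final_hash = sha256_hex("\n".join(hash_list).encode("ascii"))
    return hash_list, final_hash


def verify(source, hash_list, final_hash):
    """Check one disclosed file against the final hash posted earlier."""
    list_ok = sha256_hex("\n".join(hash_list).encode("ascii")) == final_hash
    return list_ok and sha256_hex(source) in hash_list


# Hypothetical template collection.
templates = {
    "fft.cpp": b"// fft template",
    "matrix.cpp": b"// matrix template",
}
hash_list, final_hash = commit(templates)  # only final_hash is posted

# In a dispute: reveal the list of hashes and the one file that was used.
print(verify(templates["matrix.cpp"], hash_list, final_hash))    # True
print(verify(b"// written during contest", hash_list, final_hash))  # False
```

Only the one file in question has to be revealed. The rest of the collection stays private, because the other entries in the list are just hashes.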
Abusing codeforces comments to post all these hashes by every participant before every contest won't look pretty. So I suggest simply adding a field to the contest registration form, where this hash can optionally be added. In the case of a plagiarism dispute, anyone can prove that their secret template code really existed before the contest. Those who only ever use public sources from the Internet have no need to do anything special.
We have more than 20.000 participants in every contest... Do you really think that your method with sha256 hashes will work on codeforces? Imagine 20.000 people sending sha256 hashes before a contest... And imagine Mike checking these hashes for every user... It is literally impossible!
And "a large collection of code templates". Do you think that the codes are neatly stored in folders? I think most commonly they are scattered all around, and sometimes it is hard to find previous tasks that you once solved.
Do you imagine any problem with that? These 20.000 people successfully register for contests using their browsers and nothing breaks. A sha256 hash is something that looks like this: c591d9e521af717ff9cc97a908c72d12f6fab4db0560ae81ec0d10ad78bb3b31 (32 bytes in a binary form or 64 characters in ascii form). That's just a tiny bit of extra information and a barely noticeable increase of web traffic.
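The size claim is easy to check. This is a small sketch with Python's `hashlib`; the input `b"abc"` is an arbitrary example, and the participant count is the round figure from the comment above:

```python
import hashlib

digest = hashlib.sha256(b"abc")
raw = digest.digest()      # binary form
text = digest.hexdigest()  # ascii (hex) form

print(len(raw), len(text))  # 32 64

# Extra data if every participant submits one hex hash at registration.
participants = 20_000
total_chars = participants * len(text)
print(total_chars)  # 1280000, i.e. about 1.3 MB per contest
```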
Mike already checks appeals of suspected cheaters and their Internet links to third-party code. This is a part of the current process. Confirming hashes of files can be done in a fully automated fashion without any human interaction at all. Mike's job becomes easier.
Then it's a good motivation for you to clean your mess. Or do nothing and hope that the plagiarism checker never flags you.
If tourist has a large collection of code samples, then as per the rules reiterated in the blog, if he wishes to use them in contests he must have written and published them before the contests begin. Please do not encourage unpublished pre-written code, especially when it is clearly against the rules.
As for the book issue, I highly doubt there is any significant example of an algorithm implementation available on a paperback book and not available on one of the hundreds of (competitive programming or not) libraries/repositories on github, so I think it's a pointless hypothetical.
Your emphasis is on the word "published/distributed". But please note that the word "publicly" is not present in the rules. The distribution could have happened privately.
My emphasis is on "... not allowed if there is no reliable and objective way to prove that the code was written before the contest" part of the rules. It's all about having a reliable proof. And as you may guess, I fully agree with -is-this-fft-'s comment.
The source code of all submissions becomes public after the contest ends anyway. Isn't this all that matters? I think that people deserve to have a little bit of privacy. Their collections may also contain some unfinished code, which is too embarrassing to publish prematurely. But copying/reusing parts of such unfinished code may be still useful to solve problems in contests.
I don't think that it's against the rules. It may be against your interpretation of the rules, but my interpretation of the rules is different.
Interesting, I am ready with my popcorn.
Like, who cares, man? It's not as if green, blue or purple users could solve the problem just by having this code in their hands. It's like giving a 10-year-old kid a car: you have to be old enough to drive one.
Besides, if you are not clever enough to design a good final problem, it's your fault that the idea had been used before. I wonder whether the problem was copied from that one.
I don't understand what exactly you mean. It's like telling a person to use lemma "X" to solve problem "Y": that's not enough for most average people, let alone on a hard problem. People here can't even use tutorials to write a complete solution, and you expect them to get AC with unmodified code?
Also, I am pretty sure an LGM can look at that code and write their own version very easily. What if they had done that and this blog did not exist? They just saved some time by copy-pasting. Why don't you focus on the fact that this was not the right problem for the hardest slot?
IMO, reusing modified ideas to solve a problem is not a big deal. We have more serious problems, like solutions shared on YouTube channels, than two LGMs saving time by copy-pasting code which I am pretty sure they could have written themselves.
I'm not saying I know for sure it's a rules violation, I'll leave that to headquarters. But I am saying the issue shouldn't be dismissed just because "LGMs are super orz anyway, so who cares?" Definitely having private access to that code gives them a significant time advantage, even if they could implement it themselves.
Yes, this gave them some advantage in just one contest. But now this solution is public and anyone can download it, read the editorial and learn from it. A private algorithm from 2013 became public in 2021. This is a step forward for competitive programming and computer science. Everyone wins.
If you are in favour of making it clearly illegal to take advantage of any private information in any form, then you are basically in favour of destroying some business opportunities in competitive programming:
See the recent Sports programming, the book? blog post written by ... wait for it ... kostka. If people could get into a trouble by buying his book and then copying one of the advanced algorithms from it during a contest, then what's the point? And if the book is available on the Internet for free, then the book author is losing money.
Seminars or training camps with paid participation are additional examples of business opportunities. People paying for them may get exclusive access to something and it's one of the perks.
Proof of existence of a certain private document at a certain date in the past without disclosing this document prematurely is a well known solved problem. Contest rules could be clarified by MikeMirzayanov to allow accepting such proofs. I'm even getting downvotes in another comment for being such an annoying KO in a community of people, who are passionate about algorithms.
Sure. But that's not the point he's trying to say.
Aight imma head out. Is this even relevant?
He said that "the issue shouldn't be dismissed", but he also delegated all the decision making responsibility to the codeforces administration. Yes, he also said that "having private access to that code gives them a significant time advantage", but it's a bit vague and he didn't say whether allowing this is good or bad in his opinion. That's a very cautious diplomatic statement.
I suspect that the codeforces administration doesn't want to make any decisions either. The optimal outcome for them is to keep things the way they are. No commitments means less work for them. And also no risk to accidentally offend any party.
I just shared my opinion about the use of private code. Yes, my opinion doesn't matter much. And as Monogon didn't reply, the discussion died off. Do you have any opinion of your own?
After I thought more, I agree that it should not be a rules violation, but the rules should be revised to be more clear about what counts as proof of the code existing before.
It seems like you made this account only to write this comment....or maybe you are the Chosen One!!
Shots Fired!! Codeforces Blogs are more interesting than FB and Insta these days.
I wondered how tourist solved problem I after only 14 minutes
He's just better than us.
In 6 minutes: he solved A-D in the first 8 minutes.
We went from reporting newbies to experts to masters and now legendary grandmasters. What's next? MikeMirzayanov?
Aight, I am gonna keep coming back to this blog every hour XD
To my mind, the more interesting fact is that the hardest task of the contest had appeared before. I think that is much worse than copying a solution.
Yup, that is what I said too.
Well, this is a pretty normal situation. There is no way that contest writers know every problem ever written, and if the testers don't know any similar problems either, they publish it. I also think this is not a good situation, but it can't be avoided since the number of testers is limited.
meret <- author of code
Interesting to see how tourist and ko_osaga would respond to this accusation.
What exactly does "use" mean here? Does it mean the exact copy? I wonder if it would be acceptable for a contestant to write their "own version" of the code by looking at the solution so it's completely written by them.
Completely agree. In such cases people can wait ten minutes and then say "I have written this code by myself", but in reality they will just copy their private solution to the problem and slightly change it.
The only way to stop it is to try not to give old tasks.
I don't think this is like the team reference code case at all.
Note the "if". I'm sure that if the plagiarism checker flags someone for using a team reference code and they show a GitHub link of the reference that shows that this code did in fact exist before the start of the contest, they will not be disqualified from the contest. In fact, I'm sure that they would be acquitted even if the GitHub repo was private before the contest. I think this example is there to prevent cases where teammates just promise that this code was in their reference book, with no proof at all.
As I interpret these rules (at least the spirit of them), they are not about availability, only about provability. The code does not (or at least should not) need to be available before the contest to the larger public, you must only be able to prove that it existed.
Now back to this case. It is true that this code probably doesn't appear in the cache of any search engine. But contest admin tools are in fact on the internet and I assume that they would have some trustworthy timestamps. I don't think it would be difficult at all for tourist or ko_osaga to prove their complete innocence, that is, to prove that this code did in fact exist in 2013.
The first sentence in the first paragraph of the quoted rules clearly states that if the contestant did not write the code during the contest, and thus was written before it, then it must have been published/distributed, implying it must have been public. So it is about availability, according to my interpretation of the rules.
Do you mean this?
The code was distributed before the start of the round. Just not to the general audience.
[EDIT] That was probably a bit rude, sorry.
I would argue that "the code was made available to some people but not necessarily accessible to all" is still in conflict with the quoted rule.
I also agree :) I don't think it's hard to prove their innocence, considering that Mike has participated in the Russian CP community for a long time.
Over 9000!
From reporting newbies and pupils to reporting LGMs, codeforces blogs have come a long way XD.
This situation is a very slippery slope, as the rules are not clear and unambiguous about what published and distributed actually mean. Just read -is-this-fft-'s opinion for example (he argues the code need not be available to everyone at the start of the contest).
There is no need for any punishment here, just a clarification of the meaning of the rules. This whole "cache of well-known search engines" hand-waving is clearly an indicator that not enough thought has been put into this subject.
For example: if my interpretation of the rules is correct (all pre-written code must have been available to everyone, e.g. in a published book or a GitHub library), then I cannot make my personal cp library private. According to -is-this-fft-'s opinion I can, so which is it?
That's more or less the point of this blog. The rules are not clear and can be interpreted in many ways. I am asking for clarification.
It doesn't look like you're asking for clarification at all, you already called them cheaters.
That was just to grab attention. ^^
Mike should implement a report_users_submission section. From newbies to the highest rated on codeforces, we now see cheating blogs. So I think it's a prime need to open this section now. I have a dream: I want to see the codeforces recent actions section free from cheater blogs.
You're just doing this for upvotes that's the comment.
You legit accused both of them of cheating in the blog title, do you think it's right to make such allegations where you yourself are asking for clarification?
Does that mean multiple participants having the same 3rd party code won't lead to plagiarism? Just curious.
It depends. If the code was made and published before the contest then it's allowed to be copied (the big ass templates at the start of some people's code are 3rd party and allowed). The problem in the blog is that in kostka's interpretation "published" means "out for the general public", and tourist's code doesn't satisfy that, which is why the rules look ambiguous.
But if 2 people have the same code made during the contest for the contest then it is plagiarism, no question asked.
IMO, the blog wasn't necessary and the title of the blog is stupid.
Are you jealous of 3800?
I see no problem.
Rules against cheating are meant to stop people from copying implementations of solutions they haven't come up with themselves, and as a result having a high rating that says they solved certain problems when in fact they didn't. Tourist is one of the greatest competitive programmers in the world. In contests where tens of thousands of dollars are on the line and the problems, I assume (as you can see, I'm not a grandmaster; I probably wouldn't even be able to understand the statement of the problem in question), are much harder and require much more sophisticated techniques, he manages not only to find a solution but to find one faster than the top 10 strongest competitive programmers. This was a normal codeforces round with 20 t-shirts on the line, and if tourist got lazy and copied code, is that really cheating?? 99.99999% of people who are familiar with competitive programming will say that tourist is perfectly capable of solving the problem and implementing its solution. So, the matter that anti-cheating rules are meant to prevent is not even present in the case of tourist, and therefore cheating hasn't occurred, unless of course the writer of this blog can demonstrate that tourist copied the code because of his inability to implement his own solution. Obviously, in other cases of cheating in which the accused account is rated expert or pupil, etc., this doesn't have to be demonstrated, because what else could prevent a weak programmer from solving a problem other than his inability to do so? In the case of tourist, there is a long record that simply disproves any accusation of cheating based on inability leveled at tourist. And why would he even cheat?? Because of a t-shirt?? This claim is ridiculous. What's even more ridiculous is the guy who wrote this blog. Just look at his physiognomy, it screams sooooyyyyy. And this is exactly what this blog is.
It's a giant pathetic tantrum thrown by a soy-faced moron who probably has unrelated problems with tourist, i assume Jealousy, and now he finally finds the perfect pretext to take a dig at tourist. In my opinion this is not worth anybody's time as it is nothing more than a troubled soul of a weak person seeking comfort and stability through launching an immature and pathetic social justice crusade for a ridiculous reason. Duhhh Duhhhh "iMAgIne iF a sIMilAr isSUe woUlD OcCuR tO soMe ..." Duhhh Duhhhh "ELitIsm" Duhhhhh. GET OVER IT DUDE IT'S NOT ELITISM IT'S NOT DISCRIMINATION No one sees any wrong doing because there isn't any. This guy wants NATO sending f-35's conducting airstrikes and U.N. committees investigating human rights violations because tourist copied code. LOL So ridiculous.
"copying implementations of solutions they haven't came up with themselves": that's exactly me
Woooooooow, cool~
popcorn prepared xDDDDDDD
Don't waste your time. We can even see the author's name in tourist's code. If tourist really wanted to cheat, at least he would have removed the name of the author from his code. He didn't do that. Come on, man. Don't suffer from an inferiority complex. If you had known the concept beforehand, you would have done the same. Rather blame the problem setters: why would they set exactly the same problem?
Rather blame the problem setter why would they set exactly the same problem
You're completely right.
Even benq himself once said somewhere that he copied the exact code for some problem that had appeared before. So we should blame authors, not participants. If I had known this problem beforehand, I would have done the same.
Why blame author, he gave me free rating
I think that you misunderstood the fck_cheater's comment. It's the other contestants, who blame authors for giving you free rating.
A similar thing happened at AtCoder Beginner Contest 211 very recently. Solutions for at least problems C and D could be taken directly from geeksforgeeks with very minor cosmetic modifications. And I did the same trick as you and tourist. You don't have to be a legendary grandmaster to encounter such a situation.
and free contribution
Once a few lines of my solution matched someone else's solution, so they skipped my code, gave me a warning and took my rating for that, but in this case their whole code is the same from top to bottom. I am not asking Mike to do anything, but at least treat everyone equally.
Hey kostka, did you see VivaciousAubergine's solution to the last problem of Codeforces Global Round 15? They just copied the code from submission 8542794.
Your contest submissions look sus.
Have you also seen the first line of that code...
just look it up, too lazy to take a screenshot
Now when will we have the diss track ?
my god they remembered the question of the contest which was 8 years ago :-O
Benq after reading this:
PS: Just a meme no offense :)
XD
meanwhile tourist
ROFL, I was absolutely sure that he makes fun of gray users' blogs about cheating
I don't understand why is this blog getting so many upvotes and attention. ko_osaga mentions in this comment that he downloaded the code from OpenCup login.
It can be easily shown that this code was present on the internet (OpenCup website) since 2013. What's the big deal here and why so much fuss? Imagine being tourist and seeing this stupid blog post.
I'm interested in trying to verify this. Could you please provide step by step instructions how to download this particular code from the OpenCup website?
Dude, you should chill about this whole "prove that it was published before the contest" thing. One thing is that the OP is perfectly aware of where this code comes from, I think you can trust him in this regard. Secondly, the original author is clearly stated in the header and he didn't take part in the contest, thus he couldn't have cooperated with our suspects. And thirdly, how on earth could anybody write like 200 lines of code in 6 minutes, including reading and comprehending the problem statement xd Get real, it must've been previously published and tourist and ko_osaga were able to find it (maybe they studied problems from that olympiad).
I also think it's unfortunate that such a situation happened at all, but blaming them for knowing how to find the solution to this problem is unreasonable.
Only people who attended that camp have the codes
You should have opencup/opentrains (the same thing) credentials. After you vc a contest on opentrains, you are given access to the solution sources and occasionally an editorial. Maybe it is possible to take the code without vcing (waiting for 5 hours), but I don't know how.
tourist participated in the camp, so he had the code from there, and ko_osaga vced it a couple of years ago.
Thanks! That's exactly what I wanted to know. I tried to check further and see that opencup/opentrains credentials can't be obtained by just anyone, or at least that's how it was 6 years ago.
Now if access to this code is supposed to be available only to a limited group of authorized people, then what does snarknews think about tourist and ko_osaga effectively "leaking" this code into free access for the whole world by submitting it as a codeforces contest solution? And if this is okay, then could anyone with opencup/opentrains credentials just publish solutions for all opencup problems on some website? Thanks!
moral of the story : plag checker works only on problem A and B
how to delete above comment ?
Why were these two submissions not skipped in system testing if they were identical?
These submissions clearly explain the origin of the code in the boilerplate comments and give credit to its original author. Plagiarism by definition is an attempt to misrepresent somebody's else work as your own. But using somebody's work without violating any rules is okay. Most likely a human still has to approve the results of the automatic plagiarism checker before any action is taken. And the boilerplate comments greatly helped to understand the situation.
This is a lesson to all of us: when using any third-party code from some website, it's a good idea to add a comment in your submission explaining where it was taken from. Trying to strip comments or obfuscate third-party code is a bad idea, because you only waste time and unnecessarily make yourself look suspicious.
Unfortunately they copied the same solution, so they can be found out easily. But if they copy different solutions, what will happen then?
Newbies in the comment section are just comparing themselves to Tourist lol.
I think it's better to remove the problem instead, or for Sir Mirzayanov to announce his decision about this problem.
From the comments below, it looks like half of the people haven't read the entire blog lol
Jealous fuck, I wonder why this stupid blog is getting so many upvotes.
I think this blog should be removed, as it exists just to malign the top LGMs.
What about contestant C137... his submissions are skipped but the rating he gained is still on his account... what about that?
As far as I know, snarknews and other organizers from Petrozavodsk always provided a CD with all submissions after the camp. For example, I have such CDs for Winter 2014/2015/2016, when I participated.
Also, several years ago someone distributed such archives from Petrozavodsk on torrent.
And tourist was there: http://karelia.snarknews.info/index.cgi?data=2013w/teams&menu=index&head=index&class=2013w&sbname=2013w
So I see no problem at all. It was distributed at least for Petrozavodsk + Izhevsk (mirror of PTZ) camps participants.
The blog should be renamed to "Flaw/loophole in third party code policy" instead of a cheaters report.
Exactly, imagine what newcomers/inexperienced participants would think when they see two of the best CPers being called cheaters.
I second this, they are seasoned programmers and they did credit their codes properly.
Calling them cheaters is blatantly wrong!
is there any policy that punishes for calling people "cheaters" solely based on opinion? If no, i would add one
cheater
jqdai0815 managed to get first place in Codeforces Round #621 (Div. 1 + Div. 2), and it was his last participation, on Feb/17/2020.
He solved the last problem in 11 minutes the same way. This is documented in this video, from minute 24 to minute 35.
Why did you tag him unnecessarily?
These solutions are also suspiciously similar: 123740161 (VivaciousAubergine) 123742691 (sunset)
Bro, they even put the link to the submission they used in their solution, i.e. it was there before the start of the contest, which doesn't go against the guidelines as far as I know.
Thanks for this very interesting observation! Two contestants submitted solutions reusing code from exactly the same old solution 8542794, with only an 8-minute gap between them (01:02 vs. 01:10). And they are both "from same city in china", as the other commenter noticed. An additional minor detail is that the submission from peehs_moorhsum contains a "// stop creating problems" comment, which likely indicates that this contestant had some sort of emotional reaction and maybe had a brief moment of not thinking rationally.
As everyone already knows, the old E. Matrix problem was indeed reused in the new contest and the participants could notice that. But I don't see submission 8542794 specifically mentioned in the editorial or comments below it. There were more than 150 accepted submissions available for this problem. So it's very interesting to know what makes 8542794 stand out among them and how likely it was to be selected by two different contestants independently?
Also it would be great if peehs_moorhsum and sunset could explicitly confirm whether they know each other and whether they did or didn't communicate during the contest.
imagine,what would happen if newbie posted this
-1500 votes
By the way, what are the rules for code reuse in Code Jam ?
The rules are very liberal and far less ambiguous, the code doesn't need to be previously public for codejam nor kickstart, see rules. For hashcode it shouldn't even need to be open-source as they allow paid software.
Tourist: Upsolving is worthless as you will never get the same problem again in a contest.
Also Tourist:
Ok, now without clickbaits and sarcasm.
I think the community mostly agrees that copying code (fragments or whole submissions) from third-party sources written before the contest is fine, even if they are not public on the wide-spread Internet. Can we have a confirmation from staff that this is indeed the case, and can we then reflect that in the official rules (i.e. make them more clear)? MikeMirzayanov?
I don't see it as tourist's or ko_osaga's fault, given that they copied the code with all the comments, leaving no doubt as to its origin. Why couldn't the problem be changed at least a tiny little bit to make sure the original solution cannot be copy-pasted?