cjj490168650's blog

By cjj490168650, history, 19 months ago, in English

Original post: 「正义可以迟来但不能缺席」:关于 NXIST 的一些新证据
Translated by GPT-4 with some adjustment. Please inform me if there are any mistakes.
All links to the invalid repository have been redirected to the backup repository.

This article presents a logically complete chain of evidence, drawn entirely from public internet resources, regarding the "suspected cheating" incident at the ICPC Yinchuan and ICPC Shenyang regional contests held in 2021. We located the suspected GitHub account (NaokiLH, since renamed to https://github.com/brokenTarget) that Lan Hao, a member of the TS 1 team from Ningxia Institute of Science and Technology (NXIST), used two years ago, and then mined and analyzed the commit records of his algorithm competition repo. This yields direct evidence that at least 4 problems from the 2021 Yinchuan regional problem set and at least 6 problems (including scrapped ones) from the 2021 Shenyang regional problem set were leaked to him days to weeks before the competition. This substantial body of new public evidence indicates that the TS 1 team did cheat and that the students concerned were heavily involved.

This is the first direct evidence related to the incident after several years of discussion. This article is based on the mining and analysis of publicly available online information by @lucas110550 and @曾耀辉, and none of the evidence provided involves any infringement or violation of relevant regulations. @陈靖邦 handled overall coordination and review. We welcome everyone to report and supervise.

Since the vast majority of the evidence comes from the commit history of the GitHub repo of NaokiLH (suspected account of Lan Hao), and the person involved may delete it once this article is published, we strongly suggest that everyone fork the corresponding repo to keep this record permanently.

https://github.com/NaokiLH/algorithm_trans

UPD: The original repo has been deleted, those interested can move to the personal backup repo:

https://github.com/NXIST-backup/algorithm_trans

Background Information

How to evaluate the ICPC Yinchuan Competition in 2021?

How to view the cheating controversy in the 2021 ICPC Yinchuan Station and the publication of outstanding team members winning gold medals by Ningxia Institute of Science and Technology on its official WeChat account?

How to view Ningxia Institute of Science and Technology winning one gold and one silver in the 2021 ICPC Yinchuan competition?

PDS Plagiarism Detection System Example: A demonstration of plagiarism detection in a certain competition

https://weibo.com/u/7535856183

Main Content

Recently, NXIST announced the hosting of the 2023 Silk Road China Invitational.

How to evaluate the 2023 International Collegiate Programming Contest (ICPC) Silk Road China Invitational?

Upon learning of this, I was not only shocked but also deeply saddened: What is the purpose of doing such things?

So, on a leisurely afternoon, I began to search the internet for information about the award-winning team members, namely the TS 1 team members: Lan Hao, Ni Binqi, and Zhou Jianing. In an inconspicuous corner of GitHub, I found a homework submission repository for "Geek University's Python Advanced Training Camp — 1st Term" with the same name as one of the parties involved:

Week08 Homework Link Collection · Issue #52 · Python001-class01/Python001-class01

The information submitted by user upupqi

The information submitted by user upupqi contains the name of one of the parties involved, Ni Binqi. This led me to the user upupqi's profile directly:

User upupqi's GitHub profile

Of course, we can't directly conclude that this is the person in question (after all, there are many people with the same name). After some investigation (such as his algorithm competition repository, confirming that he is also an algorithm competition participant), we obtained a very strong piece of evidence (and the source of this article): his mutual follower NaokiLH, an ID that is suspected to point to another party involved, Lan Hao (LH).

upupqi and NaokiLH's follower interface

In an early issue raised by NaokiLH's account, there is a screenshot of his computer interface, where we can find a "Lan Hao 45418016" compressed file, which preliminarily confirms that the owner of this account is also named Lan Hao.

NaokiLH's screenshot of his computer desktop in a GitHub issue

Now that we had the GitHub accounts of the two parties involved, curiosity drove me to dig through their GitHub repos to see if there was anything interesting. The first conclusions were:

1. Neither of them is very proficient with GitHub (upupqi does not know how to inherit repositories, and NaokiLH's commits are messy and non-compliant, which deserves criticism).

2. Neither of them has a high algorithm level (before May 2021 both of their algorithm repos covered only basic material, and upupqi was still in the AcWing training camp in August 2021; most of their Codeforces VPs are only at the Div.2 AB level), so it is hard to imagine how such a team could win a gold medal in a regional contest.

What really gave birth to this article was NaokiLH's algorithm competition repo:

https://github.com/NXIST-backup/algorithm_trans

It seems quite normal, nothing strange.

Going directly to the commit records during May 2021, I found some interesting things:

NaokiLH's git commits in May 2021

The commit messages are careless, which deserves criticism. Let's first look at the commit "3123131" at the bottom, made on May 10, 2021: 3123131 · NaokiLH/algorithm_trans@03efcf1

I found that NaokiLH created a new folder called "yinchuan" and uploaded code for problems B, G, and I:

NaokiLH's commit records on May 10, 2021

Then in the commit "423423" on May 13, 2021: 423423 · NaokiLH/algorithm_trans@76bd49e

NaokiLH uploaded the code for problem K:

NaokiLH's commit records on May 13, 2021
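Readers who have cloned the backup repository can list the commits in this window themselves. The following is a minimal sketch, assuming `git` is installed and a local clone exists; the helper names (`build_log_command`, `parse_log`, `commits_in_window`) and the clone directory are hypothetical and chosen only for illustration. The two sample lines use the abbreviated hashes and commit messages quoted above.

```python
import subprocess

def build_log_command(since, until):
    """Build a git log invocation restricted to a commit-date window."""
    return [
        "git", "log",
        f"--since={since}", f"--until={until}",
        "--date=short",
        "--pretty=format:%h|%cd|%s",  # short hash | committer date | message
    ]

def parse_log(text):
    """Turn 'hash|date|subject' lines into (hash, date, subject) tuples."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            commit_hash, day, subject = line.split("|", 2)
            rows.append((commit_hash, day, subject))
    return rows

def commits_in_window(repo_dir, since, until):
    """Run git log inside repo_dir; needs git and a local clone."""
    result = subprocess.run(
        build_log_command(since, until),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)

# Parsing two lines shaped like the commits discussed above.
sample = "76bd49e|2021-05-13|423423\n03efcf1|2021-05-10|3123131\n"
for row in parse_log(sample):
    print(row)

# With a local clone (the directory name is an assumption):
# commits_in_window("algorithm_trans", "2021-05-01", "2021-05-31")
```

Adding `--name-status` to the command would also list which files each commit added, modified, or deleted, at the cost of a slightly more involved parser.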

Let's carefully compare these four pieces of code with the official problems of the 2020 Yinchuan regional contest:

Ref:

We can see that, leaving aside whether these four pieces of code are correct, their input and output formats, as well as some variable names, match the problem statements exactly. When submitted (interested readers can verify this themselves), two of the four pieces of code pass only the sample cases, while the other two cannot even pass the sample cases.
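The sample-case check mentioned above can be reproduced locally. The following is a rough sketch, assuming the code has already been compiled to an executable and that the sample input and output have been copied from the problem statement; the function names are hypothetical, and a real judge may use a problem-specific checker instead of this strict token comparison.

```python
import subprocess

def normalize(text):
    """Split output into whitespace-separated tokens, ignoring layout."""
    return text.split()

def outputs_match(expected, actual):
    """Strict, case-sensitive token-by-token comparison."""
    return normalize(expected) == normalize(actual)

def passes_sample(binary_path, sample_input, sample_output, timeout=2.0):
    """Run a compiled solution on one sample and compare its output."""
    result = subprocess.run(
        [binary_path], input=sample_input,
        capture_output=True, text=True, timeout=timeout,
    )
    return result.returncode == 0 and outputs_match(sample_output, result.stdout)

# Trailing whitespace is ignored, but letter case is not.
print(outputs_match("yes\n", "yes"))
print(outputs_match("yes", "YES"))
```

Because the comparison is case-sensitive, a program that prints an answer in a different letter case than the statement requires fails even when its logic is right.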

So when did the official competition of the 2020 Yinchuan take place? May 16, 2021.

Title of the 2020 Yinchuan problem statement (held on May 16, 2021)

In other words, NaokiLH (suspected account of Lan Hao) had already obtained enough information by May 10 and May 13 (six and three days before the competition) to write initial code for problems B, G, I, and K, which were not supposed to be seen until the official competition on May 16. The input and output matched the problem statements, and some of the code could already pass the sample cases. We can reasonably suspect that the problem statements were leaked in the week before the competition, and that Lan Hao, as a party involved, was aware of the leak and heavily involved.

In the subsequent official competition, TS 1 passed problems A, B, E, G, J, and K, of which B, G, and K are among the highly suspicious problems identified above. Code overlap on problems B and J has already been pointed out in Dai@NeverLand's "PDS Plagiarism Detection System Example: A demonstration of plagiarism detection in a certain competition".

Final leaderboard of the 2020 Yinchuan (held on May 16, 2021)

On May 22, 2021 (one week after the competition), the uploaded code was deleted by NaokiLH with a recorded commit: 321312 · NaokiLH/algorithm_trans@80c1103

NaokiLH's commit records on May 22, 2021

After May 22, everything returned to normal. NaokiLH began learning KMP and participating in AcWing training.

In review, let's reconstruct the likely sequence of events along the timeline:

Early May, NaokiLH obtains the leaked problem statements, which include at least Problems B, G, I, and K. However, the leak only contains problem statements and examples, not solutions or standard inputs and outputs.

May 10, NaokiLH, through research or help from others, writes the code for Problems B, G, and I. However, given his skill level, he cannot guarantee the correctness of these three pieces of code. After some thought, NaokiLH decides to upload the code to GitHub as a backup.

May 13, NaokiLH completes the code for Problem K and uploads it to GitHub as a backup.

May 16, the Yinchuan Regional Competition officially begins. The TS 1 team tries (perhaps?) to submit the pre-written code without success. They then obtain passing code from other teams through some means provided by the organizer and submit it to get AC (Accepted), ultimately winning the gold medal. Public controversy begins.

May 22, NaokiLH deletes the code for Problems B, G, I, and K from the GitHub repo.

Bonus Content

2020 Shenyang Regional Contest: How Does the TS 1 Team Prove Itself?

Background Information

Translation: We have already told quailty that we will participate in Shenyang.

Translation: Now, the competitions have all come to an end, and they have returned to their college life, doing the same things they have always done, over and over again. "There's nothing to be proud of" is the phrase that appears most often in their conversations. While others are still immersed in their last victory, they have already started preparing for the next competition. (From the NXIST public account)

Explanation of the Leak of the 2020-2021 Shenyang Contest Abandoned Questions

Video at 1 minute 29 seconds: The competition time for the Shenyang station was postponed from the original May 23 to July 18.

Content

May 21, five days after the end of the Yinchuan competition, NaokiLH makes a new round of commits: 88888 · NaokiLH/algorithm_trans@7e35b60. He creates a new directory under the original repo called ICPC/shenyang and uploads A.cpp. The next day, May 22, NaokiLH creates another directory called blue_book/sh and moves the original A.cpp from ICPC/shenyang to this new directory: 321312 · NaokiLH/algorithm_trans@80c1103.

NaokiLH's commit records on May 21, 2021

NaokiLH's commit records on May 22, 2021

May 23, NaokiLH uploads B.cpp to the blue_book/sh directory: 4324324 · NaokiLH/algorithm_trans@1f8b5a2

NaokiLH's commit records on May 23, 2021

May 24, NaokiLH uploads F.cpp and H.cpp to the blue_book/sh directory: 3123123 · NaokiLH/algorithm_trans@d765bf7

NaokiLH's commit records on May 24, 2021

By June 11, NaokiLH had finished modifying the code in the blue_book/sh directory. Here is the final version of the directory at that time (including code for Problems A, B, F, and H).

It is easy to see that the code for Problems A, B, F, and H does not match the problems of the same letters in the Shenyang Regional Contest, so the trail seems to go cold. What went wrong? As it turns out, this is closely related to the scrapped-problem incident of the Shenyang Regional Contest (see earlier references):

  • A.cpp actually corresponds to a problem called "jailbreak" in the scrapped Shenyang Regional Contest. As of the time of writing, this problem has not been publicly used. However, due to the passage of time, the scrapped problem PDF has been lost. Here, we provide only the relevant information and preview of the problem statement:

information

The problem was based on the Polygon platform for the question-making process, and the last edit was made on 2021-05-16 11:30:53 (UTC time).

Overview of the problem statement for the scrapped Shenyang Regional Contest problem 'jailbreak'

We can see that the input method of this code is completely consistent with the original problem, but the output does not match: the problem requires the output to be "yes" or "no", while the output in the code is "YES" or "NO" (with an additional line of information). This actually corresponds to subsequent modifications to the problem, although this code still cannot pass the problem:

Jailbreak problem version modification records

  • B.cpp corresponds to Problem H of the Shenyang Regional Contest, 103202H - The Boomsday Project. The input and output formats are completely consistent, and it passes the sample cases. However, because the data range was adjusted in later versions, it cannot pass all the data. Interested readers can compare them themselves.
  • F.cpp corresponds to Problem J of the Shenyang Regional Contest, 103202J - Descent of Dragons. The input and output formats are completely consistent. Besides the sample cases, this code passes only some of the data.
  • H.cpp corresponds to the 2021 NowCoder Summer Multi-School Training Camp 8: F. Robots. This problem was once one of the scrapped problems of the Shenyang Regional Contest and was later used in the 2021 NowCoder Summer Multi-School Training Camp 8 held on August 9, 2021. Before that, it had not been publicly used. As it is a paid competition that requires registration to view the problems, we also provide a preview of the problem statement here:

Overview of the scrapped Shenyang Regional Contest problem 'robots'

We can see that the input method of this code is completely consistent with the original problem, but the output does not match: the problem requires the output to be "yes" or "no", while the output in the code is "Y" or "N". Interestingly, this coincides with subsequent modifications to the problem (shown below). By aligning the output methods, the code can also pass some of the official data besides the example cases.

Robots problem version modification records

A brief recap:

From May 21, 2021, to June 11, 2021, none of NaokiLH's commits related to the Shenyang site could have been written without access to leaked problems. The committed code corresponds to some of the problems from the Shenyang Regional Contest (July 2021) or to problems scrapped from the official contest. Some of the scrapped problems first appeared publicly in the August 2021 NowCoder multi-school contest, and some, like Jailbreak, have not appeared to this day.
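The gap between each commit and the first public appearance of the matching problem is simple date arithmetic. The sketch below uses only dates stated in this article (commit dates of B.cpp, F.cpp, and H.cpp, the July 18 contest, and the August 9 NowCoder camp); the dictionary layout and the `days_ahead` helper are choices made for this illustration.

```python
from datetime import date

# First public appearance of each problem, per the article.
FIRST_PUBLIC = {
    "The Boomsday Project": date(2021, 7, 18),  # Shenyang regional, problem H
    "Descent of Dragons": date(2021, 7, 18),    # Shenyang regional, problem J
    "Robots": date(2021, 8, 9),                 # NowCoder multi-school camp 8
}

# Commit dates of the matching files in blue_book/sh, per the article.
COMMITTED = {
    "The Boomsday Project": date(2021, 5, 23),  # B.cpp
    "Descent of Dragons": date(2021, 5, 24),    # F.cpp
    "Robots": date(2021, 5, 24),                # H.cpp
}

def days_ahead(problem):
    """Days between the commit and the problem's first public use."""
    return (FIRST_PUBLIC[problem] - COMMITTED[problem]).days

for name in sorted(COMMITTED, key=COMMITTED.get):
    print(f"{name}: committed {days_ahead(name)} days before first public use")
```

This prints 56, 55, and 77 days respectively, so each file predates the public release of its problem by roughly two months or more.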

After June 11, NaokiLH continued to study LeetCode and regularly checked in with AcWing. Until the Shenyang Regional Contest offline competition on July 18, no other suspicious commits appeared.

On July 18, the 2020 ICPC Shenyang Regional Contest offline competition officially began. The TS 1 team ultimately won a silver medal. The second wave of public controversy began.

https://board.xcpcio.com/icpc/2020/shenyang

The final leaderboard of the 2020 Shenyang contest (held on July 18, 2021)

Is this the end of the story? Not quite.

On July 23 (five days after the Shenyang contest), NaokiLH made a mysterious commit: 423423 · NaokiLH/algorithm_trans@1fb8f50. In this commit, we can see that the original A, B, F, and H code files under the blue_book/sh directory have been deleted and replaced with files prefixed with "tempo".

NaokiLH's commit record on July 23, 2021

These files have disorganized names. After analysis, we found the following correspondences:

  • tempo1.cpp corresponds to the scrapped Shenyang contest problem Jailbreak, which has not been used to date.
  • tempo2.cpp and tempo3.cpp both correspond to the 2020 Shenyang contest problem 103202K - Scholomance Academy, and their code styles and implementations are completely different.
  • tempo4.cpp, tempo5.cpp, and tempo6.cpp all correspond to the 2020 Shenyang contest problem 103202H - The Boomsday Project. These three pieces of code are completely different. tempo4 corresponds to the version before the data range adjustment, while tempo5 and tempo6 correspond to the version after it. Both tempo5 and tempo6 pass the sample cases. It is worth noting that tempo5.cpp outputs wrong answers on some data sets, while tempo6.cpp passes all the data.
  • tempo7.cpp corresponds to the 2021 NowCoder Summer Multi-School Training Camp 8: H. Scholomance Academy. This problem was also one of the scrapped problems of the Shenyang contest and was later used in the 2021 NowCoder Summer Multi-School Training Camp 8 on August 9, 2021. Since it is a paid competition that requires registration to view the problems, we also provide a preview of the problem statement here:

Shenyang contest scrapped problem Scholomance Academy overview

We can see that the input and output formats are consistent, and the code passes only the sample cases. It appears to have been written just to match the samples.

  • tempo8.cpp corresponds to the 2021 NowCoder Summer Multi-School Training Camp 8: B. Dohna Dohna. This problem was also one of the scrapped problems of the Shenyang contest and was later used in the 2021 NowCoder Summer Multi-School Training Camp 8 on August 9, 2021. Similarly, since it is a paid competition that requires registration to view the problems, we also provide a preview of the problem statement here:

Shenyang contest scrapped problem Dohna Dohna overview

We can see that the input and output formats of the code and the problem are consistent, and the code passes some data besides the sample cases.

Although the commit 423423 · NaokiLH/algorithm_trans@1fb8f50 on July 23 is later than the 2020 Shenyang contest date of July 18, we can still reasonably raise the following questions:

  • Why do tempo1, tempo7, tempo8, and tempo9 correspond to scrapped problems of the Shenyang contest (which have not appeared in the official contest)?
  • Why do tempo2 and tempo3, as well as tempo4, tempo5, and tempo6, show different code styles for the same problem?
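For readers who want to put a rough number on "different code styles for the same problem", one simple option is a token-level similarity ratio. The sketch below uses Python's standard `difflib`; it is not the PDS system mentioned earlier and not the method used for this article, and the two C++ snippets are hypothetical stand-ins rather than excerpts from the repository.

```python
import re
from difflib import SequenceMatcher

# Identifiers, integer literals, or any single non-space symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def tokens(source):
    """Split source code into tokens, discarding whitespace and layout."""
    return TOKEN.findall(source)

def similarity(code_a, code_b):
    """Return a ratio in [0, 1]; 1.0 means identical token sequences."""
    return SequenceMatcher(None, tokens(code_a), tokens(code_b)).ratio()

# Two hypothetical snippets that read an integer in different styles.
snippet_a = 'int n; scanf("%d", &n);'
snippet_b = "int n; std::cin >> n;"

print(round(similarity(snippet_a, snippet_a), 2))
print(round(similarity(snippet_a, snippet_b), 2))
```

A low ratio between two accepted solutions to the same problem is only a hint that they came from different hands; renamed variables or reordered statements also lower it, so the number should be read alongside a manual comparison.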

On July 29, NaokiLH deleted all the codes under the blue_book/sh directory: 12313 · NaokiLH/algorithm_trans@9e7e573

NaokiLH's commit record on July 29, 2021

After July 29, everything returned to normal. NaokiLH started learning Kruskal's algorithm and participated in LeetCode training. This concludes the main evidence presentation in this article.

Following the timeline, let's analyze and reconstruct the likely course of the Shenyang incident:

In May, NaokiLH obtained the leaked Shenyang problems. These included at least four scrapped problems that did not appear in the official contest (Robots, Dohna Dohna, Scholomance Academy, and Jailbreak), as well as problems H (103202H - The Boomsday Project) and J (103202J - Descent of Dragons) from the official Shenyang contest. The leaked content contained only problem statements and examples, without solutions or reference code.

From May 21 to June 11, NaokiLH obtained code for these problems through research, help from others, or paid ghostwriting. Given his own skill level and possible negligence by the ghostwriters, the correctness of this code could not be guaranteed. After some consideration, NaokiLH decided to upload part of the code to GitHub as a backup.

On July 18, the Shenyang Regional Contest officially began. The TS 1 team tried to submit the pre-written code; having obtained solutions in different styles for problems H and K in advance, they submitted them, passed, and secured a silver medal. The second wave of public controversy began.

On July 23, after some consideration, NaokiLH decided to back up all the code used on GitHub.

On July 29, NaokiLH, who still felt something was wrong, deleted all Shenyang-related code in the blue_book/sh directory of the GitHub repo.

Conclusion

Although the timeline estimates may be somewhat off, based on this content we have reason to believe that the TS 1 team, through some means, obtained a version of the problems before the Yinchuan and Shenyang contests in 2021, attempted to complete the code in advance to secure their results, and used GitHub to manage the files. This is undoubtedly a very malicious form of cheating, an insult to programming contests, an insult to the spirit of ICPC, and an insult to every programming enthusiast in the country.

We are well aware that at this point, even such strong evidence cannot change anything. After two years of silence, NXIST has started organizing competitions again. Even so, we do not want this matter to be silenced. We believe that the Internet has a memory, and that no conscientious contestant should forget such an incident: they should engrave it in their hearts as a deep humiliation, take action to resist such behavior, and resist those who commit these acts repeatedly and fearlessly. If anyone is willing to pursue legal or other means to sue or report, we are more than willing to help.
