generaleoley's blog

By generaleoley, history, 4 years ago, In English

I recently got access to GitHub's Copilot, and it struck me that it could generate a lot of algorithms that someone might not know, or even solve some of the simpler problems with ease. Although this technology is very new, I think it's worth adding a rule restricting or prohibiting the use of certain AI code completers. From what I'm seeing in the development of this field, it's only a matter of years before someone trains an AI that can perform in coding contests at an LGM level.

Edit: To clarify, I don't believe it will be Copilot that gets trained to this level. LGM is an exaggeration, but a Master-level AI is certainly possible, because the easier problems are to a large extent pattern recognition.

Examples to demonstrate the power of Copilot:

1. Finding Maximum Subarray
2. Find 2 elements in an array that sum to N
3. Coin Change
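
For readers who don't know these tasks, here is a minimal C++ sketch of what standard solutions to them look like. This is not Copilot's output, which isn't reproduced here. The function names are made up for illustration, and "Coin Change" is assumed to mean the minimum number of coins needed to form an amount, which is only one common reading of that problem.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Kadane's algorithm: largest sum over all non-empty contiguous subarrays.
long long max_subarray(const std::vector<long long>& a) {
    assert(!a.empty());
    long long best = a[0], cur = a[0];
    for (std::size_t i = 1; i < a.size(); ++i) {
        cur = std::max(a[i], cur + a[i]);  // extend the run or restart at i
        best = std::max(best, cur);
    }
    return best;
}

// Two elements that sum to target: returns indices (i, j) with i < j,
// or (-1, -1) if no such pair exists.
std::pair<int, int> two_sum(const std::vector<long long>& a, long long target) {
    std::unordered_map<long long, int> seen;  // value -> earliest index
    for (int j = 0; j < static_cast<int>(a.size()); ++j) {
        auto it = seen.find(target - a[j]);
        if (it != seen.end()) return {it->second, j};
        seen.emplace(a[j], j);  // keeps the first index for repeated values
    }
    return {-1, -1};
}

// Coin change (assumed variant): fewest coins summing to amount, -1 if impossible.
int coin_change(const std::vector<int>& coins, int amount) {
    assert(amount >= 0);
    const int INF = INT_MAX;
    std::vector<int> dp(amount + 1, INF);  // dp[v] = fewest coins to form v
    dp[0] = 0;
    for (int v = 1; v <= amount; ++v)
        for (int c : coins)
            if (c > 0 && c <= v && dp[v - c] != INF)
                dp[v] = std::min(dp[v], dp[v - c] + 1);
    return dp[amount] == INF ? -1 : dp[amount];
}
```

All three are textbook routines of a dozen lines or so, which is roughly the scale of code a completion tool can plausibly reproduce from its training data.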

I couldn't really make it write out many advanced algorithms, but this is sure to change in the future. So far I find it's best at very simple functions (check if a number is a power of 2, convert a string to an integer, check for duplicates, find the minimum, etc.).
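
To give a sense of scale, the kind of "very simple" helper meant here might look like the following C++ sketch. The names are hypothetical and this is hand-written illustration, not something Copilot generated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// A power of two has exactly one set bit, so clearing the lowest set bit
// (x & (x - 1)) must leave zero. Zero itself is not a power of two.
bool is_power_of_two(std::uint64_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

// True if any value occurs more than once.
bool has_duplicates(const std::vector<int>& a) {
    std::unordered_set<int> seen;
    for (int v : a)
        if (!seen.insert(v).second) return true;  // insert fails on a repeat
    return false;
}

// Smallest element of a non-empty vector.
int find_minimum(const std::vector<int>& a) {
    assert(!a.empty());
    return *std::min_element(a.begin(), a.end());
}
```

Each of these is a few lines long and appears in countless public repositories, which may be why a completion model handles them well.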

Personally, I'll be coding without Copilot in competitive programming, because I believe using it is just another form of cheating and doesn't help me get better. Regardless, I want to hear what the community thinks about this and whether it should be banned.

  • Vote: I like it
  • +27
  • Vote: I do not like it

»
4 years ago, # |
  Vote: I like it +1 Vote: I do not like it

GitHub's Copilot is amazing; we can make good use of it to learn new algorithms.

»
4 years ago, # |
  Vote: I like it +27 Vote: I do not like it

What really matters is being able to reduce the problem to something you can solve. If a problem reduces to finding the maximum subarray, you are allowed to use the internet even without Copilot. I believe Copilot won't help with solving CF problems, though it might be able to solve some well-known ones.

»
4 years ago, # |
  Vote: I like it +66 Vote: I do not like it

Setting aside all the licensing issues that arise from using Copilot (a severe handicap when using it for coding competitions, since on most judges code is free to be used elsewhere, and Copilot apparently doesn't care enough about the license of the code that its "generative" model copies a snippet from to provide proper attribution), here's what I think about it:

  1. From what I've seen, it seems that Copilot was trained on Leetcode submissions as well, which means the generated code is essentially copy-pasted from its corpus, and there is nothing really "intelligent" about it in the sense of competitive programming (though it is still a great tool from an ML perspective). If you know what you want to implement, then you might as well Google it or keep it in your own code library, so that's not really something that changes with Copilot.

  2. There were some projects that worked on a much harder problem (for instance, this), and it seems like there's still a long way to go before machines can compete without copy-pasting code. AI competing at an LGM level sounds almost impossible at the moment, since competitive programming requires comprehension, ad hoc analysis of the problem, and general problem solving, so if there does exist a solution for this subproblem of the AGI problem, then the jump to achieving AGI from that point should intuitively be smaller. The reason why game-playing is so conveniently treated using AI is that game-playing has a very definite structure. One could argue that someone could list down all ideas that they have ever used in a CP problem, and then train an AI using that, but well, I don't see any such thing of this sort out there :).

All-in-all, I feel that AI code-complete is mostly a more convenient (but legally dangerous) form of copy-paste, and it doesn't make sense to add a new rule for it. However, I believe users of the service should be well-aware of the potential licensing pitfalls of the code that's generated, and that should be fine.

»
4 years ago, # |
  Vote: I like it +37 Vote: I do not like it

The problems you've listed are classical, so it's just regurgitating something it has already seen. It's an overglorified Google search, except the code it generates isn't even correct, and when it is, it's still in the style of gray/green users (because that is what is most represented in the training data).

"it's only a matter of years before someone trains an AI that can perform in coding contests at a LGM level."

Great, if that day comes we can all quit programming and become product managers.

»
4 years ago, # |
  Vote: I like it +3 Vote: I do not like it

Shahraaz You cheamter XD XD

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +65 Vote: I do not like it

    Copilot trains on public code. According to the Codeforces rule on third-party code, we can use such code when:

    1. the code was written and published/distributed before the start of the round,
    2. the code is generated using tools that were written and published/distributed before the start of the round.

    So, in my opinion, Copilot is legal. If it is not, I would like confirmation from Codeforces Headquarters.

    Also, GitHub Copilot isn't that great at all. My CPP snippets already cover the most common algorithms, and snippets are faster than Copilot. I tried using Copilot for non-trivial things (like LCA), but man, it is dumb.

    Here is a meme describing my (& Gin chan's) opinion on CoPilot. Credits to pritishn

     Github Co-Pilot in a nutshell

»
4 years ago, # |
  Vote: I like it +9 Vote: I do not like it

Wait, about the examples you provide to demonstrate the "power" of Copilot: was the AI told to "write an algorithm for finding maximum subarray" and did it produce that code from scratch by actually solving the problem, or had it already been fed that data, making it just a more convenient form of Googling "finding maximum subarray"?

If it's the second case, how did you conclude it can perform at an LGM level? This AI can't "think". It can't even perform at grey level. Or even at chimpanzee level for that matter.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I personally was thinking that someone could feed an AI the entire Codeforces problemset and a bunch of AC solutions and let it develop a sort of problem-solving ability.

    Our current technology isn't yet at this stage but I'm confident something like this is to come.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +54 Vote: I do not like it

      If some entity is intelligent enough to solve CP problems at that level, cheating on CF will be the least of our worries xD.

»
4 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

I don't think that's going to make much of a difference. In a contest, if I cannot solve a problem, it is for one of two reasons:

(a) I lack the requisite theory.

(b) I lack the insight.

In every single contest I have taken, virtual & not virtual, it has never been because of (a). There are actually two exceptions, but in both those times, there was an alternative solution using elementary techniques, so I don't count those as 'true exceptions'.

What this really demonstrates is this: Codeforces relies heavily on insight, and knowing the implementation of standard algorithms won't help. To give an example, take the most recent round (CF Edu Round 111). None of A, B, C, D, or E could have been solved with GitHub's Copilot. Sure, some problems are more creative than others, but even when a question hinges on a hard algorithm, a hard part of the problem is always reducing it to that algorithm. GitHub Copilot cannot do that.

Now, there is a point to be made for people who are not well-versed in standard algorithms, and maybe for those people, copilot may be helpful. But these people are not the majority, nor will they ever make up a significant part of Codeforces.

Perhaps there is a case to be made that on LeetCode, Copilot may help a fair amount. I'm still not convinced about that either, but I can see how the point can be made.

Anyways, I'm still not sure that copilot will be a competitive advantage.

Just my two cents.


Even if my points above are not sufficient, Google exists and is allowed on Codeforces. For well-known algorithms, a Google search suffices anyway.

»
4 years ago, # |
  Vote: I like it +10 Vote: I do not like it

"it's only a matter of years before someone trains an AI that can perform in coding contests at a LGM level."

GitHub Copilot is just a high-level autocomplete for now; it is designed for software engineering, not competitive programming. It can solve some easy problems because someone has written them before.

There are some AIs for solving competitive programming problems, but we don't need to worry that someone will use them to cheat. Their accuracy and speed in solving algorithmic problems are much lower than most humans'.

BTW, if an AI can write code by itself to solve hard Codeforces problems (like 3000+), it will also be hard to check whether a submission was written by a human.

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

To clarify my idea in general: I'm not saying that what it can do right now is very powerful. At most, right now it can make you more efficient and help you recall certain details of classical algorithms. I was considering the future implications: with enough further development, it could start to parse the question and "solve" the problem, which I strongly feel is coming in the not too distant future. This is the type of thing I'd expect to see from GPT-6 or something, but it doesn't seem impossibly out of reach.

»
4 years ago, # |
  Vote: I like it +29 Vote: I do not like it

There are a lot of things that could happen in the future; no one can say. As of now though, I don't think Copilot is even a proof-of-concept of AI being able to 'solve' problems. It mostly just removes the 'google-choose-' steps from the 'google-choose-copy' workflow for finding pretty classical implementations. Especially in CP, I don't think it is very useful for people with even a non-beginner level of experience. I'd rather write my own implementation of such short code than expect it to write something and then tinker with it to match the problem at hand or my coding style.

If it's something more involved than finding max subarray sum (which it probably can't do at the moment) / something decent competitors would rather not write themselves — that's exactly why people maintain libraries with their style of coding.

»
4 years ago, # |
Rev. 2   Vote: I like it +3 Vote: I do not like it

Some Info

OpenAI paper that describes testing of the model (Codex) that powers Copilot.

Researchers have trained Codex and its descendants, which power GitHub Copilot, on a dataset of competitive programming questions known as APPS.

Download APPS dataset to know more.

»
4 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

I do not think it's cheating :)

»
4 years ago, # |
  Vote: I like it +9 Vote: I do not like it

What I have been wondering though: does Copilot learn from your code while you're using it? So if at least two contestants are using Copilot in an active contest, could one "steal" code from the other?

https://xkcd.com/2169/

»
3 years ago, # |
  Vote: I like it +10 Vote: I do not like it

Ha, this was just 7 months ago.