- Disclaimer: This article was originally written by zigui. I'm just translating the original Korean article into English, but I fully agree with it.
On Day 1 of IOI 2017, there was an output-only task, "nowruz". When I first saw this problem in the GA meeting, I couldn't see the input files (which I think are very important for an output-only problem). I expected there would be various types of inputs: totally empty, almost empty, almost full, and so on. When the problem was revealed I could finally look at the I/O dataset, and it was not what I expected. Almost all of the input cases are almost empty. This means there was an important condition on the input data that was not mentioned in the problem description.
The score distribution of contestants on "nowruz" was very strange. It had no correlation with contestants' skill, previous records, scores on other problems, or anything else. Students who got a low score on this task said: "I thought about a heuristic solution (which turned out to give an 80+ partial score), but I thought that solution would give me a low score." In fact, a student with good mathematical intuition would never expect this "heuristic" solution to score well in the general case. So they didn't try it, and got a low score. It is certainly not easy to build an algorithm that performs well on general inputs.
I think output-only tasks are not appropriate for the IOI, for several reasons. First, output-only problems involve few logical steps, which contradicts the IOI's goal of evaluating students' logical thinking. For example, consider two easy problems:
- What is zigui's Codeforces rating? 1) 1234 2) 2345 3) 2456 4) 2488
- 1 + 2 = ? 1) 1 2) 2 3) 3 4) 4
In the first problem, there is no clue to work from. If you don't know zigui (probably true, right?), it's just a 25% guess. In the second problem, everybody knows the answer is 3).
Both problems have one statement and four possible answers, but they are very different in nature. The reason is simple: the second problem has logical steps leading to the answer, but the first one doesn't. Everyone knows that 1 + 2 = 3; it's well known, can be proved, and rests on mathematical knowledge everybody shares. In the first problem, nobody memorizes zigui's rating, and it's impossible to reach the answer through logical steps.
Although this is a bit exaggerated, I believe output-only problems are not so different from this case. You have to "find out" some important conditions. They are not in the description, only in the input files. There may be more conditions or there may not, but if you can find them, you will get a high score. No mathematical proofs are required to solve the problem. Even worse, it's better not to prove anything if you want to perform well in the contest!
In case of "nowruz", there were no special description about limitation about K. It just said "at least one possible solution exists." There are no additional talk about conditions. For example, all input datas were composed of only one connected component, which was not the case in sample input. If you are writing papers about heuristic algorithms, you have to be very careful in time-complexity proof. You have to "prove" that this heuristic algorithms give almost correct result in almost every cases. But in "nowruz", it needed no mathematical analysis, because there were actually no conditions. Analyze what?
The second problem with output-only tasks is that the only way to solve them is trial and error. In science, logical steps matter. It is not science to merely show how to do something and "prove" it by the results. In output-only problems you can't know how a solution performs before you actually code it, and it's very hard to prove anything mathematically. This means contestants must "gamble" on whether a solution will score well.
2013 IOI's "artclass" problem will be a good example. If you don't try, you can't predict results. (Translator notes: Art class is task like this, "Get an painting as input file, and find out what's the style of this painting, between modern art, landscapes, expressionist action paintings, and colour field paintings." This problem had a lot of issues back then, and current IOI syllabus explicitly exclude this kind of problems. you can see problem in this link : http://www.ioi2013.org/wp-content/uploads/tasks/day1/artclass/artclass.pdf) There was an 100 point solution which only uses G values (among RGB values). If you say to artists "This painting has less green color, so it seems like modern art!", they will think you an idiot. But if you are in output-only problems, it's perfectly fine idea. If you try it, you can get a score, even though it's correct solution or not.
I think "nowruz" is also the same case. It seems that they wanted many high scores (possibly 100+ contestants), but in contestants' view, it's impossible to try every possible ideas in their head and get a high score in 5 hours. If then, it's only about who tried, and who didn't.
For these reasons, I think output-only tasks like "nowruz" don't fit the IOI at all. If the organizers want to give this kind of task, they should include heuristic algorithms in the IOI syllabus, state all information about the input data in the problem description, and give input restrictions that make the problem "solvable". The solution itself should also be clear, reachable by clear logical steps. If they can't do that, it's better to give such problems to some other competition. At the least, it's not science at all.
Looks like Koreans performed very poorly on this problem; I suspect that's the main reason you don't like it.
There is a very simple and intuitive BFS solution that gets 80+ points, and it's up to you to keep improving it or switch to other problems.
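For illustration, here is a minimal sketch of that kind of "grow an induced tree" BFS heuristic. This is my reconstruction under assumptions (simplified I/O format from memory: '.' = empty, '#' = bush, 'X' = newly placed bush), not the commenter's actual code; randomizing the expansion order gives the "random tree" variant mentioned further down the thread.

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n, m, k;                       // k = target number of leaves (unused here)
    cin >> n >> m >> k;
    vector<string> g(n);
    for (auto &row : g) cin >> row;

    const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
    vector<vector<char>> kept(n, vector<char>(m, 0));
    queue<pair<int, int>> q;

    // Seed the tree with the first empty cell.
    for (int i = 0; i < n && q.empty(); i++)
        for (int j = 0; j < m && q.empty(); j++)
            if (g[i][j] == '.') { kept[i][j] = 1; q.push({i, j}); }

    // Grow: a cell may join only if it touches exactly one kept cell,
    // which keeps the induced subgraph acyclic (i.e., a tree).
    while (!q.empty()) {
        auto [x, y] = q.front(); q.pop();
        for (int d = 0; d < 4; d++) {
            int nx = x + dx[d], ny = y + dy[d];
            if (nx < 0 || ny < 0 || nx >= n || ny >= m) continue;
            if (g[nx][ny] != '.' || kept[nx][ny]) continue;
            int cnt = 0;
            for (int e = 0; e < 4; e++) {
                int ax = nx + dx[e], ay = ny + dy[e];
                if (ax >= 0 && ay >= 0 && ax < n && ay < m && kept[ax][ay]) cnt++;
            }
            if (cnt == 1) { kept[nx][ny] = 1; q.push({nx, ny}); }
        }
    }

    // Empty cells left outside the tree become new bushes.
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < m; j++)
            if (g[i][j] == '.' && !kept[i][j]) g[i][j] = 'X';
        cout << g[i] << '\n';
    }
}
```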
You are given the input files; they are not hidden. You could open any file in an editor and see that the grid is very sparse.
I mean... you both make valid points. Best to leave your first paragraph out of the discussion.
Well, I don't believe the fact in that first paragraph is just a notorious coincidence.
I will add that the scoring function was rather poor, since making improvements to an 85-point solution was not rewarding at all. It would have been better if the score were at least (score/target)^2 or (score/target)^3, so that the difference between the initial and the best solutions was at least 30 points.
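To make the effect concrete (my arithmetic, assuming points are awarded as a power $p$ of the achieved fraction):

$$\text{points} = 100 \cdot \left(\frac{\text{score}}{\text{target}}\right)^{p}$$

With $p = 1$, improving the ratio from $0.85$ to $0.95$ gains only $10$ points; with $p = 2$ it gains $100\,(0.95^2 - 0.85^2) = 18$ points; with $p = 3$ it gains $100\,(0.95^3 - 0.85^3) \approx 24$ points, so the same improvement is rewarded noticeably more.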
Do you say "the contest sucks" after every contest that goes badly for you? At least I try to say so only when a problem is really bad. I know Koreans performed very poorly on this problem, and I also think this problem sucks. So what?
Talking about real points: is that "simple and intuitive" solution correct for every possible case? I believe not; it just gives good answers on those particular input files.
So it all comes down to the input files' properties. Is there any statement that the grid is very sparse, that it forms one connected component, or anything else? There was no such description, so we have to figure it out by looking through all the input files and hunting for properties. Is that a good reason to give output-only problems? Are we doing the IOI, or a treasure hunt?
That the grid forms a connected component is actually easily derived from the fact that there would be no valid output otherwise (you are supposed to print a single tree, and you can only remove vertices).
I think the grid may not form a single connected component, because we can erase all the vertices outside one chosen component. (The example shows a case with two connected components.) But yes, even with many connected components, the problem's difficulty doesn't change much.
Oh, you are right! I didn't think about that. Thanks.
Yes, problem difficulty wouldn't change at all, as you would have to consider only one component to build the maze, and fill the others with bushes.
I participated in IOI 2013 and was surprised by the artclass task during the contest. I had studied image processing before the contest, and I got an 80+ score. But as you can see, I wrote that artclass does not fit the IOI. (I think artclass is the worst task ever given at the IOI.)
If you call a problem good or bad depending only on whether you got a great score on it, the discussion becomes biased. Please don't bring results into this discussion.
Moreover, an input file can have special properties, but that does not always lead to a solution (example). You, and almost everyone else, solved the special cases of the Nowruz inputs. I just want the statement to include a specification of the input files (whatever the author's solution relied on). I think this would make the problem more logical.
What is zigui's Codeforces rating? 1) 1234 2) 2345 3) 2456 4) 2488
Obviously 3) or 4) since he's red
Wow, you already have 70+ on P1; the solution is so intuitive!
That logic works around 97% of the time. :)
This approach is very nice. I used this example because the original article does not show my CF colour, unlike a Codeforces post. Please be kind about my poor example. :)
A counterargument one could make is that output-only tasks arguably level the playing field much more (and hence test "logical skills" far better) than other tasks, which significantly advantage people who have done lots of preparation for this contest.
(I don't think this is actually true — I think both types of tasks advantage people who are familiar with similar problems more than anything else.)
I will express an opposing opinion. I am biased since I did well on this problem and I also got 100 on Art Class back in 2013 and I absolutely loved it. However, I think that you are biased in the same way, just in the opposite direction.
I think that output-only problems or generally marathon-type problems are absolutely awesome and should be used more often in IOI.
The sample inputs are provided to the contestant, so "an important condition about input data that was not mentioned" just doesn't sound right to me. It is right there in front of you; that's part of the problem: playing around with the specific test cases you have to solve, not the general case. An example that is much more extreme in this regard is Forbidden Subgraph from IOI 2006 (Mexico). The sample tests in that output-only problem were extremely specific, and this was not mentioned. I would agree with arguments against that problem, but in nowruz you could just open the files and see that they are mostly empty grids, with two of them completely empty.
Another argument is that the results didn't correlate with the expected ones; that's exactly why I like these problems. They are different, they require a different way of thinking, and hence people who solve only regular problems often find it hard to get into the right mindset. Now, if somebody had an intuition that a solution would perform badly when it would in fact have performed well, that is clearly the contestant's own fault for having bad intuition. I explained my solution to this problem, and it couldn't be more intuitive.
Output-only problems and generally marathon-type problems offer very different requirements. You have to think of the average case (in marathon-type problems) or only the cases you are provided with (in output-only problems). You have to get out of the usual mindset of "if I can find a bad counterexample, it's a bad solution". I find these types of problems much more practical and I really like the way you have to approach them. I often participate in competitions with problems of this type and I can assure you that if you are used to the type of problems then you can much more easily guess what would perform well and what wouldn't — it isn't illogical at all.
Whether these types of problems are fit for the IOI is a difficult question. My opinion is 'absolutely yes', simply because they are different. I'd like to think IOI problems are something more than the D/E problems of a regular difficult Codeforces round. I like new ideas and clever tricks that don't require prior knowledge, or, in this case, a different type of problem: a breath of fresh air from the millions of annoying DP optimisations and 3D trees.
That's just my opinion. I've gotten the impression over the last few days that most people on Codeforces agree with you, but I still think people should give it more thought.
Warning! The following text is a joke.
Here is a cool problem: you have to submit a single integer from 0 to 100. If you are the only one who submitted that number, you receive that many points; otherwise you get 0 points. Fits those criteria. :)
Warning! Joke is over.
On a serious note, I still maintain my position that marathon-like tasks need far more time to be "fair". Take the nowruz task, and assume every single IOI contestant works on it under IOI-like conditions for 2 weeks straight. I would count those results as "fair": everyone had a chance to explore multiple ideas, analyze their performance, optimize them, and so on.
Now imagine the IOI had only this task for a day. The question is: would the 2-week results correlate with the 5-hour results for this task? I strongly believe not, and until someone convinces me otherwise I will maintain my position that there is a huge element of randomness involved in marathon-like tasks at the IOI. And when you add the remaining two tasks, which in this case were hard as well, the situation is even worse.
I understand your point, but to me a marathon problem for 5 hours is a completely different thing from a marathon problem lasting several weeks. I approach them very differently, and I like both. I think the random element seems so huge at the IOI simply because too many people are unfamiliar with the format. I do agree that it is bigger than it would be on a regular task or in a long marathon; I just don't think it's big enough to discourage giving such problems at all.
I think you are missing one important point that is easy to miss for an observer, but since you actually participated in the contest, I'm quite surprised you didn't mention it. At the IOI, you have no idea how good your score is during the contest, so you can't adjust your decisions. Given that you don't have enough time to try all your ideas, I'd say this generates pretty random results, and that is not something you want.
I was a contestant at IOI 2010. The first day featured two easy problems and two approximate ones (I believe one of them was output-only). I got 100 on the easy ones and ~80 on the approximate ones, and still had some time left. At that point, I was pretty sure that 80 is an awesome score for an IOI task and that 2x100 + 2x80 is even more awesome. I had an idea for hunting down maybe 1-2 more points in the remaining time. Instead, I tried coding a different approach, didn't finish it in time, and got no increase at all. It turned out that a lot of people scored ~360 that day, and those 1-2 points changed a lot. In fact, I missed gold by just one point.
One more thing about approximate tasks in general is that they spread scores too finely. In the Day 1 thread someone said this is good, but I don't think so. Place 76 (last silver) is at 249.42 points. Place 77 (first bronze) is at 249.01 points. Do you really think their performances differed so much that they deserve medals of different values? Many contests choose medal cutoffs so that the gap is big enough. Problems with too many possible scores remove the gaps from the scoreboard.
I agree that this kind of problem is nice, but I don't think it fits the IOI.
That part about performance being different enough or not sounds weird to me. I think that giving medals for 13th place at the ICPC WF "because they were not too far from 12th" is basically disrespectful to the team ranked 12th; it is like saying "Look, we don't think you actually beat the team ranked 13th; that's why we give them medals as well". In general, I don't like competitions where the rules (including the rules for deciding winners) are not well defined. In sports, a difference of 0.01 seconds can separate a bronze medal from 4th place, or even 0.001 in some racing disciplines. Why do you think it should be different in competitive programming?
I actually think that having more possible scores is better, because it decreases the chance of a complicated situation, such as a big group of people sharing the same score so that any decision about that group makes the number of medals either too big or too small.
Why not think of it as keeping the 12th team where they scored (because that's what happened) and saying to the 13th, "Look, you were almost as good as 12th"? I'd rather say it would be disrespectful if the 12th team didn't receive a medal because the 11th-12th gap was too big, but I haven't heard of that happening.
In most sports, exactly one medal of each colour is awarded, so there is no room for rearrangements. Also, in fencing, for example, they use a knock-out system and award two bronze medals (except at the Olympic Games), so they don't have to compare the two semi-final losers.
Naturally, this is science! Most of the problems that programmers solve in real life are of this type, where a deterministic solution may not exist at all. Most computer scientists write articles and research topics primarily related to real life. Of course, it's not a silly synthetic problem from a typical contest, where "Bob plays with an array of size 2·10^5". We don't need to play with arrays in real life; we want computers to speak and to understand speech, to analyse images and videos, to recognise objects and actions. That's why machine learning is so fucking popular now: it helps to solve problems from real life. The best ML algorithms, ideas and approaches always come with a lot of analysis and mathematical justification. And it is obviously science, even if you can build a counterexample or fool the computer with a leopard-print sofa. So I think such tasks have the right to be present at the International Olympiad in Informatics.
Yes: first you gamble with black-box magic to make it work, and then, once it starts to work, you go into analysis and justification to find some reasonable and persuasive explanation of why it works.
On the issue being discussed: I'm not going to say that such tasks are better or worse than classic problems, but your reasoning doesn't make much sense to me. Science is science and competitive programming is competitive programming; using some problems just because they are "more like real life" or "closer to actual science" does not excuse them not fitting current competitive programming traditions. I'm against bullshit problem statements like "Little John got an array as a birthday present from his parents", but all these rants about making ICPC or IOI closer to "real programming" or "real science" feel misguided. You have research marathons on real-life problems, you have hackathons, you have all sorts of contests; it's fine to try something new at the IOI or ICPC and see how it goes, but it is not OK to say that we have to use certain things just because they are closer to real-life problems.
P.S. I'm not sure from your comment how much you actually support the point of view I described above, so maybe my comment is not all that related to yours ;)
Of course I'm not saying "we have to use some things just because they are closer to real-life problems". Those were just counterarguments. My reasoning is first of all an attempt to refute some of the OP's theses, such as "it's not a science at all", so I say: it's more of a science than other common problems in CP.
BTW, are such problems competitive problems? Yes. Are they programming problems? Yes. So what's wrong?)
He also says that he doesn't see the logic in solutions to such problems, or that some strong red guys perform badly on these tasks; I'd say that's his problem, or the red guys' problem. It's obviously possible to be strong at such tasks and to improve your skill at them, so that you won't think "I don't know which idea will work better"; you'll think "I know this idea is shit, and this one will work", or "I'll code a solution that makes it possible to test different ideas."
He also asks a very simple question: "Analyze what?" The obvious answer: analyze the tests, analyze the data. There is nothing shameful in this =) With such tasks you do not need to box yourself into rigorous proofs. Your task is to get better results: not an ideal 100% result, just the best results you can.
Concerning the contest format, I agree that this type of task is not good for the ACM-ICPC format, because it's hard to make it require as much time to solve as the other problems. But at the IOI, where you have 3 tasks with partial scoring and a lot of time to think and to code, I think it fits very well.
I can see two main problems with tasks of that kind:
1. Inputs have unspecified properties
2. If you come up with a solution, you can't say anything about how good it is until you code it
I think the first one is not really a bad thing, since the input files themselves completely specify the inputs.
But I think that if the second one holds, then it is not an ACM (or IOI) problem. And I absolutely hate these problems. I don't know anything about the IOI problems mentioned in the post or in the comments, but there are other examples: many recent problems by Petr (so to me it looks like a notorious coincidence).
Problem: there are two algorithms to generate a random convex polygon. Given a polygon, determine which algorithm was used to generate it. You should guess correctly in 67% of cases.
Our team (trying to solve this problem):
- OK, let's write these two generators and see if the polygons differ in some way.
(20 mins of coding the generators.) - Well, it looks like one generator produces polygons with slightly greater area and smaller diameter; let's try that.
(another 20 mins of trying some functions and adjusting parameters).
- Our best solution guesses correctly in 67% of cases, but that is not enough to pass all test cases.
Petr, in the editorial: if you use some other functions, you get an accuracy of 72%.
How are we supposed to guess the function? How can you prove (on paper) that it will work better?
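For concreteness, the kind of hand-tuned feature function described above might look like the sketch below. The features (area, diameter, a "fatness" ratio) come from the story; the 0.5 cutoff is invented and is exactly the sort of parameter one ends up adjusting by trial and error.

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n;
    cin >> n;
    vector<double> x(n), y(n);
    for (int i = 0; i < n; i++) cin >> x[i] >> y[i];

    // Area of the convex polygon via the shoelace formula.
    double area = 0;
    for (int i = 0; i < n; i++) {
        int j = (i + 1) % n;
        area += x[i] * y[j] - x[j] * y[i];
    }
    area = fabs(area) / 2;

    // Diameter: largest pairwise distance (O(n^2) is fine for a sketch).
    double diam = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            diam = max(diam, hypot(x[i] - x[j], y[i] - y[j]));

    // "Fatness": polygon area relative to a disc of the same diameter.
    // One generator supposedly makes fatter polygons; the cutoff is made up.
    double fatness = area / (M_PI * diam * diam / 4);
    cout << (fatness > 0.5 ? 1 : 2) << '\n';
}
```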
I think that problem solving should almost always be doable on paper. When you have a solution, there should be a way to tell whether it is correct (without submitting).
All of this holds for some brute-force problems too. "Let's enumerate the 100! possibilities with some pruning" cannot be the model solution, full stop.
"I thought about heuristic solution (which turned out to give 80+ partial score), but I thought that solution will give me low score." — huh? For me if somebody says it, then he just don't know how to solve this kind of problems and it isn't suprising that he got low score :/ For me it's bad, that most of people on IOI is only doing binary tasks as a practicing. Poland invites everybody for contests like deadline24, marathon24. Also codechef every month has long contest with marathon-style task.
I'm not saying that I like "nowruz", because the simplest solution, generating a random tree (without looking at leaves during generation), gives ~88 points. There are two options: 1. The jury wanted it to be a simple task on which many people score highly; then OK. 2. The jury thought it would be hard to score highly; then it's their mistake.
But if somebody didn't know that he should try various solutions and then see which was best, then he isn't familiar with this type of problem. That is similar to not being familiar with geometry/string/graph problems, and it means the contestant wasn't well prepared.
"It's similar to not being familiar with geometry/strings/graphs problems, and means that contestant isn't prepared well." — you're forgetting fact that such tasks don't belong to IOI. Marathon-style problems are included in no national olympiads, people are not training on them and even though some like them, common agreement seems to be closer to fact that they don't fit IOI. I like these tasks, see many of their advantages, but IOI is not the right place for them. Don't force people to compete on tasks they do not like and are not trained for them (and before comparing it to strings etc think it through once again). If somebody likes them then let him compete in deadline24, marathon24, TopCoder marathon etc. but let's keep pure-standard-format contests that should remain pure-standard-format.
Why should the IOI be a pure-standard-format contest? It's not a regular CF Div. 1 round; it's the International Olympiad in Informatics, and even the official syllabus says that such tasks may be used.
"Nowruz" remains an algorithmic task, only with changed scoring system. Contestant do not need some special knowledge or training for it, in fact. He just need to adapt to problem statement, and do what the task want from him.
"Artclass" also does not go beyond the informatics and programming. Logic and programming skills are still needed to solve it.
And as I said before, marathon-style problems fit the IOI's partial scoring system well.
Just as we train on regular IOI problems, a little training can make you much better at MM-type (marathon) problems.
The most important thing is iterating quickly. You can spend ~2 hours on one problem, so each iteration shouldn't take longer than 10 minutes. Also, there can be a long grading queue at the IOI, so you'll need to build a local tester/evaluator as well.
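(Translator's aside: for nowruz, a local evaluator only needs to check validity and count leaves. A minimal sketch, with simplified scoring and hypothetical command-line usage:)

```cpp
#include <bits/stdc++.h>
using namespace std;

int main(int argc, char **argv) {
    if (argc < 3) { fprintf(stderr, "usage: eval input.txt output.txt\n"); return 2; }
    ifstream in(argv[1]), out(argv[2]);
    int n, m, k;
    in >> n >> m >> k;
    vector<string> a(n), b(n);
    for (auto &s : a) in >> s;
    for (auto &s : b) out >> s;

    const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
    long long V = 0, E = 0, leaves = 0;
    int sx = -1, sy = -1;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++) {
            // Original bushes must be untouched; '.' may only turn into 'X'.
            if ((a[i][j] == '#') != (b[i][j] == '#')) { puts("invalid: bushes changed"); return 1; }
            if (b[i][j] != '.') continue;
            V++; sx = i; sy = j;
            int deg = 0;
            for (int d = 0; d < 4; d++) {
                int ax = i + dx[d], ay = j + dy[d];
                if (ax >= 0 && ay >= 0 && ax < n && ay < m && b[ax][ay] == '.') deg++;
            }
            if (deg <= 1) leaves++;
            if (i + 1 < n && b[i + 1][j] == '.') E++;  // count each edge once
            if (j + 1 < m && b[i][j + 1] == '.') E++;
        }

    // Connected with exactly V-1 edges  <=>  the empty cells form one tree.
    long long seen = 0;
    if (sx >= 0) {
        vector<vector<char>> vis(n, vector<char>(m, 0));
        queue<pair<int, int>> q;
        q.push({sx, sy}); vis[sx][sy] = 1;
        while (!q.empty()) {
            auto [cx, cy] = q.front(); q.pop(); seen++;
            for (int d = 0; d < 4; d++) {
                int ax = cx + dx[d], ay = cy + dy[d];
                if (ax >= 0 && ay >= 0 && ax < n && ay < m && b[ax][ay] == '.' && !vis[ax][ay]) {
                    vis[ax][ay] = 1; q.push({ax, ay});
                }
            }
        }
    }
    if (V == 0 || seen != V || E != V - 1) { puts("invalid: not a single tree"); return 1; }
    printf("leaves = %lld, simplified score = %.3f\n", leaves, min(1.0, (double)leaves / k));
}
```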
But this kind of practice feels like regular industry work, not research. So I generally don't like these problems at the IOI.
Does anyone have experience with the experimental problems at IPhO or IChO? I want to know how they work and whether they're similar to output-only problems at the IOI.
PS: These kinds of problems are prevalent at the IPSC and are the favourites of my team of thirty-somethings :) So, another proof that they don't fit the IOI.
I don't understand what you guys mean by "these tasks don't fit the IOI". What is your definition of "fitting the IOI"? Such tasks have been around (that is, have appeared at the IOI) for quite some time; this was not the first year to feature an output-only problem. Can I, by the same logic, say that grass doesn't fit football because I have always played on sand and thus am not used to it?
And don't think that such problems don't appear at national olympiads; I can guarantee the opposite for the Bulgarian and Romanian OIs. IMO, this makes a contest more beautiful.
For all of you who say, "You don't know if what is on your mind will work, so you just come up with random things hoping they give you a high score": please read one of the many comments by Encho where he explains his thinking process on this particular problem, and rethink your point.
"K but there are many examples of marathon/oo problems where pretty random things give unexpectedly high scores", you may say. This clearly sucks but I think this is just an example of bad problems, not a case of bad type of problems. For example, if you solve data structures problems which ask you to code a 2d treap for N-th time but this time replace sum with max/gcd, you will eventually get tired. This doesn't mean there are no data structures problems with beautiful ideas which don't ask you to code persistent Aho-Corasick.
So do not exclude a whole category of problems just because of the few bad examples you've seen.
Can I, by the same logic, say that grass doesn't fit football because I have always played on sand and thus am not used to it?
I believe it is like the difference between mini-football and normal football. At first sight they may look similar, but in reality they are completely different. Indeed, the ball used in mini-football cannot be used in normal football (confusing, isn't it?). Also, football stars like Messi and Ronaldo wouldn't play mini-football as well as they play normal football.
To me, output-only tasks at the IOI are like giant goals in mini-football. Everyone can shoot from far away and score...
Grass is part of football and has always been such (at least in what we recognize as football today).
My point was that output-only and marathon problems are also part of the IOI and have been for quite some time. This is the IOI. Like it or not, this is it: such tasks have appeared, do appear, and I hope will appear in the future. So "this doesn't fit the IOI" sounds like bullshit to me, since it's not some new element to be added; it is, and has been, a part of it.
Now here comes the question "Can we redefine IOI to make it better?"
Sure, there seems to be nothing wrong with trying to make the IOI better. But there needs to be a good reason to do so. As I said, don't hate a whole type of problem just because of some bad examples you have encountered.
Since this " For all of you who say "You don't know if what is on your mind will work and should come up with random thing, hoping it will give you high score", please read one of the many comments by Encho where he explains his thinking process on this particular problem and rethink your point. " was a final straw for me, I link my comment which is relevant http://codeforces.net/blog/entry/53550?#comment-377159