Problem Parser Contest

#	User	Rating
1	jiangly	3976
2	tourist	3815
3	jqdai0815	3682
4	ksun48	3614
5	orzdevinwang	3526
6	ecnerwala	3514
7	Benq	3482
8	hos.lyric	3382
9	gamegame	3374
10	heuristica	3357

#	User	Contrib.
1	cry	169
2	-is-this-fft-	165
3	Um_nik	161
3	atcoder_official	161
5	djm03178	157
6	Dominater069	156
7	adamant	154
8	luogu_official	152
9	awoo	151
10	TheScrasse	147

The coding phase has been finished. Please wait for the final testing and results.

Good day everybody.

Everything written below is not a April Fool's joke. Though it is also a funny occasion :)

We are glad to announce a new experimental contest on Codeforces, "Problem Parser Contest".

Have you ever made training sessions, using the previous contests' archives? Have you downloaded from the Internet contest archives to test for your solutions locally? If the answer is 'yes', then you couldn't help noticing that each organizer insists upon inventing his own way (format) to distribute problems. Yes, it is true!

I've prepared and organized a huge number of trainings based on the previous contests. It is often rather a monotonous task; one often has to write some scripts that rename the input/output files of the problem into the standard form. In archives they are called very differently: they can be river.in.1, river.out.1, river.in.2, river.out.2, etc. In another problem they can be tests/1.dat, tests/1.ans, tests/2.dat, tests/2.ans etc.

At some point I got fed up with it and wrote a script that told the patterns of paths to the test files, using some heuristics and the regularity I'd noticed. The script worked well, but it wasn't perfect.

Codeforces is working on functionality to improve the project as a training ground. We plan to give the users the opportunity to prepare and organize training sessions directly on Codeforces. You will prepare the training session yourselves or choose a public one to participate. Have you prepared it yourselves? That's great! Make it available to general public and people will be grateful to you.

We've already done much and worked out some part of the software. There is much to do, many technically challenging moments and simply mundane work. However, I am sure that the community of young and talented programmers can cope with some problems no worse than the Codeforces team. For example, it can be a problem of archive contents' automatical recognition.

In a nutshell, you are given a set of files that relate to that problem. Your program should find a way to pick out test files (input files and output files), solutions and the checker source code. That should be based on the found patterns, heuristic solutions or on something else.

The key point is: the winner's code will be issued into a special small library that will be distributed by the rules of free license. Codeforces will use this code to make mundane processes automatic. You will really help us and your name will be part of Codeforces's history of great deeds and the code you've written will regularly serve the society. Have I persuaded you to take part there yet?

As nearly all the Codeforces code system is written in Java, we expect the solutions to that problem to be on this very language. We don't think you will need some profound knowledge of Java. Basic proficiency in the language should be enough.

We even decided to set up a small prize fund. The prize money will be distributed between five best participants:

1 prize — 280 dollars,
2 prize — 140 dollars,
3-5 prizes — 70 dollars.

Problem Statement

Your task is to write a java program that will be run from the catalogue, containing the files of one or several problems.

The relative paths of all files that relate to the problem in question are listed in the files.lst file, which is located in the root. The separator in paths is character '/' (the UNIX separator).

For example (files.lst):

tester/check_a.pas
tester/check_a.exe
a/tests/02.in
a/tests/01.ans
a/tests/01.in
a/tests/02.ans
a/tests/gen.exe
a/tests/t.bat
a/tree_mm.exe
a/tree_mm.cpp
a/doall.bat
a/problem.xml
a/validator.cpp
a/tree_dm_wrong.cpp

The program is allowed to read and analyze only the files listed in the files.lst.

The program's task is to pick out a set of tests in the given problem (if there are several sets of tests, the program should pick out the official one), the solutions sources (if they are found) and the checking program source code (if it is found).

The result of the work should be output in the standard output stream. The output should contain from one to three sections: the tests' section, the solutions' section and the checker section (only the tests' section is obligatory).

The tests' section should start with the "tests:" string. Then it should contain one or more string, consisting of the actual test's file name and the name of the file of the response to the test. The two names should be separated by a ':' character (a column). The tests should be listed in the order of increasing of their numbers in the original problem. You should only output official tests (if several test sets are found).

For example (the tests' section):

tests:
a/tests/01.in:a/tests/01.ans
a/tests/02.in:a/tests/02.ans

The solutions' section, if it is present in the output, should follow second. The section starts with the "solutions:" string. Then follow lines that contain the names of the files of solutions' sources. The order in which file names follow is lexicographical. Right and wrong solutions alike are regarded as solutions (the wrong ones are also present in the jury archive often).

For example (solutions' section):

solutions:
a/tree_dm_wrong.cpp
a/tree_mm.cpp

The checker section, if it is present in the output, should start with a "checker:" string and it should consist of a single element, the file name with the checker source.

For example (checker section):

checker:
tester/check_a.pas

Thus, the complete output should look like this:

tests:
a/tests/01.in:a/tests/01.ans
a/tests/02.in:a/tests/02.ans
solutions:
a/tree_dm_wrong.cpp
a/tree_mm.cpp
checker:
tester/check_a.pas

All the file names in the output should be in the exactly the same form in which they are represented in the files.lst file.

Your solution will be run with limited used memory "-Xmx32M -Xss1M" (the heap size is 32M, the stack size is 1M). The running time limit is 1 second.

The main class of your program should have the name of ProblemParser and be present in the package by default.

How it will be held

The contest will last for two weeks and will be over on April, 15 at noon. We haven't yet finished writing the system of the contest's organization, and we will improve and expand it soon.

The tabs you already very well know, await you: send, my packages, status and the participants' position. You can either send one file (ProblemParser.java), or a zip archive with your project source code.

This two weeks testing will be held on 30 tests. The first ten of them are open, you can download them (updated). The answers to the tests are available in the archive; they are located in the answer.lst folder. The rest 20 tests are unknown to you. We sort the participants by the non-increasing of points; when participants have the same number of points, they are sorted by the decreasing of time of the last attempt. You can send the solutions as many times as you like, if there will be a queue, we will limit the frequency of sending the packages. We take into consideration the point for the last solution of the problem.

The solution can earn one point on each test. A partial mistake on the test may lead to failing it.

When the main part of the contest is over, we will hold a final testing. For each participant we will test his last attempt on the expanded set of tests. The set may contain 20 new tests and 20 tests (11st-30th) you've seen before.

All the tests are real problems (or perhaps, their slightly modified versions). The problems are taken from various sources — we tried to maximally cover the geography of their origin.

If this experience proves to be fruitful, we will try to expand the experiment. I have in mind many more interesting tasks, solving which will be useful to the world.

May the best programmer win, MikeMirzayanov

UPD: The given archive has been updated. It is recommended to download it again.

Comments (27)

Show archived | Write comment?

ahmed_aly

14 years ago, # |

+12

I used Google translate but I didn't understand everything, but it looks very interesting.
I'm waiting for the English translation.

→ Reply

Hackson

14 years ago, # ^ |

I think Google translate did the job well on translating the idea, at least I know what to do briefly.

By the way, does files.lst in e04 intentionally contain misleading information? I found that there is a folder swe, which contains testdata but the folder does not appear in files.lst, instead it is contained in files.lst.old.

MikeMirzayanov

Yes, we prepared a testdata manually and did a mistake. File files.lst in the test 4 is wrong. It will be fixed today. Thank you.

Nevermind. Here's the list I have used for files.lst for e04:

swe.cpp

swe.pdf

swe.ps

swe/swe0.in

swe/swe0.out

swe/swe10.in

swe/swe10.out

swe/swe1a.in

swe/swe1a.out

swe/swe1b.in

swe/swe1b.out

swe/swe2a.in

swe/swe2a.out

swe/swe2b.in

swe/swe2b.out

swe/swe3.in

swe/swe3.out

swe/swe4.in

swe/swe4.out

swe/swe5.in

swe/swe5.out

swe/swe6.in

swe/swe6.out

swe/swe7.in

swe/swe7.out

swe/swe8.in

swe/swe8.out

swe/swe9.in

swe/swe9.out

Anurag

unofficial partial english version.

A.K.Goharshady

← Rev. 2 →

20 tests (21st-30th)

Do you mean 10 tests or possibly 11th-30th or something else?

11th-30th

lobo	14 years ago, # \| -11 Why can't you make test data by yourself or someone who provide the problems? If I can't solve the problems or can't understand the problems,how can I make test data? Is that a joke? → Reply

checkmate

I don't agree with you. If this line is true

"Everything written below is not a April Fool's joke. Though it is also a funny occasion :)"

then it's not a joke.

And the author is making the test data, of course. This is a real live problem, not an Algorithm problem so the test data must be use from the real live. I think each test is the live case, for example, the output file extestion can be .ou, .o, .ans, or something else, the output and input files aren't in the same folder,..... Personally, I just fast read the whole problem and I think I understand it fairly

tiberiu.savin

Awesome idea, although it would be better if you could accept other languages (for example Python). Given that this is not an algorithmic task and that the main focus is on finding a good answer and not the time or memory complexity I think more high-level languages should be encouraged. (I know there is also a time and memory limit but I feel that those were put there so that people won't take advantage, I can't imagine a possible solution that would require more than 32 MB and more than 1 second running time).

tcjuan

In the answer.lst file within e01, shouldn't the tests files in the output have WINDOWS appended to the front of their relative paths, since that's how they appear in files.lst?

Yes, paths in answer should be subset of paths in files.lst.

izhari

It look like MM contest at TC :)

madkite

No... In MM contest problem is always strictly determined and always have optimal solution that can be calculated if we have infinite time. In this contest you can only guess...

Togop

Not really. There have been MMs where the problem is interactive (you are to make moves before you know all the information) and depend on randomness. In these cases no optimal solution exists. For example, the ChessPuzzle problem.

TC always describe random distribution and all ranges. So... optimal solution = solution that get best summary score for all possible input data (if you have infinite time you can bruteforce all possible next opponent's moves and taking into account their probability distribution choose the best variant).

lijunle

in e09:
problem/Source/tester.dpr
problem/Source/TESTER.PAS

why .dpr is a checker, not *.pas?

zurg	14 years ago, # ^ \| 0 bacause there can be only one checker. → Reply

zurg	14 years ago, # ^ \| 0 I suppose both of the answers are correct → Reply

If you want to use problem archive to judge some solution(s) you should decide which checker to use. You should choose exactly one. In this case many things point that tester.dpr is preffered:

tester.dpr is Delphi source, while TESTER.PAS is Borland Pascal program;
there is binary file tester.exe in the files.lst and it has name in lowercase like tester.dpr;
tester.exe is win32-executable