MikeMirzayanov's blog

By MikeMirzayanov, history, 9 years ago, translation, In English

What are generators?

Generators are helper programs that output test. Most programming task usually has a large input (for example, an array of up to 2 * 105 elements, a tree of up to 105 vertices), so it's not possible to add all the tests manually. In these cases, generators come to the rescue.

If you are writing a generator in C++, it is recommended to use the testlib.h library.

Types of generators

There are two types of generators:

  1. Single-test generator: output exactly one test in a single run. Usually, to generate several tests, one must run the generator several times with different command line parameters. Such generators output the test to the standard output stream (to the screen).
  2. Multiple-test generator: output many tests in a single run. Such generators output tests to files (one file for each test).

An example single-test generator with testlib.h

The following generator output a pair of integers with each element from 1 to n — where n is a command line parameter passed to the generator.

#include "testlib.h"
#include <iostream>

using namespace std;

int main(int argc, char* argv[])
{
    registerGen(argc, argv, 1);
    int n = atoi(argv[1]);
    cout << rnd.next(1, n) << " ";
    cout << rnd.next(1, n) << endl;
}

Why testlib.h?

On the surface, it seems that testlib.h is not necessary to write a generator. This is not true. Almost all generators need to generate random values, and it is tempted to use rand(). However, this is a bad practice. A basic principle of writing generators is that: a generator must output the same test when compiled by any compiler on any platform if it is run in the same way (using the same command line parameters). When using rand() or C++11 classes like mt19937/uniform_int_distribution, your program will output different tests when compiled with different compilers.

The random value generator in testlib.h ensures that the same value is generated regardless of the (test) generator and platform. Besides, testlib.h has various conveniences for generating tests, for example, rnd.next("[a-z]{1,10}") will return a random word of length 1 to 10 from letters a to z.

Translator's note: There are more issues with using rand() aside from the above one. Refer to this blog for a detailed explanation about these issues.

Available method

To initialize a testlib generator, the first line of your generator must be of the form registerGen(argc, argv, 1); (where 1 is the version of the random number generator used). After that, it will be possible to use the rnd object, which will be initialized with a hash from all the command line arguments. Thus, the output of g 100 and g.exe "100" will be the same, while g 100 0 will be different.

rnd is of type random_t. That is, you can create your own generator, but this is not necessary in most of the cases.

rnd has many useful member functions. Here are some examples:

Call Return value
rnd.next(4) An equiprobable random integer from 0 to 3 (inclusive)
rnd.next(4, 100) An equiprobable random integer from 4 to 100 (inclusive)
rnd.next(10.0) An equiprobable random real number in the half-interval [0;10)
rnd.next("one|two|three") An equiprobable random word out of 'one', 'two' and 'three'
rnd.next("[1-9][0-9]{99}") An equiprobable random 100-digit number as a string
rnd.wnext(4,t) wnext is a method of obtaining an uneven distribution (with a biased expectation), the parameter t denotes the number of calls to the maximum operation for similar next calls. For example rnd.wnext(3, 1) is equivalent to max(rnd.next(3), rnd.next(3)), and rnd.wnext(4, 2) is equivalent to max(rnd.next(4), max(rnd.next(4), rnd.next(4))). If t < 0, then -t will find the minimum. If t = 0, then wnext is equivalent to next.
rnd.any(container) A random element of the container container (with random access via an iterator), for example, it works for std::vector and std::string

Also, please do not use std::random_shuffle, use the shuffle from testlib.h instead. It also takes two iterators, but works using rnd.

Translator's note: If my understanding is correct, rnd.wnext is defined as follows:

Example: generating an undirected tree

Below is the code of an undirected tree generator that takes two parameters — the number of vertices and the 'elongation' of the tree. For example:

  • For n = 10, t = 1000, a path graph (degree of all vertices are at most 2) is likely to be generated
  • For n = 10, t =  - 1000, a star graph (there's only one non-leaf vertex in the tree) is likely to be generated.
registerGen(argc, argv, 1);

int n = atoi(argv[1]);
int t = atoi(argv[2]);

vector<int> p(n);

/* setup parents for vertices 1..n-1 */
for(int i = 0; i < n; ++i)
    if (i > 0)
        p[i] = rnd.wnext(i, t);

printf("%d\n", n);

/* shuffle vertices 1..n-1 */
vector<int> perm(n);
for(int i = 0; i < n; ++i)
    perm[i] = i;
shuffle(perm.begin() + 1, perm.end());

/* put edges considering shuffled vertices */
vector<pair<int,int> > edges;
for (int i = 1; i < n; i++)
    if (rnd.next(2))
        edges.push_back(make_pair(perm[i], perm[p[i]]));
    else
        edges.push_back(make_pair(perm[p[i]], perm[i]));

/* shuffle edges */
shuffle(edges.begin(), edges.end());

for (int i = 0; i + 1 < n; i++)
    printf("%d %d\n", edges[i].first + 1, edges[i].second + 1);

How to write a multiple-test generator?

A multiple-test generator in one execution can output more than one test. Tests by such a generator are output to files. In the generator on testlib.h it is enough to write startTest(test_index) before the test output. This will re-open (freopen) the standard output stream to a file named test_index.

Please note that if you are working with the Polygon system, in this case, you need to write something like multigen a b c > {4-10} in the script (if it is assumed that starting the multiple-test generator will return tests 4, 5, 6, 7, 8, 9, and 10).

Other notes about generators

  • Strictly follow the format of the test — spaces and line breaks should be placed correctly. The test should end with a line feed. For example, if the test consists of a single number, then output it as cout << rnd.next (1, n) << endl; — with a line feed at the end.
  • If the test size is large, it is prefered to use printf instead of cout — this will improve the performance of the generator.
  • It is better to use cout to output long long, but if you want printf, then use the I64 constant (for example, printf(I64, x);).
  • Please be aware about various cases of C++ undefined behavior. For example, in the first example generator above, if the two cout commands are combined into one, the order of the rnd.next function calls is not defined.

Translator's note: about the third point, using lld constant with printf to output long long used to be problematic in the past, but is no longer an issue now.

Further examples

Further examples of generators can be found in the release notes or directly at the GitHub repository.

Full text and comments »

  • Vote: I like it
  • +79
  • Vote: I do not like it

By MikeMirzayanov, history, 9 years ago, translation, In English

Section about testlib is temporary, at some day it will be merged into global documentation section when it appears.

If you are developing a programming contest problem and you are doing it with using C++ then testlib.h is a right choice to write all auxiliary programs. This library is a standard solution in a professional community of problemsetters around the world. Many contests are prepared using testlib.h: All-Russian and International olympiads in Informatics, ICPC regional contests, all Codeforces round and many other competitions.

You can find testlib.h on GitHub.

testlib.h library is contained in a single header file. In order to use it you should just put testlib.h in the same directory with a program you are writing (checker, generator, validator, or interactor) and add the following line to the beginning of your program: #include "testlib.h".

Here are the cases when testlib.h is really useful:

  • In writing generators. These are the programs that create tests for your problem, since it is not always possible to type a whole content of the test by using the keyboard (at least because of their possible large size);
  • In writing validators. These are programs that read the whole test and verifies that it is correct and that it satisfies the constraints of the problem. Validators should be maximally strict with respect to spaces, endlines, leading zeroes etc;
  • In writing interactors. These are programs that are used in interactive problems, if your problem isn't interactive then just nevermind;
  • In writing checkers. If your problem allows several possible answers for the tests then you should write a special program that checks participant's answer against jury's answer and the input data.

testlib.h is fully compatible with Polygon problem preparation system.

First versions of testlib.h appeared in 2005 as a result of testlib.pas porting on C++. Since then testlib.h has evolved, its features and performance were improved. Last versions of testlib.h are compatible with different versions of Visual Studio compilers and GCC g++ (in editions for many platforms), also they are compatible with C++20.

Full text and comments »

  • Vote: I like it
  • +80
  • Vote: I do not like it

By MikeMirzayanov, 9 years ago, translation, In English

Hello, Codeforces!

As you know the Linux package managers make life easier for users and administrators. In the Windows world, this is much worse, although there are some tools (in Windows 10 it will be much better): nuget, chocolatey, wpkg and others.

Maintaining Codeforces testing servers, computers of Programming Competitions Training Center of Saratov SU, installing workstations for different olympiads I finally got tired to write a variety of bat-files and decided to regularize the process.

Chocolatey makes good help, but in the details it turned out that it can't used in some cases: you can not specify the installation directory, there is no support for private repositories, lack of some packages, Chocolatey packages do not contain installers but only links for them (if package's official is down, one can't install the package).

That's why, in December 2014, I spent out several evenings to work on a package manager most convenient for our purposes (called PBOX, reads like a pi-box). I'm considering using PBOX to install specific software for Codeforces (concrete compilers and tools), but for programs of general purpose it is good idea to use Chocolatey.

In the coming months all the Codeforces testing servers (and many other computers of the CS department of Saratov State University), I plan to reinstall, and use in particular PBOX.

I have little use it for personal purposes, it seems to me, PBOX may be useful to someone from Codeforces users. The site http://pbox.me has usage examples. Below are a few explanations.

Installation

Visit http://pbox.me and in the administrative Windows console (search for cmd.exe, get the context menu by right mouse button, select Run as administrator) run the code from the website home page. PBOX written in Java, if you're not have it, then it will download JRE automatically. By the way, every time you start PBOX, it will try to self-update, so forget about updating it manually.

I usually turn off UAC, if you do not want you will always need to use administrative console. You can disable UAC by typing pbox -uac.

Usage

Do you want to install exactly the same g++ as Codeforces has? Just run pbox install mingw-tdm-gcc. It will install it to %HOMEDRIVE%\Programs\mingw-tdm-gcc (by default), add to PATH some directories (including MSYS), add environment variable MINGW_HOME.

Generally, to see exactly what will happen on installation simply visit the site to find the package and click Show pbox.xml.

For now PBOX has only 73 packages. Visit http://pbox.me/packages to explore them.

I like the collection of useful utilities called tools, so just run pbox install tools to install sysinternals tools, windows resource kit, support tools, and others like curl, wget, imdisk (add all of them to PATH). BTW, useful utility runexe.exe will also be installed: it is good to run processes and see time/memory usage.

Most compilers and tools will be installed to C:\Programs (actually to %HOMEDRIVE%\Programs). Quite convenient to have a path to them short and with no spaces unlike "Program Files".

You can use additional options like pbox install far --homedir=C:\Far --arch=32 --version=3.0.4040. To uninstall a package you can run: pbox uninstall far. Here are more examples of usage:

Description Command line
Ask PBOX to forget that it installed a package (but do not uninstall it) pbox forget <package>
Print package information (you can specify the version) pbox info <package> или pbox info <package> --version=version
Find package by title/description/tag pbox find <query> or pbox search <query>
Find package (find in all versions, not the latest only) pbox find <query> --all или pbox search <query> --all
Print all the packages in repository (latest versions or all) pbox list или pbox list --all
Print the list of all the installed packages pbox list-installed

Download a package and explore it

It is really easy. Just try: http://repo.pbox.me/1.0/jdk8/1.8.0_45/jdk8$1.8.0_45.pbox.7z

Source code

Visit: https://github.com/MikeMirzayanov/pbox

Full text and comments »

  • Vote: I like it
  • +192
  • Vote: I do not like it

By MikeMirzayanov, 10 years ago, translation, In English

Hello!

Only a few hours before the start of ACM-ICPC World Finals 2015!

The ACM ICPC is considered as the "Olympics of Programming Competitions". It is quite simply, the oldest, largest, and most prestigious programming contest in the world.

The ACM-ICPC (Association for Computing Machinery — International Collegiate Programming Contest) is a multi-tier, team-based, programming competition. Headquartered at Baylor University, Texas, it operates according to the rules and regulations formulated by the ACM. The contest participants come from over 2,500 universities that are spread across 100+ countries and six continents.

This year the best 128 teams in the world will meet face to face in Marrakech on World Finals. Video coverage will start on 09:00 (UTC), and the contest will start on 10:00 (UTC).

Good luck to all the teams!

UPD Added link to the text coverage on tumblr.

Full text and comments »

  • Vote: I like it
  • +332
  • Vote: I do not like it

By MikeMirzayanov, 10 years ago, translation, In English

It is really great to spend holyday usefully. During the Victory Day I not only visited the Victory Park with family, congratulated people closed to me, watch the Victory Day Salute, but also I've started to get rid from codeforces.ru

Yes, it is not a mistake. We are moving forward to use the only domain codeforces.com. It will make many thing better: better navigation, better page statistics, better pagerank and other metrics.

Of course, all links to codeforces.ru is redirecting to the appropriate page on codeforces.com now. In addition, for now it applies only to GET-requests.

Observant visitors have noticed that recently changed how we deal with images. Now, if you insert a link to image into the text of a post/comment, then when you save it predownloaded and stored on Codeforces, and link is replaced by to use our domain. This solves several problems: missing or spoofed pictures in old posts/comments, the restriction on the number of views in original image server, we resize too large images to smaller size, now we be sure to use https for images, which means we are closer to use https.

Full text and comments »

  • Vote: I like it
  • +568
  • Vote: I do not like it

By MikeMirzayanov, 10 years ago, translation, In English

April 18, 18:00 (UTC) the second Wild-card round of VK Cup 2015 will take place.

Participants are invited to achieve progress in solving an unusual problem. VK Cup teams which were advanced to the Round 2 (and didn't advance to the Round 3) will take part in VK Cup 2015 - Wild Card Round 2 officially. In addition, this round will be open to the public for unofficial participation for everybody. Registration will be open for the whole round duration.

The round will be one week long. After the end latest submission (with positive score) of each participant will be judged on system tests.

Good luck!

UPD.: System testing is done. Final tests are available by link: http://assets.codeforces.com/files/vkcup-2015-wildcard-2-tests.2.zip

You can appeal on tests and their answers until April, 28th, 23:59:59. After that we will finalize the results. After finalizing all found bugs will not affect the final standings.

Full text and comments »

  • Vote: I like it
  • +164
  • Vote: I do not like it

By MikeMirzayanov, 10 years ago, translation, In English

Hello Codeforces.

This is a really good and pleasant day for us. We are happy to announce that the fundraising campaign on Codeforces is successfully completed and its results are even beyond our expectations. Thank you for your support, your trust and your kind words.

By the way, we have some more unprocessed PayPal transfers: for unknown reasons the notifications in some cases didn't contain some data needed to process them automatically. We will process them manually and update the data as soon as possible.

To all of you waiting for our present in the mailbox, please update your Codeforces profile information or make sure that it already is complete and up-to-date.

Thanks!
Mike Mirzayanov

Full text and comments »

  • Vote: I like it
  • +118
  • Vote: I do not like it

By MikeMirzayanov, 10 years ago, translation, In English

Hi!

It is only a day before the end of the fundraising campaign dedicated to the 5th anniversary of Codeforces. We are glad and grateful for your help and support. We are working hard to justify your and our own expectations.

For those who are not accustomed to rush, remember that you still have a little time to join the remarkable list of our friends — help us and get a gift from the Codeforces team!

When summing up, of course, we mostly want not to count money, but to assess progress and the work done. I looked at all of our commits to Codeforces and Polygon, and made a digest of changes.

I did not include to the digest improvements in the depths of system's backend (although stability progress should be visible), infrastructure jobs, org. work around the championships — but, believe me, there were many items too.

And here is the list of our achievements for about two months after anniversary.

Full text and comments »

  • Vote: I like it
  • +367
  • Vote: I do not like it

By MikeMirzayanov, 10 years ago, translation, In English

Hello!

VK Cup 2015 — Round 1 (unofficial online mirror, Div. 1 only) — is public unofficial replay of Round 1. It means that if you didn't take part in VK Cup (for example, you are not Russian-speaking), decided to cancel further participation in VK Cup or didn't pass a Qualification you can take part in VK Cup 2015 — Round 1 (unofficial online mirror, Div. 1 only). You should be in Div. 1 to take part in it.

Mostly it will be typical round with regular Codeforces rules. Persons (not teams) will take part in it. The problems will be in English and in Russian. It will be hacks and rating will be updated after it. It will be used recently implemented smoother dynamic problem scores with 250 points steps. You can read about it here.

There were many not Russian-speaking teams in VK Cup Qualifications. I encourage these teams to respect the position of the organizers and not to take part in official Round 1. Please, take part in VK Cup 2015 — Round 1 (unofficial online mirror, Div. 1 only).

Wish you good luck!

Full text and comments »

  • Vote: I like it
  • +139
  • Vote: I do not like it

By MikeMirzayanov, 10 years ago, translation, In English

Doing VK Cup 2015 we faced with an interesting problem: how to calculate rating changes for team members?

In short, currently the ratings are calculated using the following rules: * each contestant has some ratings ri before the round, * our goal is to follow Elo Rating System idea: participant a wins participant b with probability:

  • as we know chances to win/loose for any pair of participants, we can calculate the expected place (seed) as a sum of winning probabilities,
  • if participant took place better than expected then we need to increase rating, if worse then we need to decrease rating.

Above items are just some general rules, but we have some anti-inflation corrections and heuristics. BTW, we are moving forward to rethink rating formulas and open them. But the question now is not about that, but about how to calculate the rating of a team.

It is natural to generalize current ideas, but we need a method to calculate the rating of a team knowing ratings of members. If there is such function, say teamRatings(r1, r2) (for 2-member teams), it will be naturally to use it to calculate expected place of a team. And depending on actual place, member's ratings should be adjusted.

That's the idea came to mind of the Codeforces team during lunch.

For sure, the function teamRatings(r1, r2) should satisfy some constraints:

  • teamRatings(r1, r2) ≥ max(r1, r2),
  • if we compose a team of tourist and somebody not very skilled (say, green participant), rating of team should be close (a little more) to tourist's rating.

I was offered the following funny model. Image there is team AB composed to two members A and B. Let's try somehow to compare it with individual participant C. In my model instead of single contest "A+B vs C" we will make two contests "A vs C" and "B vs C". If at least in one contest of two won member of AB, then the team won. If both contests won C, them C won.

This model doesn't consider any team-work, but it fairly tries to consider chances of both participants A and B to overcome C.

Now we know rating of C and winning probability of AB over C, we can inverse Elo formula to find rating of AB.

There is a trick that calculation of team rating depends on opponent C. How to choose the most relevant opponent C? It is easy to show that changing rating of C the calculated team rating changes monotonically. I like an idea to choose such rating of C that calculated rating happens to be equal to C. In other words let's use such opponent that equally skilled compared to the team AB. Binary search helps to find such opponent's rating (closed formula also exists).

So we following code aggregates ratings of several individual participants to the single value:

long double getWinProbability(long double ra, long double rb) {
    return 1.0 / (1.0 + pow((long double) 10.0, (rb - ra) / 400.0));
}

long double aggregateRatings(vector<long double> teamRatings)
{
    long double left = 1;
    long double right = 1E4;

    for (int tt = 0; tt < 100; tt++) {
        long double r = (left + right) / 2.0;

        long double rWinsProbability = 1.0;
        forn(i, teamRatings.size())
            rWinsProbability *= getWinProbability(r, teamRatings[i]);

        long double rating = log10(1 / (rWinsProbability) - 1) * 400 + r;

        if (rating > r)
            left = r;
        else
            right = r;
    }

    return (left + right) / 2.0;
}

Once again, I understand that this model is not absolutely correct. But I don't think that absolutely correct model exists. I think that this model is reasonable and has a right to exist. What do you think?

Full text and comments »

  • Vote: I like it
  • +143
  • Vote: I do not like it

By MikeMirzayanov, 10 years ago, translation, In English

Hello Codeforces!

I am happy to announce that the indicated $10000 support fund was raised in a little more than three days! Thanks you very much friends!

Now it became possible to pay the writers equivalent funds even they are in rubles. It was impossible because of great increase of the dollar exchange rate.

Round type USD Rubles
Div 1 + Div 2 $250+*$50 18000 rub
Div 2 $100+*$50 9000 rub

We tie the money paid in rubles to the Central Bank of Russia exchange rate, rounded to the closest sum in rubles divisible by 5 (according to the rules of mathematical rounding). The table shows the values that were up-to-date when this post was published. An asterisk marks the bonus that is given if the round is prepared perfectly.

And that is not all! As I said before, we now have the opportunity to adequately support the coordinator of the Codeforces problems Max Zlobober Akhmdov. He started working on his position in the fall of 2014 and more than two dozens of rounds have been conducted under his guidance. Preparing rounds takes a lot of work and he copes brilliantly. I hope to work with him efficiently for many more years to come.

I asked Max to write several words about himself for you to get to know him better and that's the result:

Hi! Some personal info: I graduated from the Advanced Educational Scientific Center of Moscow State University named after A.N. Kolmogorov. When I was a school student, I went to the IOI 2012, earned the gold medal and the absolute seventh place in the Russian National Team. My biggest achievements include the 5-th place at the Russian Code Cup 2014, the 10-th place in 2013 and the 3-rd place in 2014 on NEERC as a member of the Moscow SU Trinity team. I won the all-Syberian Pottosin’s Olympiad in 2013 and the ICL 2014 cup as the member of the same team.

I like music. I listen to all sorts of heavy and not so heavy rock, I go to concerts. I play several musical instruments. I am a jury member in many school programming Olympiads. The last book I've read is ‘Chapayev and Void’ by Pelevin. It's hard to choose a favorite movie — I like movies by Quentin Tarantino, Robert Rodríguez and Christopher Nolan.

That's about it.
Max.

The fund-raising campaign will continue for quite a while and I sincerely believe that there are other participants who will support us these days. We continue raising funds (that is a standard crowdfunding procedure) and we will be happy to to fulfill all obligations of award the supporters. For example, those who supported us and chose the 'badge in profile' reward option have recently got badges of honor in their profiles, the symbol of your taking part in and being part of the Codeforces development. Certificates and T-shirts will be sent after the fund raising stops. Please, be a little more patient.

All the raised extra funds will be used to develop and support Codeforces, namely (provided that we've got enough of the collected budget):

  • increase the payments for Div1+Div2 rounds — may there be more of them!
  • introduce a content manager into the team who will regularly add and systemize the trainings;
  • we will soon definitely need new hardware and we will be able to get buy it;
  • we will strengthen the team of developers to make you happier with more cool features!

Thanks for your support,
Mike Mirzayanov

Full text and comments »

  • Vote: I like it
  • +358
  • Vote: I do not like it

By MikeMirzayanov, 10 years ago, translation, In English

Hello, friends!

Yes, this post is for friends of Codeforces. Recently Codeforces had 5 years birthday. And on this occasion we opened a campaign to raise funds for the operation and development of Codeforces.

We are eager to develop and move forward and look forward to your participation in the campaign.

We invite you to consider the possibility to support Codeforces. Details of the campaign can be found at its direct link.

Mike Mirzayanov

UPD: We have great news!

Full text and comments »

  • Vote: I like it
  • +708
  • Vote: I do not like it

By MikeMirzayanov, 10 years ago, translation, In English

Hello, Codeforces!

In late December and early January, I wrote a proof-of-concept of separate service in C ++, to take out heavy data from the Codeforces Java-code to C++. Long time I have not written in C ++, experienced a funny sense of immersion into another world.

I was surprised to find the lack of open-addressing hashmap in the C++ standard library, and indeed in the boost, and in other popular libraries. It is strange somehow, because often open addressing should be better than separate chaining both in time and memory. And since I intend to keep small objects, then sure.

I quickly sketched a prototype that actually shows that open addressing in 2-3 times faster than the standard std::unordered_map. Here's the output of the benchmark on my laptop:

std :: map takes 15779 ms
std :: unordered_map takes 4698 ms
oaht :: hash_map takes 1473 ms

I think that there is no good implementation of such container in stl-style with the support of C++11 (move semantics). Maybe someone will show, but I have not found.

Here is my prototype on github: https://github.com/MikeMirzayanov/open_addressing_hash_table Unfortunately, I'm not cool C ++ expert, and I have no time to implement.

On the other hand, on Codeforces where were many C ++- discussions, and seems there many users who understand modern C ++. Also algorithms are water and air of Codeforces!

Full text and comments »

  • Vote: I like it
  • +140
  • Vote: I do not like it