Generators with testlib.h

Revision en3, by xuanquang1999, 2019-08-26 11:02:16

Generators are helper programs that output testcases. Most programming task usually require has a large input (for example, an array of up to $$$2 * 10^5$$$ elements, a tree of up to $$$10^5$$$ vertices). In these tasks, adding testcase manually is obviously not sufficient, and generators come to the rescue.

If you are writing a generator in C++, it is recommended to use the testlib.h library.

Types of generators

There are two types of generators:

  1. Single-test generator: output exactly one testcase in a single run. Usually, in order to generate several testcases, one must run the generator several times with different command line parameters. Such generators output the testcase to the standard output stream (to the screen).
  2. Multiple-test generator: output many testcases in a single run. Such generators output testcases to files (one file for each testcase).

An example single-test generator with testlib.h

The following generator output a pair of integer with each element from $$$1$$$ to $$$n$$$ — where $$$n$$$ is a command line parameter passed to the generator.

#include "testlib.h"
#include <iostream>

using namespace std;

int main(int argc, char* argv[])
{
    registerGen(argc, argv, 1);
    int n = atoi(argv[1]);
    cout << rnd.next(1, n) << " ";
    cout << rnd.next(1, n) << endl;
}

Why testlib.h?

On the surface, it seems that testlib.h is not necessary to write a generator. This is actually not true. Almost all generators need to generate random values, and it is tempted to use rand(). However, this is a bad practice. A basic principle of writing generators is that: a generator must output the same testcase when compiled by any compiler on any platform, if it is run in the same way (using the same command line parameter). When using rand() or C++11 classes like mt19937/uniform_int_distribution, your program will output different testcases when compiled with different compilers.

The random value generator in testlib.h ensures that the same value is generated regardless of the (test) generator and platform. In addition, testlib.h has various conveniences for generating tests, for example, rnd.next("[a-z]{1,10}") will return a random word of length $$$1$$$ to $$$10$$$ from letters a to z.

Translator's note: There are more issues with using rand() aside from the above one. Refer to this blog for detailed explanation about these issues.

Available method

To initialize a testlib generator, the first line of your generator must be of the form registerGen(argc, argv, 1); (where 1 is the version of the random number generator used). After that, it will be possible to use the rnd object, which will be initialized with a hash from all the command line arguments. Thus, the output of g 100 and g.exe "100" will be the same, while g 100 0 will be different.

rnd is of type random_t. That is, you can create your own generator, but usually this is not necessary.

rnd has many useful member functions. Here are some examples:

Call Return value
rnd.next(4) An equiprobable random integer from $$$0$$$ to $$$3$$$ (inclusive)
rnd.next(4, 100) An equiprobable random integer from $$$4$$$ to $$$100$$$ (inclusive)
rnd.next(10.0) An equiprobable random real number in the half-interval $$$[0; 10)$$$
rnd.next("one|two|three") An equiprobable random word out of 'one', 'two' and 'three'
rnd.next("[1-9][0-9]{99}") An equiprobable random 100-digit number as a string
rnd.wnext(4,t) wnext is a method of obtaining an uneven distribution (with a biased expectation), the parameter t denotes the number of calls to the maximum operation for similar next calls. For example rnd.wnext(3, 1) is equivalent to max(rnd.next(3), rnd.next(3)), and rnd.wnext(4, 2) is equivalent to max(rnd.next(4), max(rnd.next(4), rnd.next(4))). If t < 0, then -t will find the minimum. If t = 0, then wnext is equivalent to next.
rnd.any(container) A random element of the container container (with random access via an iterator), for example, it works for std::vector and std::string

Also, please do not use std::random_shuffle, use the shuffle from testlib.h instead. It also takes two iterators, but works using rnd.

Translator's note: If my understanding is correct, rnd.wnext is defined as follow:

$$$wnext(i, t) = \left\{\begin{matrix} next(i) & t = 0 \\ \max(next(i), wnext(i, t-1)) & t > 0 \\ \min(next(i), wnext(i, t+1)) & t < 0 \\ \end{matrix}\right.$$$

Example: generating an undirected tree

Below is the code of an undirected tree generator that takes two parameters — the number of vertices and the degree of its elongation. For example, for $$$n = 10$$$, $$$t = 1000$$$, a chain will probably be generated, and for $$$n = 10$$$, $$$t = -1000$$$, a daisy (asterisk) is likely to be generated.


Tags testlib, generator, generators, polygon

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en9 English xuanquang1999 2019-08-26 16:41:29 40
en8 English xuanquang1999 2019-08-26 12:07:41 42
en7 English xuanquang1999 2019-08-26 12:07:01 0 (published)
en6 English xuanquang1999 2019-08-26 12:05:38 423
en5 English xuanquang1999 2019-08-26 11:35:07 1302
en4 English xuanquang1999 2019-08-26 11:13:22 1545
en3 English xuanquang1999 2019-08-26 11:02:16 857
en2 English xuanquang1999 2019-08-26 10:37:56 7524 Tiny change: '2. **Multi-test gene' -> '2. **Multiple-test gene'
en1 English xuanquang1999 2019-08-26 08:09:37 320 Initial revision (saved to drafts)