Once, during one training, my team and I noticed something strange. I'll write a little bit later what it was exactly. The point is that after the contest we decided to investigate how c++ rand() behaves on our machines (we use Linux). So, we used a Berlekamp–Massey algorithm on the results of rand() taken modulo 2.
The result is shocking — the length of the linear recurrence is only 527! No, the results aren't cyclic, just the recurrence is short.
"Cool, but who cares? Just some random fact which I wouldn't notice and it wouldn't have any impact on me." — wrong!
We discovered it while solving a task which required calculating the rank of a binary matrix. We wanted to check something, so we just generated big matrix using rand(), which turned out to be a bad decision. As only the first 527 columns were, let's say, independent, the rest were determined by first 527 ones, so the rank didn't exceed 527, and we lost much time just wondering what's happening.
Moreover, if the number of columns is even, the rank is only 496 (for matrices big enough of course). Guess what's the length of the recurrence if we look only at every second value of rand().
Yep, you're right — 496. One can notice that 527 - 496 = 31 and that 31 divides 527! Illuminati!
Here are some useful links: 1 2.
If anybody can explain precisely what's happening for even number of columns, it would be great. Also, if anybody knows how does rand() look inside and can say a few words about it, it also would be great.
Where's Blogwoosh??? xD
Ask Radwoosh.
Probably gone, just like the black R.
Have noone ever told you to never use rand()?
Wow, rand() is getting worse ...
rand() is not getting worse, Berlekamp-Massey is getting better!
Don't use rand(): a guide to random number generators in C++ by neal
This has nothing to do with the blog.EDIT: it does, my bad.
It does. The post contains the exact implementation of
rand()
that Codeforces uses, which is a linear congruential generator.The compiler Radewoosh is using has a better implementation of
rand()
than Codeforces, but it's most likely still a linear congruential generator, which meansrand() % 2
has a relatively low period.Try
mt19937
instead and see what you get.EDIT: this one might actually be a linear-feedback shift register rather than a linear congruential generator, but the point about
rand() % 2
having low-quality randomness is still true.Not just that, but 31 also divides 496! Now that cannot be a coincidence.
Here is a possible simple fix for this problem using the standard library function random_shuffle(). 1
Easier fix: use rand() % 1000007 % 2
Thanks for the constructive comment; an interesting and much simpler fix, indeed. The previous link includes a pefroamance comparison between both fixes.
The GLIBC random number generator: https://www.mathstat.dal.ca/~selinger/random/
I've learned to use a linear congruential generator, with the rand() as a seed. Seems to work pretty well.
The length of the linear recurrence of Xorshift64 is 64, I surprised. It seems to that calculating linear recurrence is weak points of some RNGs. At least it of mt19937 isn't short.
https://ideone.com/UxRMg8 (xorshift) https://ideone.com/8ueYut (mt19937) https://en.wikipedia.org/wiki/Xorshift (the code of xorshift64)
It's more likely to be a feature, but on my machine this code outputs only zeros (on Codeforces it doesn't):
And 0 is the only seed among all seeds from 0 to 100000, such that the first number generated by rand() isn't a multiple of 16807 = 75.
This caused me a lot of problems when I was stressing my solution once and wrote something like this: