Here's the problem: here.
My submission got an AC, sure, but it ran in O(N^2 log(10^6)), and it took >2000 ms to run (even with optimised MOD). But there are so many submissions, which I think are the same complexity, which run so fast!
For example, just look at the first 2 here!
The first one, I don't understand, I think it uses some magic instructions (wtf, 100'000 instead of 100000 and mod_t!!?) (by ordcoder). The second one, by mmaxio is even in Java, and I'm almost completely sure it's the same O(N^2 log(10^6)) algorithm as everyone else.
So how did these people (or anyone else with some fast submission) get their runtime down so low?
Can someone elaborate?
Edit 1: Thanks to ned_chu, I've replaced the inverse modding to be linear. My new submission is here
Thanks,
minimario