Link to my code: https://codeforces.net/contest/1388/submission/88538912
I think I've pretty much explored all avenues for optimization (besides switching to C++). The number of operations is around $$$2\cdot10^8$$$ in the absolute worst case, so I don't have any idea what could be slowing down my code. Is there anyone who's more knowledgeable in me at Java that can identify any further optimizations or possible bottlenecks in my code?
You can try using <<1 rather than *2 (remember to put parentheses). It won't be significant but it's something :)
To be fair, the code seems decent. Tagging some Java GMs to get attention
SecondThread uwi eatmore cwise Lewin
You forgot about Petr :)
I didn't include Petr or Egor because they no longer use Java. I guess it wouldn't hurt to tag them though.
Right, I guess the knowledge is still there
Don't use(Never) $$$Arrays.sort()$$$ instead use $$$Collections.sort()$$$ because $$$Arrays.sort()$$$ is $$$O(N^2)$$$ in worst case since it's uses quick sort on the other hand $$$Collections.sort()$$$ uses merge sort which guaranteed to be $$$O(NlogN)$$$ one of such instance
Use Integer[] array is okay with Arrays.sort(), and since java has automatic unboxing you just need that deceleration, the advice is to not use a primitives
@above: you are correct. Let me elaborate a bit more.
Arrays.sort(Object[] o) has O(n log n) time complexity.
The only sort that has worst case O(n^2) is Arrays.sort for primitive types. For example, calling Arrays.sort(new int[10]); will sort with quicksort but Arrays.sort(new int[10][10]); will sort with O(n log n) time complexity (although throwing an exception as int[] can't be cast to Comparable<int[]>
In my experience, there are two major instances in which Java just falls on its face runtime wise: n^2*log(n) geo problems and TreeSet problems with a lot of maintenance. I don’t think storing things as shorts actually helps at all unless your issues is an MLE.
Math.max and Math.min are for no explainable reason ~35% slower than writing your own version with a ternary operator.
Like skittles said *2 to leftshift works as well usually, but that shouldn’t be an issue here.
Last thing: it looks like your bottleneck is the sort which happens n*n*log times, compared to everything else which happens O(n) times. You can improve that by not putting nulls in the actual arrayList and instead just storing a count for how many of those you have.
Oh, real last thing: sometimes arrayLists are unreasonably slow too, so using arrays is usually better if you can. Sorting arrays is only trash if they are of primitives (since it doesn’t need a stable sort for primitives, so it quicksorts). That might not help too much, but it is something.
Some questions:
Math.max and Math.min are for no explainable reason ~35% slower than writing your own version with a ternary operator
The implementation for Math.max(int, int) I think we can safely assume that it is the same with other primitives:
Just curious, why would this be slow, as compared to a ternary operator?
Also, in your (short) template, you have the sort(int[]) method, which basically copies an array to an arraylist, sorts the arraylist, then copies back to an array. When comparing this method and another method of shuffling (implemented by making n random swaps) the array, this method is about double the speed of the shuffling method (sometimes triple). Why do you still use this method?
I just benchmarked Math.max() again. On Java 11 it seems to be pretty much the same as my own ternary operator. In previous versions, it has accounted for huge differences in some submission times. I think what was happening was setting some flag to do something in a thread-safe way. This happened every time you called a static method I think, so that was really the problem, not the implementation of the actual method. I updated Java recently and can't reproduce the slowdown anymore though.
The reason for my sort was that it is faster to type and I just made my template one day when I needed to sort like $$$10^5$$$ numbers on a CF problem and I just wanted something that was n*log(n). Then I just pasted it into my template. After some testing, shuffling is about 3x faster, so I'll probably use that now, despite it doubling the length of my sort method to 8 lines.
I think that Math.max may have a slowdown in some Java versions. I know a friend who submitted a USACO (which uses Java 8_121)solution which did Math.min around 1e6 times. It TLE'd on 8 of the testcases and was extrememly slow. I then removed the Math.min (as it was an attempt for an unecessary optimization) and it got AC with only around 1000 ms (in contrast to the >4000ms tle)
Also, a small trick for swapping, you can do a^=b;b^=a;a^=b; Much faster than doing the addition trick or setting a tmp variable (I haven't tested but I'd assume that xor would be pretty fast).
There is a bit of fluctuation but theyre always around the 2820-2840 range
Well those are pretty much exactly the same code. The suspected difference is in calling a static method in another class vs calling a non-static method in your own class, not the ternary operator
I think a Pair class holding two ints is faster than using int[2]. Maybe making the Intersect class static can also improve the speed.
Doable but definitely not advisable: https://codeforces.net/contest/1388/submission/88543263
WA41 is from your original code, I didn't read the statement.
For reference, I also solved this in Java in 2854 ms (88545032), though I had to optimize mine a tiny bit as well (I changed a
Pair<Double, Integer>
class to a custom class that stored adouble
andint
to eliminate some autoboxing, and that sped my code up by almost 2x).Anyway, I tried testing your code in custom invocation on a random n=2000 case I generated, and it takes 5366 ms, but also gives WA (at least, it disagrees with my code).
I tried isolating parts of your code, and it looks like when you sort
intersect
that takes 4100ms, so speeding that part up will help a lot more than anything else, if you can think of a way to do so.I actually got my Java solution down to 639 ms.
Submission: 88545616
Like you, my limiting step was sorting an array of $$$N^2$$$ objects, which is slow in Java. So instead, I used some black magic to encode a
Pair<Double, Boolean>
into a single primitivedouble
that still sorts the same way (by using 1 bit of the mantissa to represent the boolean).Sadly my logic is a more complex than a boolean and double but you've given a fairly large speed up on TC #22 by deleting some items in my object which gives me hope to push my code through eventually.
Make Intersect static class, it could reduce a lot of memory.
[deleted, a wrong idea]
The best optimization is just rewrite solution on C++.
I see a problem here: an array
intersect
with 8000000 objects. To optimize, I'd use an array of primitive type (such asint idx[]
) and a custom in-place sort implementation with comparator. Also, the performance ofFloat.compare
is suboptimal, because it has code for handlingNaN
s.