Hello everyone!
I want to invite you to participate in November Clash at HackerEarth. Contest is scheduled on November, 28. Contest duration is 24 hours, so there should be some comfortable time for every timezone :)
There will be six five tasks in a problemset. Five Four of them are standard algorithmic problems with partial solutions allowed — you get points for every test that your solution passed. And the last task is an approximate problem — your score for this task depends on how good your solution is comparing to current best solution.
witua is author of this problemset. He is experienced setter — I bet you already saw his problems at Codeforces, TopCoder, CodeChef or some other site, and now it is time for him to debute at HackerEarth.
I was working on this contest as a tester. I believe it is going to be an interesting one :) I hope that several problems will be not too hard for beginners (don't give up and show your best with partial scoring — even very naive solution may give you some points; and 24 hours should be enough for you to read all problems and find some tasks which you can solve) while some tasks are challenging enough to make this contest interesting for more experienced contestants. shef_2318 worked on this contest as translator — you will be also provided with problem statements in Russian. Also I want to thank to thank belowthebelt for providing technical help and doing his best on fixing all issues and improving HackerEarth platform.
As usual, here is one more reason for you to participate in this contest:
Top5 of leaderboard will also receive some nice prizes:
- $100 Amazon gift card + HackerEarth T-shirt
- $80 Amazon gift card + HackerEarth T-shirt
- $50 Amazon gift card + HackerEarth T-shirt
- HackerEarth T-shirt
- HackerEarth T-shirt
Good luck to everybody — I hope to see you at the scoreboard :)
Upd. Contest has ended :) Thanks to everyone for participating :) Congratulations to winners:
1) anta
2) Carsten Eilks
3) Kmcode
4) HellKitsune
5) mugurelionut
The contest has started. Good luck, everyone! :)
So what's happened to the judge?
Was down for a while; should be working fine now. We'll extend the duration by sometime to compensate for it. Apologies!
I have result "attempted" and all testcases are accepted. What does that mean?
So is the sixth problem still going to be added or this is all of them now?
There will be only 5 problems, no more problems are going to be added. I'm sorry for announcement being misleading (will fix it soon).
I really like that missing task; some time before a contest we had discovered a problem with time&memory usage measurement at HE platform, and because of that problem last task loses all its beauty. We decided not to waste a nice task — I hope it will be a part of some other contest in near future.
About reusing it later: don't know about the others, but I already saw the statement while it was up for a short while, including the tricky restrictions, and even wrote a solution for it.
Can you check the judge program of the approximate problem? I made some submissions which verified that the interval [X[1],X[N]] is ~10^7 on 19/20 cases — in this case, the worst absolute score on these test cases is < 5 — however, the judge program is even showing me scores > 100000, which are not possible, unless the judge program uses a different scoring function from the one mentioned in the problem statement.
Thanks for pointing that out.
Value A in formula is squared number of inversions, not number of inversions itself. Will be fixed soon.
The scores on the challenge problem are way too high; in particular, the scores for the trivial output are way too high; this may lead to loss of differentiation between top scores in the leaderboard (it does at this point). And you don't want the challenge problem to have multiple full scores.
Maybe altering the scoring function to difference/ratio of the final to the initial score would fix it?
UPD: Not to mention I got 2nd place score on that problem with a random if (which isn't the setter/tester's fault, more like contestants not trying hard enough).
UPD2: Ok, this solved itself.
How did you approach the challenge problem?
My solution consists of minimizing the number of inversions first (i.e. every move must decrease the number of inversions by 1) and, among multiple such moves, it picks the one which maximizes the squared sum of distances. This was the basic greedy which I later randomized (e.g. at each step, I picked a random move among the top K best ones). Towards the end of the time limit I only did this for the last moves (i.e. I kept a large prefix of the best solution found so far and only tried to improve something at the end).
What's strange is that on my local tests I had an additional strategy which was giving me between 2%-5% improvement on about 40% of the test cases (and no improvement on the others), but it didn't improve anything on any of the official test cases, which I find quite weird. The additional strategy was to focus only on increasing the squared sum of distances as much as possible (and ignoring if it would decrease/increase the number of inversions). I only used this strategy for improving the last moves (about 10%) of the best solution which tried to minimize the number of inversions first. I also tried doing a greedy which always picks the move which improves the score the most (i.e. it might be OK to increase the number of inversions if this also increases the squared sum of distances significantly). This was giving me similar improvements on my local tests (when used under the same conditions), but none on the official ones. Given that the difference between the top 8 solutions was < 0.1%, an improvement of even 1% on just 1-2 of the test cases with larger scores (out of 20) would have been enough for any of them to win.
I had exactly that greedy approach. I've tried to improve it with something like annealing but it didn't give me any points on official testcases. Also I've tried some of modifications you described but my first attempt was the best one. I'm wondering what is authors solution.
Just out of curiosity, I generated 100 random tests. I chose N randomly between 500 and 1000, and K randomly between 10000 and 100000. I chose the X coordinates randomly between 0 and 10000000 (avoiding repetitions), but I set X[1]=0 and X[N]=10000000 (in order to be similar to the official test cases). I chose the permutation P randomly and I chose each value of D randomly between 1 and 100.
I then copy-pasted the best solutions of the top 5 competitors, except that in my case I used the solution which, for the last 10% of the moves, tries to maximize the squared sum of distances, ignoring the number of inversions (as explained in my previous comment). ceilks's solutions was taking too long (there is no time control in his code, so I'm guessing that his code just worked on the official test cases in time) and I couldn't compile HellKitsune's solution in the setup I had (my compiler couldn't find <bits/stdc++.h> and I didn't spend any time trying to find out how to make it recognize this header). So, in the end, I only evaluated my solution, anta's and Kmcode's.
The results on the 100 test cases are below and my solution obtains better scores than the other two. This is mostly due to the fact that the additional strategy which focused on improving the squared sum of distances in the last 10% of the moves obtained between 1% and 5% better scores than the solution without it on 38/100 test cases. On the other test cases, as expected, my solution was generally worse than the other two, but by a very small margin (while the wins on the 38 test cases were by a large margin).
However, as mentioned in my previous comment, this strategy brought zero improvement on the 20 official test cases. How is that possible? If the official test cases had been randomly generated, then there should have been around 7-8 test cases on which I should have got 1%-5% better scores than what I did. My only explanation is that the test cases were generated according to some hidden strategy which is significantly different from a random one. And this brings me to my biggest complaint about challenge problems on Hackerearth and also on Codechef. The strategy for generating the test cases is almost never described, which is an awful decision, in my opinion. If we can't generate local tests in order to tune our solutions, then everything becomes a guessing game. In the case of this problem, when I saw that a solution which was bringing me huge score improvements locally had zero effect on the official test cases, I basically stopped trying, because it meant that I was testing my solution on the wrong kind of test cases, without having any idea how the "right" (i.e. official) test cases looked like.
Results of 100 random local tests (the scores on each line are: first-mine, second-anta's solution, third-Kmcode's solution; smaller scores are better). I will upload the files somewhere and share them later for anyone interested (though you can obtain similar results by writing your own generator and using the strategy I described at the beginning of this post)
0: 118775.593255 120449.850401 121034.333374
1: 175777.354474 179931.467292 182660.491005
2: 531.401095 525.478701 531.565640
3: 14083.361750 13871.374769 14049.528068
4: 43271.604239 42615.338797 43037.820895
5: 229440.948818 234355.370203 235493.339293
6: 8468.543977 8348.384712 8449.553956
7: 20765.191134 20355.213065 20722.876004
8: 66153.103441 65946.799585 66942.328976
9: 21465.960530 21062.681433 21394.392677
10: 180313.676816 183058.700239 184447.359038
11: 2189.342829 2156.765415 2187.870864
12: 3671.068116 3627.771627 3665.717468
13: 14520.477126 14283.212222 14477.683136
14: 134693.351150 136524.129409 138722.517411
15: 0.005492 0.004444 0.000000
16: 33487.700300 33273.992054 33623.149107
17: 21947.569780 21614.321303 21852.374478
18: 45262.909914 45364.718182 45940.395344
19: 81590.884365 81480.554588 82685.592188
20: 0.168267 0.142691 0.000000
21: 1656.637277 1634.607731 1656.759530
22: 9598.209981 9445.979189 9586.128109
23: 117017.351580 119593.477492 120509.618310
24: 334.270813 330.613132 334.561708
25: 3153.370299 3113.650690 3148.972083
26: 51979.703701 52819.868027 53385.876226
27: 29669.893828 29853.968418 30167.364562
28: 31021.211354 30547.961608 30879.654225
29: 127581.370180 130076.532780 132281.548622
30: 2282.259273 2255.880674 2281.972294
31: 10064.822119 9868.336529 10041.007310
32: 27883.976786 27300.487018 27853.576393
33: 86370.507894 84766.232678 86486.095749
34: 11209.943362 11099.222906 11160.280797
35: 18934.018932 18608.380386 18870.639383
36: 2688.476182 2654.933772 2681.699868
37: 9994.539039 9786.186934 9973.090106
38: 125864.488913 128869.973054 129305.531915
39: 0.011393 0.008895 0.000000
40: 1083.051813 1070.024909 1083.352531
41: 10553.200729 10432.236471 10535.842485
42: 141849.048828 144639.894866 145798.831536
43: 0.551054 0.520647 0.000000
44: 10096.366076 9968.217139 10051.761981
45: 580.658971 572.882198 580.896288
46: 8776.439279 8612.498453 8752.525437
47: 31523.776271 30860.513929 31484.003196
48: 158054.016150 161068.239602 162778.073123
49: 4308.224529 4245.484249 4299.767027
50: 11084.775464 10857.521785 11055.225077
51: 42758.692483 41995.694069 42495.939144
52: 113756.680615 115940.220423 117499.705965
53: 0.402649 0.395810 0.402809
54: 8990.456485 8930.970726 8946.920708
55: 22157.918747 22021.379702 22207.570395
56: 50359.670605 50280.307220 50738.449711
57: 73554.882744 74714.743325 76054.446864
58: 170.974950 168.922806 170.998083
59: 1541.745733 1520.683850 1538.981892
60: 17427.172150 17122.913003 17347.208597
61: 65590.437202 64008.103485 65281.658552
62: 300.470178 297.507711 300.579359
63: 2250.266834 2203.535288 2245.928809
64: 3217.849383 3177.377405 3212.745288
65: 23862.276415 23514.502469 23768.750398
66: 69749.553179 71042.035768 71881.341771
67: 153931.598838 159496.781745 162442.580444
68: 4320.433198 4261.377842 4297.999279
69: 13703.966877 13422.147557 13658.587314
70: 100460.916986 102698.379084 103745.156410
71: 253275.121133 258091.441184 258407.175896
72: 12116.784968 11949.970400 12081.693306
73: 4954.639811 4903.666153 4946.959437
74: 24193.570723 24047.823174 24256.605802
75: 65267.924194 66834.056255 67134.887620
76: 135382.015927 140027.936887 142201.851356
77: 300.368418 296.995973 300.436205
78: 3868.452733 3802.703764 3861.658060
79: 67184.118561 68056.156239 68503.908845
80: 156314.895628 159524.611323 160497.491723
81: 7744.743766 7659.351417 7720.956067
82: 195067.253409 201764.859326 204134.506339
83: 0.723379 0.662980 0.000000
84: 8333.797364 8249.620756 8313.640940
85: 30412.643462 29990.227520 30352.935345
86: 111243.097039 111974.700032 114075.644949
87: 195.116852 192.776825 195.299753
88: 4947.566577 4890.392076 4933.071624
89: 94584.077612 95578.847863 96398.144913
90: 136226.694122 140816.825189 143466.898866
91: 82127.785561 81450.464010 82681.843439
92: 11528.936750 11445.924613 11487.978420
93: 19486.206725 19306.875304 19429.696129
94: 67369.912108 68563.368559 69130.315950
95: 172928.020459 175725.841232 176545.667213
96: 11004.470112 10905.679033 10947.414167
97: 810.542780 798.601247 809.533873
98: 80575.467958 81027.551761 81819.932406
99: 71648.229224 70247.486462 71584.005115
Total: 4858824.932541 4918744.004140 4972999.650346
(Small explanation: there are some test cases where Kmcode obtains a perfect score of 0, while my solution and anta's solution don't — in my case, this is because I never allowed to swap P[N], because on the official test cases K was smaller than the initial number of inversion by a sufficiently high difference)
If you want to get around
stdc++.h
, just replace it by every reasonable include you'd expect a solution to need. It's just a collection of libraries. But every modern g++ should have it.Also, pastebin or
<br></br>
tags to save post space.By the way, if anyone is interested, ceilks analyzed some parameters of the official test cases and it seems that the values of D seem to be more like <=10, than <=100 (as mentioned in the problem statement), which would explain why the strategy I mentioned (trading off some inversions for a larger squared sum of distances) has no effect on the official test cases. You can read more about it in the Comments section of the problem
Very interesting.(I think I submitted many solutions for this problem, but how did you find my best solution?)
As for me, I used only hill climbing and modulated some random numbers to get the best score for ONLY the testcases of this contest ,so it doesn't work so good for another testcases I think.
And as for anta , according to his twitter https://twitter.com/anta_prg/status/670854615156543489 , he said that he used the algorithm like "Breadth-first beam search".
Hm.. I think I missed taking your best solution. I saw that your final score was 99.977, so I went through your last few submissions until I found one which rounded the same as this (since they only show the score for each submission with 2 decimal places). But I think I picked one whose rounded score was 99.97, instead of 99.98 :( Sorry for that. The individual submissions also show the total absolute score (if you hover the mouse over the right place), but using that would have been too time consuming.
So, obviously, the experiments I did were somewhat incomplete. It's possible that some other submission of each contestant (not the best scoring one on the official test cases) might score better on my local test cases. My point was just to point out that the official test cases missed some important cases, which show up easily when generating random test cases. As explained in another comment, this seems to be because the D values used in the official test cases were much smaller than the ones mentioned in the problem statement (more like D<=10 instead of D<=100). And this issue of having test cases generated by some secret strategy shows up again and again for challenge/approximation problems on platforms like Hackerearth or Codechef (only the Topcoder Marathon matches are professional enough from this perspective).
Guys, four standard problems and strange challenge with optimising <1% of score. That's not how I like to spend my Saturday evening. I think there was lack of hard algorithmic problem with interesting subtasks. Also I think it's a good idea to write about subtasks scores in statements.
What what intended solution for the challenge problem?
how to solve Cats Substrings using suffix automaton? (if possible)
Cats substrings is 452E - Три строки with slight variations.