mogers's blog

By mogers, history, 9 years ago, In English

The round 1 of Facebook Hacker Cup 2016 starts in a little over 4 hours at 10am PST.

Everyone who solved at least one problem in the qualification round (CF post) is eligible to compete. Click here to enter Round 1.

Note that advancement criteria to round 2 was changed. The criteria now is to achieve at least 30 points in Round 1, instead of same points as 500th. These changes were published in the competition FAQ.

  • Vote: I like it
  • +92
  • Vote: I do not like it

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Moving it into recent actions. Contest will start in 3 minutes.

»
9 years ago, # |
Rev. 2   Vote: I like it +92 Vote: I do not like it

I already hate laundry and counting number of zeroes in constraints :)

»
9 years ago, # |
  Vote: I like it +14 Vote: I do not like it
»
9 years ago, # |
  Vote: I like it -37 Vote: I do not like it

Could someone please explain the first sample test of Yatchzee? The only step costs 2 dollars, meaning the amount of money left will never exceed 1 dollar, how can the expected value be 1.666666667?

  • »
    »
    9 years ago, # ^ |
    Rev. 3   Vote: I like it 0 Vote: I do not like it

    The only step costs 2 dollars, meaning the amount of money left will never exceed 1 dollar,

    Why?

    EDIT: You people have no respect for the Socratic Method.

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it -24 Vote: I do not like it

      The statement said that he will stop when he can't afford the cost of the next step, so if he's left with 2 dollars or more, he will continue paying.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Oh I see it now. The initial money is a real value.

»
9 years ago, # |
  Vote: I like it -147 Vote: I do not like it

For problem Laundro,Matt which data structure should I use? Can anyone give any suggestions?

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +106 Vote: I do not like it

    NO

  • »
    »
    9 years ago, # ^ |
    Rev. 2   Vote: I like it +25 Vote: I do not like it

    That's against the rules. We'd be disqualified for discussing the problems before the contest ends.

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it -74 Vote: I do not like it

      Sorry!!I did not know that.

      • »
        »
        »
        »
        9 years ago, # ^ |
          Vote: I like it +47 Vote: I do not like it

        Do you usually discuss questions during the exams with your friends in your school (or when you were at school) ?

»
9 years ago, # |
  Vote: I like it +27 Vote: I do not like it

why they have this one time 6 minutes rule ? it's so annoying for countries with crappy internet. I couldn't download input file fast enough and it's expired now :| I sent an email to them, and they responded :

"Unfortunately, we cannot extend the time limit — it's your responsibility to be able to download the input in time (we make sure it's not too large)."

it's just so disappointing!

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +131 Vote: I do not like it

    An idea for organizers: encrypt the input (ZIP should be ok I think), make tests downloadable at any moment and show password on request (which should also start countdown).

    Same thing works for uploading: make user upload hash of their output (probably, with some salt so sharing hashes does not help) under 6 minutes and then allow more time to upload actual file.

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it +15 Vote: I do not like it

      Hashing output have a few drawbacks, since hashing is very sensitive to small changes:

      • Lower/Upper case: case #1:

      • Forgot #: Case 1:

      • Precision: Case #1: 1.2345678910

      • Trailing spaces: Case #1: 1 2 3 4 (trailing SPACE)

      Getting an encrypted input beforehand would be nice though.

      • »
        »
        »
        »
        9 years ago, # ^ |
          Vote: I like it +42 Vote: I do not like it

        In case they use smart checker which can handle all these things — you can ask about sending hash of your output during 6 minutes timespan, and then uploading your output (which should have same hash) in next X hours.

        On the other side — output files are small, so it doesn't make much sense, unlike giving encoded input beforehand.

        • »
          »
          »
          »
          »
          9 years ago, # ^ |
            Vote: I like it +32 Vote: I do not like it

          Yeah, I like the idea of uploading a hash, and then allowing arbitrarily long upload times.

          • »
            »
            »
            »
            »
            »
            9 years ago, # ^ |
              Vote: I like it +18 Vote: I do not like it

            (though usually input is by far the larger file)

            • »
              »
              »
              »
              »
              »
              »
              9 years ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              I really hope that yeputons's idea is applied in the next round.

              • »
                »
                »
                »
                »
                »
                »
                »
                9 years ago, # ^ |
                  Vote: I like it +36 Vote: I do not like it

                Not next round, but potentially next year.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  9 years ago, # ^ |
                    Vote: I like it +64 Vote: I do not like it

                  That's a good idea, but overcomplicated for those who have fast internet access (unzip inputs and getting hashes for all tasks can be annoying). Maybe you can keep both the current format and yeputons's idea?

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  9 years ago, # ^ |
                    Vote: I like it +15 Vote: I do not like it

                  Yeah, this is a minority use case. We'd want both formats.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  9 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  One solution is like this:

                  For getting input, we can write test case generator in javascript — when you click 'Download data', the server will send the js code, your browser will run it then get an input file. Nothing will look different from current workflow, only backend changes.

                  For uploading output, we can keep the current UI, when you click 'Upload Output', the browser first send hash to server, then try to upload the actual file, if failed — then it will show 'Please upload output file with hash xxxxxxxx'. So if your internet is good, then nothing will be changed.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  9 years ago, # ^ |
                    Vote: I like it +12 Vote: I do not like it

                  I wish you all the best with generating 10MB file with JavaScript in browser. Especially if generator is not that simple.

                  Also, that would expose generating strategies to participants, which is not the best option. E.g. there are some problems where you have to find, say, Hamiltonian cycle in a graph. Exposing generator is a really bad idea here.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  9 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  You are right, there may be performance issue for some cases.

                  For your second concern, we can uglify that code, then it will be very hard to get some useful info under 6 minutes.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  9 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  I guess by uglify you mean to obfuscate/minify? the code, in this case you can go to one of the many online tools to prettify it, there are many algorithms easily identifiable seeing code structure, I think not many people can do something useful with 6 minutes but I'm sure is not impossible.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    why they have this one time 6 minutes rule ? it's so annoying for countries with crappy internet

    it's fun because AFAIK there is special initiative for facebook employees "emulate crappy internet on your workplace", exactly for such cases. I suspect hackercup developers opted to avoid it

»
9 years ago, # |
  Vote: I like it +25 Vote: I do not like it

I'm wondering why the constraints are written as T  ≤  maxt while actually it's always T = maxt :P

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +73 Vote: I do not like it

    In sample inputs T < maxt.

»
9 years ago, # |
  Vote: I like it -9 Vote: I do not like it

Может кто-нибудь из "прошаренных" выложить свои входные/исходные файлы?

Lets share input/output files!

»
9 years ago, # |
  Vote: I like it +4 Vote: I do not like it

I was lucky that my C passed the tests. I had a potential overflow bug in a very specific case. If Mike adds the problems to the gym, you might want to add the following test case:

2 0 1000000000
1 500000

Output: 249999.002001000

»
9 years ago, # |
  Vote: I like it +11 Vote: I do not like it

The test cases are very bad

For C Yachtzee most Ci are less than 1000 and does not contain boundary cases at all. For example

2 999999999 1000000000
999999999 1

Expected answer is 0.5 while many people, if using double instead of long double and simply using (f(b) - f(a)) / (b - a), will output 0

and for D i don't know why they have to repeat N = 2 so many times and there is not even a single 9 9 for N = 16

  • »
    »
    9 years ago, # ^ |
    Rev. 2   Vote: I like it +31 Vote: I do not like it

    I completely agree. The test cases are extremely weak and certainly not fair to everyone. In problem B, a lot of people complained that the input contained N > 10000 (before they changed the constraints). The input file I received had N < 1000 for all cases. I know that the inputs are generated randomly and are different for every contestant, but what kind of randomly generated input contains N < 1000 for some and N < 10^5 for some others, considering all 50 cases? And how exactly do you call this fair?

    Exact same problem repeats in C. It was mentioned that Ci <= 10^9, but in my input file, all Ci < 1000. Even apart from this, the test cases for C are very weak. Already a lot of contestants are mentioning the cases for which their solution fails in the comments. Not to mention those tricky cases where not using long double will lead to precision error for some contestants.

    Please take care of these issues in the next round. I can accept somewhat weak test cases, but the input file of two contestants shouldn't differ so significantly in their constraints to ensure a fair competition.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    Can you please explain why we need long double? I mean can't double handle 10^18? and only time here precision comes is when we multiply it segment values with 0.5, and finally divide by b-a. In the whole process,there is no extra division, so I'm confused why double won't work. I used only double, and it passes the cases you guys mentioned here :S

    • »
      »
      »
      9 years ago, # ^ |
      Rev. 2   Vote: I like it +11 Vote: I do not like it

      double contains only 52+1 significant bits.

      • Between 9007199254740992 — 18014398509481984 only multiples of 2 can be accurately represented
      • 18014398509481984 — 36028797018963968: multiples of 4
      • 36028797018963968 — 72057594037927936: multiples of 8
      • 72057594037927936 — 144115188075855872: multiples of 16
      • 144115188075855872 — 288230376151711744: multiples of 32
      • 288230376151711744 — 576460752303423488: multiples of 64
      • 576460752303423488 — 1152921504606846976: multiples of 128
    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it +5 Vote: I do not like it

      I don't mean it will fail everyone. I just mean it will fail some people who are careless enough. And it's the organizer's responsibility to catch those people.

      • »
        »
        »
        »
        9 years ago, # ^ |
        Rev. 2   Vote: I like it +5 Vote: I do not like it

        Thanks a lot for the clean explanation :) So when numbers are floating point,instead of integers, the range double can accurately represent decreases,right? because if a 0.5 is added to (2^53)-1, it is no longer accurately represented, 0.5 takes some bits.

        Does long double contain 63 significant bits?

        • »
          »
          »
          »
          »
          9 years ago, # ^ |
            Vote: I like it +10 Vote: I do not like it

          Yes.

          long double contains 64 significant bits and therefore can store any long long value accurately

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

How to solve Laundro, Matt?

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +6 Vote: I do not like it

    We want to finish washing the laundry as soon as possible. So we put the possible finishing times for the washing machines in a priority queue.

    Then, we process the L items and assign each one of them to the earliest possible finishing time. This is just the minimum in the priority queue. Suppose this minimum time is X. Then, this washing machine is able to finish washing another item at time X + WashTimeOfThisMachine.

    We can use the same reasoning for the driers, but it's not necessary. We will put an item to dry asap as well. This is X + D is there is a drier available, or D past the next available drier time. If we are processing item i, the next available drier is available at time WashTime[i-M] + D.

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      L=6,N=3 W=[4,4,7]

      Priority queue passes this test?

      • »
        »
        »
        »
        9 years ago, # ^ |
          Vote: I like it +5 Vote: I do not like it

        Yes, why wouldn't it?

        • »
          »
          »
          »
          »
          9 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          A simple priority queue will give this ordering — w1 w2 w3 w1 w2 w3, but the optimal ordering is w1 w2 w3 w1 w2 w1. So, I guess, some extra logic was to be applied to priority queue :)

          • »
            »
            »
            »
            »
            »
            9 years ago, # ^ |
              Vote: I like it +5 Vote: I do not like it

            No, that ordering would be achieved if we just sorted the wash times once. When we pop the minimum from the priority queue, we add the new available time of this machine, which is X + WashTimeOfThisMachine. The priority queue takes care of putting this new element in the right place so that we keep getting the minimum.

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      I don't understand the last formula. Could you elaborate on that?

      • »
        »
        »
        »
        9 years ago, # ^ |
          Vote: I like it +12 Vote: I do not like it

        All driers spend the same amount of time D to dry an item. So they are indistinguishable and we will obviously put the first M washed items in the M driers.

        Note that we have the order of the times the items finish getting washed because we are getting the minimums from the priority queue. WashTime[i] < WashTime[i+1].

        So the first item will be washed and dried at time WashTime[0]+D. The second at WashTime[1] + D and the Mth at WashTime[M-1]+D.

        Since WashTime[i] < WashTime[i+1], WashTime[i]+D < WashTime[i+1]+D.

        When the (M+1)th item finishes washing, we will put it in the first drier because it is the one that is available first. The first drier is available at time WashTime[0]+D. So we either put item M+1 there at time WashTime[M] or WashTime[0]+D, whichever is greater.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +1 Vote: I do not like it

    I solved it using binary search to find the minimum time to wash everything. If this minimum time is X then I divide X by wi for all w to get the maximum loads to be inserted in a washer and store these in a vector. Then I sort them, and choose the first L values after sorting. If p=(L-1)%m, then this is the dryer number which will have the last load. Thus, I calculate minimum drying time of last load by operating on p-dryer for L/m times using formula time=max(current dryer time, load's wash time).

  • »
    »
    9 years ago, # ^ |
    Rev. 2   Vote: I like it +5 Vote: I do not like it

    I am just curious.
    Does somebody find the self-documenting style of code clear and intuitive?
    I try to write my code in that style, so it can serve a higher purpose and help someone struggling with thier's understanding (rather than just solve a problem). But recently I've heard an opinion that I am wrong and self-documenting code is more confusing than clear.

    For example, what do you people think of the solution to the problem.
    Do you find it clear or confusing?

    • »
      »
      »
      9 years ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      Only сoach could see your submission. Use ideone/pastbin. i think it is clear but too slow for competing.

      • »
        »
        »
        »
        9 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Thank you, I didn't know about that.

        As for the writing speed, I think that it shouldn't be a concern unless the coding speed is the bottleneck. I think that the first priority should be the clarity of a thought process and the clarity in the writing style could help in that endevour.
        But I may be wrong, that is why I am seeking for a feedback :)

        • »
          »
          »
          »
          »
          9 years ago, # ^ |
            Vote: I like it +5 Vote: I do not like it

          Look at the Petr submissions. Almost all of them are in self-documenting style. I enjoy reading his code and who can say that he is slow? :)
          One more thing, using short meaningless variables leads to hard debugging process, especially for big code. Also, you type large name only once, further intellisense helps you when you use some IDE. I hope I'll also use the same style soon.

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it +5 Vote: I do not like it

      For example, what do you people think of the solution to the problem.
      Do you find it clear or confusing?

      completely unreadable. Just use variable names introduced in problem description and i, j, k for array indices. Use comments if you need to clarify something.

      And, to be honest, nobody cares about your code unless you're red

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

When the round 2 will starts ?

»
9 years ago, # |
  Vote: I like it +21 Vote: I do not like it

I thought and coded for C and took me about 12 hourse -_- Then after I submitted 3 minutes before the contest end, I found my solution will give wrong ans for the following test case (and it will take two more lines to fix it)

1 2 4 2

I was devastated. But a while ago, I found out it had passed the test :)

Well, I felt after 12 hours of nonstop thinking, I at least deserved something.

»
9 years ago, # |
  Vote: I like it +44 Vote: I do not like it

The contest is in Gym: 2016 Facebook Hacker Cup, Round 1.

»
9 years ago, # |
Rev. 2   Vote: I like it +41 Vote: I do not like it

If anyone is interested, here is some statistic on the number of advancers:

2298 — 35 points (by new rules)

560 — 80 points(by old rules)

»
9 years ago, # |
  Vote: I like it +42 Vote: I do not like it

What the heck was going on with tests to B???

When I was downloading input, there was one announcement, that L<=1e6 not L<=1e5, but only this one (at that time n<=1e4). There was also constraint that m<=1e9. In test I download in all cases n,m<1e3 holds, seriously? Technically, this test fulfills all constraints, however cases with constraints at least close to their maximal value given by constraints should be present!

What is more, after some time constraint for n was enlarged to 1e5 and my friend got test with n=1e5 in his test (and I got all of them <1e3...).

In this particular task, I think that enlarging those constraints weren't that important, but I don't see any explanation for such behavior of organizers (both enlarging constraints during contest and not putting Maximal values of variables in tests).

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +16 Vote: I do not like it

    Yeah, the data weakness was my fault. To be clear, the bounds weren't changed during the contest, in the sense that people were already receiving files with the higher bounds (I just typo'd the number of zeroes in the constraints). Additionally, the large cases weren't forcibly included in everyone's input files.

    I'm sure there are a few people who wrote a slow solution but got lucky with a bad input file. Our precedent is to let those solutions stand, especially in a round with no fixed number of advancers. You can be sure I'll be quadruple-checking the Round 2 data.

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it +75 Vote: I do not like it

      Good to hear that you will carefully check constraints next time, however I am still not happy about the statement "Additionally, the large cases weren't forcibly included in everyone's input files." and as far as I understood you weren't referring to this. This probably holds for all other types of cases. I don't know what is the exact algorithm of choosing which tests to include, but you should ensure that whatever set of testcases is chosen it should test every possible aspect of solution. It is unacceptable that it is possible that guys A and B both have written solutions with significantly too slow time complexities or solutions which fail for an edge case n=1 and one of them will pass test and the second one will not because first guy got lucky with set of testcases he downloaded.

      • »
        »
        »
        »
        9 years ago, # ^ |
          Vote: I like it +14 Vote: I do not like it

        Swistakk,

        I fully agree. The way I would do it is to have several groups of tests (corner cases, maximum constraints, tricky cases, randomized test). Then sample several tests from each group, mix them in randomized way and use for testing. Sampling tests from the whole population (without grouping) is not fair.

        • »
          »
          »
          »
          »
          9 years ago, # ^ |
            Vote: I like it +16 Vote: I do not like it

          Yeah, this is exactly how we've set things up. The problem is that all of the tests were accidentally placed in the random bucket :)

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

How to solve problem C-Yachtzee ?

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    My approach: find cumulative sum of the costs. then for each cumulative sums from last to first find the disjoint sets of modulos that can appear. Actually there can be 3 types of them call them L, M , R and they are set by the total cost of all the steps.

    take the following test case:

    2 9 20

    8 2

    the total cost is 10

    the first number that is divisible by 10 after or equal to A=9 is 10

    so L is [9,10)

    the first number that is divisible by 10 before or equal to B=20 is 20

    So R is [20,20]

    and M is the range between L and R [10,20)

    So if we modulo it by 10 we can get the following ranges:

    1) From L: [9,10) 2) From M: [0,10) 3) From R: [0]

    However notice that the modulo range those are above or equal to 8 can be spent by the first cost as the programmer greedily spends his money from first to last step

    So the ranges are further edited.

    1) From L: [1,2) 2) From M: [0,8)[0,2) 3) From R: [0]

    From here you can calculate the expected value by calculating the length and midpoint of each ranges

»
9 years ago, # |
  Vote: I like it +45 Vote: I do not like it

I guess I got lucky :) Only submitted problem D solution which is basically wrong: start with identity permutation from 1 to N, and in every step swap a random element from the first half with one from the second half (doing this 2000000 times). It passed the tests.

http://attachment.fbsbx.com/hackercup_source.php?sid=960450640657968

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +9 Vote: I do not like it

    That's really interesting :)

    There aren't that many distinct tournament trees due to symmetry, so I guess your solution has a high probability to find the right answer in all tests.

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it +11 Vote: I do not like it

      It's interesting what's worst possible input for this solution.

      Best one I come up with is this (#1 needs a lot of luck to win this one). Still random is good enough to solve it.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    If even in Google Codejam finals tomek ALMOST got away with using simulated annealing... Why not? :)

  • »
    »
    9 years ago, # ^ |
    Rev. 3   Vote: I like it +5 Vote: I do not like it

    Even I did it in the same way! but my output failed.
    For N = 1, 2, 4, 8 we can just iterate over all permutations. Which fits very well in the 6 minutes time.
    But for N = 16, I used random_shuffle() doing around 70000 iterations. But sadly my output failed for 1 test case. Where as it worked for 80000 iterations. :(

  • »
    »
    9 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    This random has a bigger chances)))(отжиг)
    http://codeforces.net/gym/100875/submission/15430079

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

I almost qualified ;_;

One teeny tiny error held me back this time :(

  • »
    »
    9 years ago, # ^ |
    Rev. 4   Vote: I like it 0 Vote: I do not like it

    Don't worry, you aren't alone ))): http://codeforces.net/gym/100875/submission/15422585

    /*because of the absence of this condition (it should to be after the cycle ALSO, but not just in the cycle), my sol was wrong and i earned JUST 25 points*/ 
                    if (cnt > 4)
                        cnt = 1;
    
    • »
      »
      »
      9 years ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      I wrote

      if(m>=n)

      It should've been

      if(m>=l)

      I would've cried when I realized this if I was alone in my room at that time. But thanks for your consolation. It is always better to know you aren't the only one :)

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

For problem D, I wrote a backtrack + BPM solution. It seemed like a good idea when I was writing it :p

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

I was really confused to see so many people solving 2-3 problems. Though I thought the problems were standard. It is really weird to see almost 2000 people qualifying for the round 2. Now it makes sense after reading all the discussions above. :|