eatmore's blog

By eatmore, history, 4 years ago, In English

Google Code Jam 2021 has been announced. Here is the schedule:

The location of the finals will be announced later.

The dates of other Google competitions, Hash Code and Kick Start, has also been announced.

  • Vote: I like it
  • +118
  • Vote: I do not like it

| Write comment?
»
4 years ago, # |
Rev. 2   Vote: I like it +82 Vote: I do not like it

congratulations gennady for winning

»
4 years ago, # |
  Vote: I like it +18 Vote: I do not like it

Will their be system testing after wards?

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    A few of the problems have hidden testcases that will be revealed after the contest. (In particular, the second and fourth problem.)

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      If i fail at that hidden test case will i get any score as i have passed test1 and test2.

      Extra test will be added or not?(like codeforces)

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Please provide the setting (if possible) to change tabs character width, I like the editor otherwise. Of course, I don't complain that it doesn't support snippets :p

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

I think the hidden 1 point is the most expensive point to get for the amount of extra work required. Agree?

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +29 Vote: I do not like it

    Somewhat agree, but there is actually a nice solution to P2 that works for all subtasks without any extra work at all :)

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +16 Vote: I do not like it

      Yeah I didn't even know there was a greedy idea that worked only for positive weights till I opened the comments here.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    I didn't get a solution that works for the P2 but doesn't work for the hidden 1 point

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 4   Vote: I like it -117 Vote: I do not like it

      Deleted

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +25 Vote: I do not like it

        OMG, contest is still running, don't give any hint. Delete this reply now.

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it +1 Vote: I do not like it

          I thought the discussion of the qualification round was allowed?

          Google Code Jam Rules state that

          Sharing information with other contestants is permitted in the Qualification round only. Notwithstanding Section 7.1(E) of the Terms, you will not be disqualified for using information from or sharing information with others about problems during the Qualification Round of Code Jam.

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
            Rev. 2   Vote: I like it +20 Vote: I do not like it

            It is permitted, but it sort of diminishes the fun of the contest imo.

»
4 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

:)

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

How many point do we need to qualify? Is it decided after?

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +1 Vote: I do not like it

    30 points. It is mentioned in the rules and FAQ. You can have a look at that.

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

I'm getting this error in interactive problem on my laptop: "judge: Case #71 failed: Couldn't read a valid line". I checked and it's actually EOFError. I submitted my solution anyway and it got 18 pts (as of right now), so I guess it's correct, more or less. So why such error?

»
4 years ago, # |
  Vote: I like it +13 Vote: I do not like it

Are you sure that the start and end time for qualification round are correct?

The end time in the post is after ~15 hour. While Codejam page shows that the remaining time is only ~8 hours.

Please check this and update the post if there is an error. Some people may miss the round due to that.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    According to Code Jam Rules 3.1,

    Code Jam will start with the online Qualification Round on Friday, March 26, 2021 at 13:00 UTC and run for thirty (30) hours, ending on Saturday, March 27, 2021 at 19:00 UTC.

    So this post writes a wrong start and end time of qualification round.

    By the way, schedule of other rounds are consistent with the rules and schedule page of Code Jam.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +30 Vote: I do not like it

    Google changed the schedule of the qualification round after this post was published. I updated the post.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it -44 Vote: I do not like it

      Sir, please extend the deadline for Google Codejam, I registered for the event and didn't got any mail and thus missed the round, now just 7 minutes left and I saw this post on Codeforces and then came to know that the round is almost over, please extend it now. Please

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +14 Vote: I do not like it

        Bruh, it doesn't work like that. It is the world's most prestigious competitive programming championship, not gully cricket.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        How did your extended Google Codejam round go? :P

»
4 years ago, # |
Rev. 2   Vote: I like it +4 Vote: I do not like it

I think Google might need to apply a cheating detection algorithm to the Cheating Detection problem

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    To be fair, Cheating Detection is solvable by even someone with no competitive coding background if they think about the Math behind it for some time and just try out ideas that seem like they might work.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      I agree, however the pattern of submissions in the last 2-3 hours was pretty suspicious. Literally hundreds of new entrants submitting all 5 questions at the same time, in a very short time frame.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +23 Vote: I do not like it

    Note that discussing solutions is actually allowed for this round:

    7.1 Qualification Round. Sharing information with other contestants is permitted in the Qualification round only. Notwithstanding Section 7.1(E) of the Terms, you will not be disqualified for using information from or sharing information with others about problems during the Qualification Round of Code Jam.

    https://codingcompetitions.withgoogle.com/codejam/rulesandterms

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Very interesting! Thanks for sharing.

»
4 years ago, # |
Rev. 4   Vote: I like it +17 Vote: I do not like it

I've no proof to my solution of Cheating Detection but I did something which looked good to me and it worked :)

I took all the players having $$$>=5000$$$ score and sorted them in non increasing order of the solves. Assuming no cheating, I thought the adjacent players will have less mismatches because of similarity in value of $$$S$$$. Mismatch in a particular question is when score of player $$$i$$$ is different from score of player $$$j$$$. Calculated mismatch for all players. For players which are not on the end points, I took the average value.

I defined a function $$$f_i = a \cdot mismatch_i + b \cdot solves_i $$$.

I took the player having the maximum $$$f$$$ because there is good chance for the cheater to have good amount of solves and good amount of mismatches from its adjacents.

Tried bunch of values for $$$a$$$ and $$$b$$$ and it passed both sub-tasks with $$$a=5$$$ and $$$b=4$$$.

Solution (bit messy)

Sadly, I missed $$$3rd$$$ sub-task of Moons and Umbrellas and couldn't get perfect score. I should've tested it properly :(

»
4 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

How to "prove" the full solution to P5? I just tried an intuitive idea hoping it would pass Test Set 1 and got AC on both Test Sets.

I'm assuming there are several different approaches that works, so I'm going to roughly state my solution:

Let the easyness of a question be the number of people who solve it.

Then I just select the participant with the max variance of easyness over problems they solve.

I think its pretty intuitive why this is a fairly decent metric, but how do you even begin to prove that it will work on average at least 86% of the time with random data?

Was there some other idea that was easy to prove?

Code
»
4 years ago, # |
Rev. 2   Vote: I like it +26 Vote: I do not like it

Problems solutions:

A
B
C
D
E
  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Similar in E, divide success rate in 10% hardest questions by overall success rate and whoever has the highest is the cheater.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    In C's solution

    "If you reversed from last to beginning, you can reverse it again from beginning to last."

    Can someone please explain it a bit more

    Thanks!

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      For example, if we have initial array:

      1,2,3,4,5

      If we reversed 4th index till last:

      1,2,3,5,4

      If we reversed 3rd index till last:

      1,2,4,5,3

      We have an array of: 1,2,4,5,3

      Now, if we reversed from 3rd index till last:

      1,2,3,5,4

      If we reversed from 4th index till last:

      1,2,3,4,5

      We returned to the original array. Meaning, that this operation, is reversible so the problem turned from finding an array with C to find how many reverses you need to reach C(so it became really much easier like the first problem).

»
4 years ago, # |
Rev. 2   Vote: I like it +45 Vote: I do not like it

Thanks for the round! I think my E solution is considerably stronger than intended--I generated a suite of 2000 test cases locally and found, rather shockingly, that it had 100% accuracy on the sample. I also tested locally and found that it has 100% accuracy on the actual input.

For anyone curious, here's my approach. I sort the problems by solve counts and split them into 50 buckets of 200 problems each. Then, I assume the difficulty of problems in the i'th easiest bucket (using 0-indexing) is $$$-2.94 + 0.12i$$$. (In other words, I assume the buckets cover evenly distributed subintervals of $$$[-3, 3]$$$ and that the difficulty of each problem is approximated by the average difficulty in its bucket.) For each contestant, I iterate over all possible skill levels of the form $$$-3 + 0.06i$$$ (assuming that this will be a reasonable approximation for their actual skill), and I compute an error function (the sum of the squares of expected solve count minus actual solve count over the fifty buckets) for each player-skill level combination if the player is cheating and if they are not cheating. Then, I compute each player's minimum error function if they are cheating and minimum error function if they are not cheating, and I output the player who minimizes the quotient when the error function if the player cheats is divided by the error function if the player does not cheat.

Here's my code: https://pastebin.com/4Vy70PRg

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Wow, that distribution is way more accurate than me just getting mean values which still has 99.9% success rate(10 failed on 5K cases).

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      I'm impressed that you're able to achieve such high accuracy with a relatively simple approach--thanks for sharing your solution!

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +34 Vote: I do not like it

    If your success rate on 50% is very high, you can try evaluating accuracy when the cheater only cheats 10% of the time or something. My solution gets 1000/1000 with a 50% cheater, and about 80% accuracy on a 10% cheater. I bet you can do better though.

    Here's my approach: The "theoretically correct" solution is to directly apply Bayesian analysis and evaluate a 10100-dimensional integral of conditional probabilities, because you have a complete set of priors of the inputs. This seems pretty infeasible (though maybe someone knows how?), so one approximation you can take is assuming that the Bayesian posterior of the people/problem ratings are all very sharp peaks, and you just try to estimate the location of the peak. You can do this iteratively by alternating optimizing people-ratings and optimizing problem-ratings. Once you've estimated the maximum-likelihood ratings, you can then take your conditional probabilities: Bayes' theorem says that you should rank people based on the ratio of $$$P(given-scores | cheating) / P(given-scores | not-cheating)$$$, which is a ratio of sigmoids.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      My solution essentially has the same idea as yours, so I've tried my solution on a 10% cheater as you suggested. Although I get 100% accuracy on a 50% cheater, the result is around 5%.

      Can you share more details on how do you optimize people and question ratings? How many iterations do you do? I could afford only 10, otherwise, I got TLE.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I actually only do 1 iteration; I initialize the people ratings by taking the inverse of the sigmoid CDF of their solve rate (assuming questions are uniformly distributed), and then I ternary searched to find optimal problem ratings.

        Did you change your constants in the Bayes' rule evaluation with the updated cheat percentages?

        Code here: https://ideone.com/nRYKXS

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          You were right, I didn't change the constant in Bayes rule... Now the accuracy on 10% cheater is around 65%, which is not that bad, but far from your result.

          My solution is substantially a log-likelihood gradient ascent, starting from same yours initial values. Probably with a better tuning of the learning rate and squeezing more the time limit to make more steps I can improve the result, but definitely ternary search is better.

          Thanks!

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +25 Vote: I do not like it

      I used this Bayesian approach but with a simple one-step approximation for question difficulties like Geothermal. The algorithm is what Geothermal described except you replace the sum-of-squares error function with a log-likelihood. On 5000 test cases with a 10% cheater it passes 4156.

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

How to solve problem D for hidden test case ?

I used binary search and number of iterations will be around $$$log2(3)+log2(4).....+log2(50)$$$ in worst case which is >200 but since data in input are randomly made , I think average could be less and thus it might pass the hidden test case .

Also when hidden test case verdict will be out ?

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it -31 Vote: I do not like it

    What you did is binary search .but if think carefully you'll find that actually you can split list into 3 parts at the same time. Here's my code in python3

    t, n, q = map(int, input().split())
    for i in range(0, t):
        done = 3
        print(1, 2, 3)
        y = int(input())
        if (y == 1):
            x = 2
            z = 3
        elif (y == 2):
            x = 1
            z = 3
        else:
            x = 1
            z = 2
        ans = [x, y, z]
        next = 4
        f = 0
        while (next < n + 1 and f == 0):
            start=0
            e = 0
            end = len(ans) - 1
            while (e == 0):
                
                a=start+(end-start+1)//3
                mid = (a + end) //2 + (a+end)%2
                # print(a,mid)
                # print(ans)
                print(ans[a], ans[mid], next)
                y = int(input())
                if (y == -1):
                    f = 1
                    break
                elif y == ans[a]:
                    end=a-1
                    if(start-end==1):
                        ans.insert(start,next)
                        e=1
                    if(start==end):
                        end+=1
                elif y == ans[mid]:
                    start=mid+1
                    if(start-end==1):
                        ans.insert(start,next)
                        e=1
                    if(start==end):
                        start=mid
                else:
                    start=a+1
                    end=mid-1
                    if(start-end==1):
                        ans.insert(start,next)
                        e=1
                    if(start==end):
                        start=start-1
                    # if (mid - a == 1):
                    #     ans.insert(mid, next)
                    #     e = 1
            if (f == 1):
                break
            next += 1
        if (f == 1):
            break
        for i in range(0, len(ans)):
            print(ans[i], end=' ')
        print()
        y = int(input())
        if (y == -1):
            break
    

    Hope it helps

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it -19 Vote: I do not like it

      why is my comment downvoted?? what did i do wrong?

      • »
        »
        »
        »
        4 years ago, # ^ |
        Rev. 2   Vote: I like it +9 Vote: I do not like it

        I would assume it's putting raw code directly in a comment

        Instead of in a Spoiler
  • »
    »
    4 years ago, # ^ |
      Vote: I like it +31 Vote: I do not like it

    Hidden case verdicts are available now. The main idea of D is to solve the problem using insertion sort in $$$O(n \log_3 n)$$$. Essentially, if the position of an element $$$x$$$ to be inserted is bounded within the range $$$[l, r]$$$, query $$$x$$$ with the elements at positions $$$\frac{2l+r}{3}$$$ and $$$\frac{l+2r}{3}.$$$ This will result in a range roughly $$$\frac{1}{3}$$$ as large, achieving the desired $$$n \log_3 n$$$ complexity.

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 3   Vote: I like it 0 Vote: I do not like it

      Binary Search is actually sufficient as long as you ensure that a region of size $$$n$$$ is divided into at regions of size at most $$$\frac{n - 1}{2}$$$ and end the loop if you find the required point partway through.

      I'm not sure how to show this reduces it enough but it passed the hidden subtask and didn't exceed an average of $$$165$$$ queries over a few thousand runs locally with random input. I think the uniform random test generation plays a role in it.

      Code (ignore the unneeded messy endpoint conditions in check that aren't even needed)
    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +17 Vote: I do not like it

      Quicksort with two pivots also worked.

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it +10 Vote: I do not like it

    The search for each new element can be reduced to $$$\log_3(n)$$$ by splitting the array into thirds when you are querying for the place of a new element.

    As a more general tip, remember you are not doing a comparison (i.e. binary operation), your query operation has 3 different possible outcomes, so you should be able to break up the space of possibilities into 3 different partitions on each query.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      gaurav,Geothermal,Kognition log2(n) didn't worked :( . But i understood the log3(n) part .thanks i understood that i should have queried at points l + (r-l)/2 , r — (r-l)/2 for getting 1/3 of length in each iteration .

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I applied binary search only with some some modifications to decrease queries per iteration and I was able to pass the last sub-task. I tested it locally and it was taking $$$155-175$$$ queries on an average.

        Code
        Code for testing
        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Thanks for your code , could you please write in words how you modified the binary search ? It's little hard to read the code directly without any written info .

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
            Rev. 4   Vote: I like it 0 Vote: I do not like it

            Here it goes-

            You can easily find the relative position of $$$3$$$ elements with just $$$1$$$ query. Now, I try to place elements from index $$$4$$$ to $$$N$$$ in already existing list one by one.

            Let's say during the current iteration, size of already existing list is $$$l.$$$ I define $$$start=0$$$ and $$$end=l-1.$$$ For insertion of new element, it can have potentially have $$$l+1$$$ positions ($$$l-1$$$ in between the elements and $$$1$$$ before start and $$$1$$$ after end). Define $$$mid=(start+end)/2.$$$

            To find the appropriate position-

            I ask $$$(start,mid,i)$$$ where $$$i$$$ is the index of current element we want to place. You can also ask $$$(end,mid,i)$$$ depending on your implementation.

            Now, we want to analyze how our search space will be reduced.

            1. If median is $$$start,$$$ we can stop our search and directly place $$$i$$$ before $$$start.$$$

            2. If median is $$$mid$$$, we can be sure that position of $$$i$$$ is definitely not before $$$mid.$$$ We can initialize $$$start=mid+1.$$$ This way we reduce the search space by half.

            3. If median is $$$i,$$$ we can be sure that position of $$$i$$$ is definitely not after $$$mid.$$$ We can initialize $$$end=mid-1.$$$ This way we reduce the search space by half. Also, we can increase $$$start$$$ by $$$1$$$ because we are also sure that the position is definitely after $$$start$$$, otherwise the result of the query would have be $$$start.$$$

            Also, I handle the cases when search space reduces to $$$3$$$ or less separately because I don't wanna mess up and end in some infinite loop.

            The total queries it takes is $$$1$$$ + $$$\Sigma^{N-1}_{i=3} (\log_2i-c)$$$ where $$$c \sim 1$$$. That $$$c$$$ is due to above mentioned few optimizations and the fact that input is random.

            Also, I tested it locally and it takes $$$150-175$$$ queries. My initial solution was taking around $$$\sim 250$$$ queries because I was extra cautious and assigned $$$start=mid$$$ and $$$end=mid$$$ in case $$$2$$$ and $$$3$$$ respectively and did $$$1$$$ more query in case $$$1$$$ to confirm.

            • »
              »
              »
              »
              »
              »
              »
              4 years ago, # ^ |
              Rev. 2   Vote: I like it 0 Vote: I do not like it

              Also, you can compare the both codes for better understanding-

              ~250 queries
              ~170 queries
»
4 years ago, # |
  Vote: I like it -8 Vote: I do not like it

I have a nice recursive solution for $$$Median$$$ $$$Sort$$$. Inititially, call the function $$$sort(a[1,2,...,n])$$$. For all $$$3 \leq j \leq n$$$, query $$$a[1]$$$ $$$a[2]$$$ $$$a[j]$$$. From the output of the query, one can determine the position of $$$a[j]$$$ in the array, i.e. $$$left$$$, $$$middle$$$ or $$$right$$$ w.r.t $$$a[1]$$$ and $$$a[2]$$$. After this one can recursively solve $$$left$$$, $$$middle$$$ and $$$right$$$ parts. But the problem now is the three parts can be independently in correct or reversed order. We can determine the order by querying one of the elements of each part along with $$$a[1]$$$ and $$$a[2]$$$. I don't have a strict upper bound on the number of queries this program asks, but it asks about 165 queries per test case (tested locally on random tests).

My submission
  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I've implemented the same solution, except that instead of using the first 2 elements in the array, I choose them randomly. This because it's easy to find test cases where your solution fails to use less than 175 queries, even on average (i.e. array already sorted). I think that if the problem was in a CF round somebody was going to hack you. Probably CJ test cases were weak enough to pass.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Nope. It doesn't matter if you take random elements. Because of this line in the problem statement: The order for each case is generated uniformly at random from all possible orders, and independently of any other information.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I've missed that line in the statement, I'm sorry :(

        But still, I think it's important to keep in mind that kind of solution are easily hackable in CF

»
4 years ago, # |
  Vote: I like it +21 Vote: I do not like it

Here's a link to my screencast: https://youtu.be/RThkP1UORhk. It'll be available in HD whenever the HD version finishes processing.

»
4 years ago, # |
Rev. 2   Vote: I like it +19 Vote: I do not like it

Seen lots of good solutions for E.

Here is how I went about the problem — note that the coin toss compresses the sigma function. If we compare a 'clever' person (let's say skill in the top 10%, i.e. > 2.4) with a cheat of average intelligence (0), we find the biggest gap between their probability of success lies in the middle questions.

So what worked for me was to take the top 5% of questions (0-499 in lowest solves list), find the 10% (i.e. 10) of people with the most solves on those questions, and then subtract their solves from the 'middle' 5% of questions (4750-5249 in the same list). Whoever has the biggest difference is the cheat.

clevercheat

»
4 years ago, # |
Rev. 3   Vote: I like it +13 Vote: I do not like it

Question E — For every contestant, I calculated the Pearson correlation coefficient with

  • whether the contestant got the question correct for each question
  • how many contestants got the question correct for each question

I guessed that the cheater is the one with smallest Pearson correlation coefficient. After the qualifiers, I found out that my answer was accepted on the borderline (with 7 mistakes out of 7 allowed mistakes).

Spoiler
  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it +5 Vote: I do not like it

    I did the exact same thing, had no idea it was that close to failing :o

»
4 years ago, # |
  Vote: I like it +33 Vote: I do not like it

I share also my solution for problem E, since nobody has discussed this approach yet.

I've got 100% accuracy in problem E by implementing a gradient ascent for finding the maximum likelihood estimation of $$$S_i$$$ and $$$Q_j$$$, ignoring the presence of a cheater. I've then chosen as a cheater the student that increases most the likelihood if the cheater-distribution is used.

Here is the code

»
4 years ago, # |
Rev. 4   Vote: I like it +25 Vote: I do not like it

For E, sort the questions by number of solutions and for each player compute average squared distance (in the sorted question list) for all his answers from his median position.

Seems to work pretty well, 5 errors on 25,000 runs. (99.98% accuracy)

python code
»
4 years ago, # |
Rev. 12   Vote: I like it +10 Vote: I do not like it

Here is a simple solution for E which passed both test cases:

Sort the problems by how many people solved them, and sort the players by how many questions they solved.

Let's say the "Difficult Problems" are the hardest 500 problems.

  1. If there is anyone ranked in the bottom 80 who solved more than 200 Difficult Problems, I say that they are the cheater.
  2. If not, then I say that whoever solved the most Difficult Problems is the cheater.

This solution tries to catch lower-skill cheaters with (1), and catch higher-skill cheaters with (2).

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

In problem D, Median Sort, I just got stuck to one idea ie using a BST, ultimately I couldn't get it right. Can anyone share insights if it's possible to solve it using BST?

»
4 years ago, # |
  Vote: I like it -17 Vote: I do not like it
O(N) solution for C
»
4 years ago, # |
  Vote: I like it +53 Vote: I do not like it

the discussions on problem E makes it sound like it is a hashcode problem: "yeah I don't know how it works, but here is my heuristic"

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

In country standing of codejam, Iran is missing. There are strong coders from Iran in codeforces community, then why the country name is gone?

»
4 years ago, # |
  Vote: I like it +43 Vote: I do not like it

Here's a link to my [SPANISH] screencast + detailed explanation of problems

https://www.youtube.com/watch?v=hntk3GShfik

Fun fact: Problem D was actually almost identical to problem "Median" from IOI 2000, China!

I really liked all the Qualification Round problems this year. Congratulations to all the problem setters!

»
4 years ago, # |
Rev. 2   Vote: I like it +28 Vote: I do not like it

Another simple approach for E which also achieves a 100% accuracy on 25000 local tests:

  1. Estimate player $$$i$$$ skill level ($$$S_i$$$) as the sigmoid inverse of its ratio of correct answers.
  2. Estimate question $$$j$$$ difficulty ($$$Q_j$$$) as the sigmoid inverse of its ratio of failed answers.
  3. Loop through the questions answered by a player: If the probability of a player answering the question correctly is high (e.g., > 0.75) but the player has failed the question, then add $$$S_i - Q_j$$$ to the suspiciousness of player $$$i$$$.
  4. The cheater is the player with the highest suspiciousness.

The code is really simple:

C++ source code

Fun fact: During the round I wrongly estimated players' skill level and questions' difficulty, using a linear function between -3 and 3 instead of the sigmoid inverse. However, this was still enough to pass test set 2 with just 3 wrong cases.

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it +30 Vote: I do not like it

    I see that your approach works very well, but I don't believe the inverse sigmoid accurately estimates the skill/difficulty level of contestants/problems. The inverse sigmoid will calculate the estimated skill of a contestant assuming all questions have difficulty 0, rather than from the uniform distribution [-3,3].

    Instead, i think what should be done, is to integrate across the possibility problem difficulties for each contestant

    $$$p = \frac{1}{6}\int_{-3}^{3}\frac{1}{1 + e^{q-s}}\,dq$$$

    Solving this leads to this formula for the skill level of the contestant given the percent of correct answers $$$p$$$

    $$$s = 3 + ln(\frac{e^{6p} - 1} {e^6 - e^{6p}})$$$

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +10 Vote: I do not like it

      Absolutely. I did the same and obtained 999 correct tests on 1000 random tests with a probability cheating of 50%.
      In order to stress the method, I used a 10% probability of cheating (as suggested by ecnerwala) and obtained 801 correct tests on 1000 random tests.
      I don't see any theoretical justification for using the inverse sigmoid method. This method actually underestimate the real skill by 30% to 40% on my tests.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      Thanks, you are absolutely right!

      I changed the estimation to use you formula and now skill and difficulty estimations are considerably more accurate. However, using these improved estimations, the approach performs slightly worse overall (98% accuracy).

      I do not understand why the previous version worked so well...

      Just as a note: In the official Google analysis, they also state that skill/difficulty can be estimated by using the sigmoid inverse.

»
4 years ago, # |
  Vote: I like it +3 Vote: I do not like it

When does google code jam usually start? I want to put a reminder so that I don't miss registering for it next year.

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Apparently they are running a poll on twitter to decide design of this year's codejam. People here would be interested in voting.
https://twitter.com/googlecodejam/status/1379150242579410945

P.S. — I didnt liked any available design. :(

»
4 years ago, # |
  Vote: I like it +5 Vote: I do not like it

Why codejam lost so much hype? I see just one small discussion on this site, ~10k participants (2 times lesser than codeforces div2 round), just 54 points was enough to advance.

BTW why so weird point distribution? I thought that C2 was harder than A2 or B3...

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +9 Vote: I do not like it

    Maybe there was some discussion during the contest instead again ;)

    I am curious what was the worst case in B3, i.e. the maximal difference between sum and the answer.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +34 Vote: I do not like it

      Or maybe round 1A got missed by many due to late night/early morning time

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Here are the outputs, including differences between sum and answer, and the maximum difference at the bottom.

      https://paste2.org/BjBcKD2L

      The answer is 3025 (see bottom of file), so they engineered a test case to be at the limit, which makes sense.

»
4 years ago, # |
Rev. 3   Vote: I like it -10 Vote: I do not like it

Round 1A: Reading the editorial for the "Hacked Exam" problem, I think that maybe they overcomplicated the solutions for test sets 1 and 2 (or I'm completely wrong?). I just figured that, with T/F questions, adding a second student with a lower score doesn't change the expected value. So, the answer is to just replicate the best student (or the worst one inverted, for that matter). Does it make sense or I just got lucky to have passed with this idea? Thanks for any help! And please follow the code below.

string inv(string a) {
    string r;
    trav(x, a) {
        if(x == 'T') r += 'F';
        else r += 'T';
    }
    return r;
}

void run_test() {
    int n, q; read(n, q);
    int mx = 0;
    string ans;
    rep(k, n) {
        string a; read(a);
        int s; read(s);
        if(s <= q / 2) a = inv(a), s = q-s;
        if(s > mx) {
            mx = s;
            ans = a;
        }
    }
    cout << ans << " " << mx << "/1" << endl;
}
  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    This solution was mentioned in the editorial as the "insight-based solution".

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Round 1A upsolving screencast with a lot of commentary https://youtu.be/RK-NFGYF-wY

»
4 years ago, # |
Rev. 2   Vote: I like it +10 Vote: I do not like it

Not sure where to ask this (can't find a R1B thread), so I'll just put it here.

Are you allowed to unofficially participate in Round 1B if you qualified off Round 1A already? I.e. will I be able to look at the problems and make submissions during the contest, or is my account barred from participation during contest?

»
4 years ago, # |
  Vote: I like it +1 Vote: I do not like it

I don't know where to ask , that is why I am asking here Does Google run Plag checker in the middle of the contest ? I am asking this because my rank is decreasing , even though I am not submitting any thing .

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it +13 Vote: I do not like it

    No, people above you are submitting their solutions again. The tie is broken by the last successful submission(source).

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +3 Vote: I do not like it

      Okkk , I didn't knew it , Now it looks like I did good by not taking chances for trying to solve hidden test cases .

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +21 Vote: I do not like it

        I afraid that's not correct.

        Submitting solutions again, increases the penalty only when the new solution got more points than the previous one.

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          I too think that the rules are as u mentioned.

»
4 years ago, # |
  Vote: I like it +7 Vote: I do not like it

Another day, another lose!

»
4 years ago, # |
  Vote: I like it +57 Vote: I do not like it

From qualifying with a 600ish rank in 1A last year, to not qualifying at all this year, oof

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +27 Vote: I do not like it

    Telegramjam!

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +2 Vote: I do not like it

    May have chance of qualification considering level of plagiarism in previous round.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    Ranks were improved last time. So, maybe you can as you're close 1500.

    Btw, does anyone knows by how much ranks were improved and at which rank?

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it +22 Vote: I do not like it

    That's funny given your level.

    Did you really fail all three attempts or skipped two and expected to pass in the remaining one and it didn't work out?

    Just wondering.

»
4 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Can someone help me with this, please

If one of your tickets is strictly closer to c than all other tickets

$$$OR$$$

if both of your tickets are the same distance to c and strictly closer than all other tickets, then you win the raffle.

Isn't the portion after or says the same thing as that of the portion before or?? Or it gives some extra condition to win?Can someone please explain it, I only took the first condition that at least one of my selected ticket should be closer to c than any of the given tickets, and it gave WA, also I saw the test data but can't figure out what went wrong

Thanks

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it +12 Vote: I do not like it

    "all other tickets" includes the other one of your two tickets, so the second portion is needed.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I'd point out that the word "strictly" wasn't there in the problem statement initially. It read:

    "If one of your tickets is strictly closer to c than all other tickets or if both of your tickets are the same distance to c and closer than all other tickets, then you win the raffle."

    Notice the missing strictly. I even raised a question about this and got a very general response, they could have just said that the statement was updated :/

»
4 years ago, # |
  Vote: I like it +29 Vote: I do not like it

The emails for claiming your GCJ T-shirts are out if you've qualified to Round 3. The deadline to claim it is June 13 — when I told some friends, it seems it had gone to their spam folder, so it's best to check ASAP since the deadline is this close.

»
4 years ago, # |
  Vote: I like it +9 Vote: I do not like it

I haven't received any mail for GCJ tshirt(checked spam folder as well) eatmore

»
4 years ago, # |
  Vote: I like it -18 Vote: I do not like it

I haven't received any mail for GCJ tshirt(checked spam folder as well) eatmore

»
4 years ago, # |
  Vote: I like it -9 Vote: I do not like it

I haven't received any mail for GCJ tshirt as well. eatmore

»
4 years ago, # |
  Vote: I like it +5 Vote: I do not like it

Is there anyone who has received their Code Jam T-shirt now?

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +16 Vote: I do not like it

    Nope, they didn't even send tracking information even though I confirmed the order about a month ago.