Have you been wondering how difficult Code Jam problems are on the Codeforces scale? Me too.
I made a simple estimate for the 2021 qualification round (link), and I plan to do the same for the upcoming rounds. Here I share the process and the results, and I welcome any feedback.
Data Cleaning
I use two data sources:
- CJ contest result data, downloaded using vstrimaitis's code (see details in his great blog post). From this, I get the list of contest participants and which problems they solved.
- CF user data, downloaded using the CF API. From this, I get the current rating of every CF coder.
I assume that many coders use the same username across different platforms. If for a given CJ contestant I find a CF user with the same name (case insensitive), I assume they are the same person. I assign to each CJ participant the rating of the corresponding CF coder, and I discard all other participants.
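A minimal sketch of the matching step, assuming the CJ participants are available as a list of names and the CF data as a handle-to-rating dictionary. The function name `match_ratings` and the sample handles and ratings are made up for illustration; this is not the actual script used for the post.

```python
def match_ratings(cj_names, cf_ratings):
    """Return {cj_name: cf_rating} for CJ names that have a same-named
    CF user, comparing names case-insensitively. Unmatched CJ
    participants are dropped."""
    # Index CF handles by their lowercase form for case-insensitive lookup.
    by_lower = {handle.lower(): rating for handle, rating in cf_ratings.items()}
    return {
        name: by_lower[name.lower()]
        for name in cj_names
        if name.lower() in by_lower
    }


# Made-up sample data, for illustration only.
cj_names = ["Alice", "bob", "Some.Real.Name"]
cf_ratings = {"alice": 1500, "Bob": 2100, "some_handle": 2500}
matched = match_ratings(cj_names, cf_ratings)
print(matched)
```

Here "Some.Real.Name" is discarded because no CF handle has the same name, which is the situation where a coder uses different usernames on the two platforms.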
Difficulty Estimation
The formula used by CF to determine the difficulty of a problem is not public. However, the main idea is that you have a 50% probability of solving any problem with difficulty equal to your rating. Some details here. So I divide contestants into buckets 100 rating points wide (coders rated 1450 and 1549 both fall in the 1500 bucket), and I look for the bucket with a 50% success rate. That's my estimate of the difficulty of the problem. I group together all ratings above 3000, and likewise all ratings below 500; otherwise the sample sizes would be way too small.
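The bucketing idea can be sketched as follows. This is an illustration under stated assumptions, not the exact code behind the numbers in this post: the names `bucket` and `estimate_difficulty` are hypothetical, ratings are rounded to the nearest 100, and "the bucket with a 50% success rate" is interpreted as the lowest bucket whose success rate reaches 50%.

```python
from collections import defaultdict


def bucket(rating, width=100, low=500, high=3000):
    """Round a rating to the nearest multiple of `width`, then clamp it
    so that everything below `low` or above `high` shares one bucket."""
    nearest = (rating + width // 2) // width * width
    return max(low, min(high, nearest))


def estimate_difficulty(results):
    """results: iterable of (rating, solved) pairs for one problem.
    Returns the lowest bucket with a success rate of at least 50%
    (an assumed reading of the rule), or None if no bucket gets there."""
    solved = defaultdict(int)
    total = defaultdict(int)
    for rating, ok in results:
        b = bucket(rating)
        total[b] += 1
        solved[b] += bool(ok)
    for b in sorted(total):
        if solved[b] / total[b] >= 0.5:
            return b
    return None


# Made-up sample data, for illustration only.
sample = [(1400, False), (1420, False), (1500, True),
          (1510, False), (1600, True), (1610, True)]
print(estimate_difficulty(sample))
```

With real data the success rate is not guaranteed to increase monotonically with rating, so picking the bucket closest to 50% or smoothing across neighbouring buckets are equally reasonable alternatives.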
Results
Out of the 37398 contestants who submitted something during the qualification round, 11109 have a CF user with the same name. Here is their success rate on the different problems:
Estimated Difficulty:
- A: <= 500
- B1: <= 500
- B2: <= 500
- B3: 2000
- C1: 600
- C2: 1400
- D1: 2400
- D2: 2700
- E1: 2600
- E2: 3000
Estimation issues
- Matching profiles across platforms by username is a bold assumption. I am discarding many coders, for example tourist, who competes as Gennady.Korotkevich in GCJ. And, even worse for the estimate, I am probably matching some profiles that belong to different people.
- This was a qualification round where you just needed to score a minimum number of points to pass, with little incentive to do more. Many strong contestants didn't seem to care about solving all the problems. See LHiC for example, who solved only problems E1 and E2. This lowers the problem success rate and inflates the estimated difficulty.
Any thoughts?