Hello Codeforces,
This is my first blog on Codeforces, quite excited :) (hopefully this won't be the last).
I wanted to share this ML-based application that I created, which predicts a user's rating using Linear Regression, based on the following factors:
- Number of Problems Solved
- Average Rating of Problems Solved
- Registration Date
Application Link : CF-Rating-Predictor
Github Link : Github-Repo
This is my first time actually implementing anything ML-related, so sorry in advance in case I got something wrong ;-;
Thank you for your time ^_^. If you have any suggestions for improvements, do let me know :)
It estimates me at 1434 lmao.
It's a bit inaccurate when a user's growth is exponential, like in your case, or for people with a 3000+ rating. I'll try to make it better by adding more variables and making the dataset larger; currently it's only 5000 users (it took me 5 hours just to load 4000 users' data from CF :/ ). Let's see how much more accurate I can make it.
it's so over
But I'm kinda confused as to what this is predicting. Is it trained on stuff?
Yeah, I trained it on 5000 random users from Codeforces and tried to avoid any IDs that could potentially have been alts. The variables are x = [number of problems solved, avg rating of problems solved, registration date], and y is the rating of the user. Then, given the same three values for the handle in the input, it predicts that user's rating. In your case, you have solved plenty of high-rated problems, so it went a bit overboard XD.
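A minimal sketch of the kind of fit described above, not the app's actual code. It assumes plain ordinary least squares on the three features, with registration date expressed as years since registration. The helper names `fit_linear_regression`, `predict`, and `toy_rule`, and all the toy numbers, are made up for illustration; the toy ratings are generated from a known linear rule so the fit can be checked.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    X is a list of feature rows; an intercept column of 1s is prepended.
    Returns the weights [intercept, w1, w2, ...].
    """
    rows = [[1.0] + [float(v) for v in row] for row in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(weights, features):
    """Apply the fitted weights to one feature row."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))


def toy_rule(solved, avg_rating, years):
    """Hypothetical linear rule used only to generate toy ratings."""
    return 100 + 0.2 * solved + 0.8 * avg_rating + 10 * years


# Toy users: [problems solved, avg rating of solved problems, years registered].
toy_X = [
    [120, 1000, 1.0],
    [300, 1200, 2.5],
    [450, 1100, 4.0],
    [800, 1500, 3.0],
    [1000, 1400, 6.0],
    [1500, 1900, 5.0],
]
toy_y = [toy_rule(*row) for row in toy_X]

weights = fit_linear_regression(toy_X, toy_y)
estimate = predict(weights, [600, 1300, 2.0])
print(round(estimate))
```

Real data is noisy, so the fit would not recover a rule exactly like this; a library such as scikit-learn would normally replace the hand-written solver.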
Interesting, it seems to be pretty accurate in a lot of cases ($$$\pm 150$$$ or so). Though it gave you an estimated rating in the $$$1500$$$s. Very cool stuff.
Yeah, I was also disappointed with the rating it predicted for me lmao, that just means I need to solve higher-rated problems XD
I actually wonder if it says something about $$$IQ$$$. Like users with lower predicted rating than true rating have higher $$$IQ$$$ and vice versa. Cuz they were able to do worse/better with the same amount of problems solved. The only other thing I've found that has something like this is the graph from this study (which is probably better at predicting $$$IQ$$$ than just flat out rating), but yours accounts for difficulty, so it could potentially be better.
Yeah, could be, but there is also the fact that some people use other resources for practice. For example, 1-2 months ago I started doing the CSES sheet (pretty dope sheet in my opinion, give it a try if you haven't).
I was also thinking of using clustering to create subgroups based on some factors that could be related to IQ, and then performing the regression separately for people with similar IQ. In my opinion this would be a bit more accurate. Let's see, I'll try this as well if I get the time :)
Somehow got a 1637 expected rating. Let’s hope I can reach expert this year :)
All the best, hope you reach expert soon :)
Can you try different models besides linear regression? Would be interested in which model is the most accurate.
Yeah will do, will have to learn a bit more ML first XD, but will certainly post a blog again once I do :)
For my friend, who reached master, the predictor gives the expected rating of 1573. Was he really that lucky lol?
Nobody reaches master out of luck
I did not just get called incompetent by an AI (again)
Jokes aside cool project
Thanks :)
Sadly I think it's not that great :( For tourist it shows an expected rating of around 3100, even though he has taken 1st place many times.
Yeah well, people with 3000+ rating are too good to be predicted by AIs :(
It predicts me at 2017, but I think that's partly because it takes the registration date into account. It would be better to use the first rated contest date instead of the registration date imo. I started doing contests almost 3 years after registering, so those 3 years mean nothing.
I also think it should take the recent contest performances as well. Interested to see how it performs for someone with weird rating graphs like mine.
Yeah, that's a great idea. I think it might be better to take the number of active days into account, since people take breaks and all. Will try this :)
It says my expected rating will be only 2132. It's interesting, since I've never reached ~2100 rating before.
Could it estimate future rating? You could use data as of x months ago as predictors and current rating as the target
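A rough sketch of how such a training pair could be built, assuming each user's solves are available as (timestamp, problem rating) pairs. The helper names `features_as_of` and `lagged_example`, the `MONTH` constant, and the toy numbers are all hypothetical.

```python
def features_as_of(solves, cutoff):
    """Count and average rating of problems solved at or before `cutoff`.

    `solves` is a list of (solved_at_seconds, problem_rating) pairs.
    Returns None if the user had solved nothing by then.
    """
    past = [rating for solved_at, rating in solves if solved_at <= cutoff]
    if not past:
        return None
    return len(past), sum(past) / len(past)


def lagged_example(solves, current_rating, now, lag):
    """Pair features from `lag` seconds ago with the rating held now."""
    features = features_as_of(solves, now - lag)
    if features is None:
        return None
    return features, current_rating


MONTH = 30 * 24 * 3600  # rough month length in seconds
now = 12 * MONTH

# Toy user: four solves spread over a year, current rating 1450.
toy_solves = [
    (1 * MONTH, 800),
    (3 * MONTH, 1200),
    (8 * MONTH, 1600),
    (11 * MONTH, 2000),
]

# Predictors use only what was solved up to 6 months ago.
example = lagged_example(toy_solves, current_rating=1450, now=now, lag=6 * MONTH)
print(example)
```

Only the two solves from the first six months feed the predictors, so a model trained on pairs like this never sees information from the period it is asked to forecast.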
Yeah, will try that, I even remember a blog where a guy would manually predict everyone's future rating and he ended up being quite accurate XD
If my max is 1648, obviously someday my rating will be 1680, so why do I need a predictor for that?
I got a lower rating than my max lol
Propaganda
How do you pick the factors?
I plotted every candidate factor against rating, and these 3 factors (number of problems solved, avg rating of problems solved, registration date) had the most linear relation with rating, so I chose them. Before this I had tried using only the number of problems solved, and there was no relation at all between rating and the number of problems solved, in a way suggesting that problem quality matters more than quantity.
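The selection above was done by eye from graphs. One way to put a number on "most linear" is the Pearson correlation coefficient, sketched below as a swapped-in technique rather than what the app does. The function name `pearson` and all the toy values are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: +1 or -1 means perfectly linear, 0 means no linear relation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy data for five hypothetical users.
ratings = [900, 1200, 1400, 1600, 1900]
candidates = {
    "avg_problem_rating": [1000, 1200, 1350, 1500, 1800],
    "problems_solved": [800, 150, 1200, 300, 600],
}

# Rank candidate factors by the strength of their linear relation with rating.
ranked = sorted(
    ((name, pearson(values, ratings)) for name, values in candidates.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name}: r = {r:.2f}")
```

In this toy data the average problem rating tracks user rating closely while the raw solve count barely does, which mirrors the quality-over-quantity observation.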
I don't see the point of predicting someone's expected rating, like
Does it motivate someone to grind harder? No
Does it give them insights on how to improve? No
Can they draw concrete conclusions from your output? No
Up to now I have just made this for fun and found it quite intriguing, which is why I shared it. I could try adding insights; one could be something like telling users what rating of problems to solve in order to progress. I'll try this as well, thank you for your input :)
Is it fun? Yes.
Sorry guys, I should have been 1700+
How long will it take users to reach the estimated rating? It might make more sense if you could show how long it will take users to reach the estimated rating!!
It isn't really a future rating predictor. It guesses your current rating, without seeing your actual rating, based on the number of problems solved, the average rating of problems solved, and the date you registered. I will try making a model which predicts your future rating and will let you know when I make it :)
ohk.. gotcha!! then it's pretty good!!!
I think it would be better to show max rating as the actual rating, rather than current rating, and then calculate the error.
Expected 2080 :) Would love to reach that
lol
I got a 500 Internal Server Error when I entered a username that had a negative rating.
hmmm...Interesting!