Hi everyone,
My friends and I are trying to teach machines to solve problems on their own.
To do that, we need a very large set of (problem statement => solution) pairs.
We have already crawled almost every site where solutions are public. Now we are trying to collect solutions from sites where they are private.
If you have solved problems on Timus, UVa, or any other platform where solutions are private, and you don't mind giving me access to crawl your solutions, please send me the logins/passwords for your accounts on those systems in a private message.
We will not publish the code anywhere, and we will not use the accounts for anything other than downloading the solutions.
Also, a reminder that we run a platform where we ask people to rewrite problem statements in a short form. We pay for this; many participants earn about $12/hour. The platform is available at the link:
For those who are preparing for ICPC and therefore can't work full time, this can be very convenient.
That's a very bold thing to do. I wish you luck, but I (and many other people) think that science is not ready for such a thing. Still, fortune favours the brave.
Yes, people are divided into those who think it's a decade away and those who think it is around the corner. I belong to the latter group :)
Both a friend of mine and I have left our day jobs to work on this full time, so we are quite invested in our belief :)
Does UVa even store solutions?
You are right, apparently it doesn't.
We haven't found any UVa accounts yet, so it didn't come up before.
I just registered on the platform and I see that we have to earn points on it. Do more points mean more payment, or is it just a sort of recruitment test?
It's directly proportional to the payment, 20K points = $20
Maybe I'm a bit of a Luddite, but I don't really understand the drive to automate relatively interesting things while so many uninteresting things are still not automated. I don't really want to live in a future where AI solves problems on CF but the floors at McDonald's are still mopped by a cleaning lady.
On the other hand, teaching a machine to mop floors isn't very interesting, while teaching it to solve problems is.
It seems like both are machine learning, and IMHO you can earn more from robots that mop floors well. Well, maybe mopping floors is a bit harder to teach, I don't know :trollface
Once we solve programming, automating the cleaning lady will be a piece of cake.
It doesn't work the other way around.
Do you crawl only the AC solutions? Have you crawled platforms like SPOJ? They don't seem to have public solutions.
Yes, SPOJ doesn't have public solutions, so we need access to individual people's accounts to crawl them.
You could ask the SPOJ folks whether they can give you access to solutions for research purposes. They might allow it. I have set around 40 problems on SPOJ. I will try to ask them whether I am allowed to crawl/store those submissions.
I am working on a messenger bot. Please send me your facebook account's details (e-mail, password, etc), I won't publish them.
Many solutions to UVa problems are available on GitHub, so you could crawl that for solutions, though I don't know how you could verify them for ACness.
Anyway, good luck with your research.
Why not share a script that allows you to scrape your AC solutions from the respective judges and upload them to your website instead of asking users to share their passwords, something that most people likely won't do?
I originally wanted to publish the crawlers and do it the way you described, but in 2017 it's very hard to distribute code that is supposed to be run locally. People have very different setups.
Out of curiosity, what would stop you from sharing your Timus account?
A possible way that this could harm a user is this:
Let's assume that somebody has similar passwords for e.g. VK and Timus. If they don't change their password on Timus before sharing it, they are opening up a potential attack vector for their other accounts if your database gets hacked.
This is a valid point.
However, anyone who would spend effort to download a crawler and run it on their machine would probably also be willing to spend time to change their password.
Would you mind changing your password on Timus and sharing your account with me? :)
curl | sh
is a really easy distribution method (even though still unsafe) for pretty much any programmer running a Unix-like OS, and
iex (New-Object System.Net.WebClient).DownloadString('http://domain/script.ps1')
is a similar alternative for users using Windows, so you actually don't have to spend much effort to download a crawler. Assuming that your crawler doesn't have many dependencies, this should work immediately.
I personally would be way more hesitant to run
curl | sh
than to share my Timus account :)
Also, on curl | sh: https://www.idontplaydarts.com/2016/04/detecting-curl-pipe-bash-server-side/
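One way to soften the curl | sh worry is to save the script first and compare its SHA-256 digest with one the crawler author has published through a separate channel, then run it only on a match. The sketch below is just an illustration of that idea, not anything proposed in this thread: the published digest and the names `sha256_hex` and `is_trusted` are assumptions.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the downloaded bytes as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def is_trusted(script: bytes, expected_hex: str) -> bool:
    """Check a saved script against a digest obtained out of band.

    expected_hex is assumed to come from a channel other than the one
    that served the script (hypothetical: nothing like it exists here).
    """
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(sha256_hex(script), expected_hex.lower())
```

This only shows that the bytes on disk are the bytes the author meant to ship; it says nothing about whether the script itself is safe to run.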
There is a small suggestion from my side here. If at some point during your research you conclude that the task at hand is intractable, you could consider the relatively easier problem of predicting tags for a question. Tags could be like "Segment trees", "DP", "Math", etc., and the features could be the constraints along with any other information you could extract using NLP.
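To make the tag-prediction idea a bit more concrete, here is a minimal sketch: a tiny naive Bayes classifier over the words of a statement, plus one coarse feature for the largest number that appears in it (a stand-in for "the constraints"). Everything here is an assumption for illustration: the names `tokenize` and `TagPredictor`, the digit-count feature, and the toy training data. A real attempt would need far more data and better features.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(statement):
    """Lowercase word tokens plus a bucket for the largest number seen."""
    tokens = re.findall(r"[a-z]+", statement.lower())
    numbers = [int(n) for n in re.findall(r"\d+", statement)]
    if numbers:
        # Hypothetical constraint feature: digit count of the largest number.
        tokens.append(f"max_digits_{len(str(max(numbers)))}")
    return tokens


class TagPredictor:
    """Multinomial naive Bayes with add-one smoothing, one tag per statement."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # tag -> token counts
        self.doc_counts = Counter()              # tag -> number of statements
        self.vocab = set()

    def train(self, examples):
        for statement, tag in examples:
            tokens = tokenize(statement)
            self.word_counts[tag].update(tokens)
            self.doc_counts[tag] += 1
            self.vocab.update(tokens)

    def predict(self, statement):
        tokens = tokenize(statement)
        total_docs = sum(self.doc_counts.values())
        best_tag, best_score = None, -math.inf
        for tag, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[tag].values())
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                count = self.word_counts[tag][tok] + 1  # add-one smoothing
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag


# Toy data, made up for this sketch.
EXAMPLES = [
    ("Given an array of 100000 integers, answer range sum queries with updates", "Segment trees"),
    ("Answer range minimum queries on an array with point updates, n up to 200000", "Segment trees"),
    ("Count the number of ways to climb stairs of 1000 steps modulo a prime", "DP"),
    ("Find the minimum cost to partition a sequence, count ways with n up to 5000", "DP"),
]

predictor = TagPredictor()
predictor.train(EXAMPLES)
```

Calling `predictor.predict(...)` on a new statement returns the tag with the highest smoothed log-probability. Problems usually carry several tags, so a more realistic version would score each tag independently rather than pick a single winner.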
Are you using NLP to solve the CP problems? Either way, that's a great breakthrough, man! Keep working on it! :)
Is the website still working? I keep getting 504 Gateway Time-out.