Rogue_Ronin's blog

By Rogue_Ronin, history, 11 hours ago, In English

So DeepSeek launched their R1 model, which they claim is on par with OpenAI's o1.

More details here: Link to X post

This is a big update for CP, as the model is open-sourced and the chat can be accessed for free. As for the R1 model itself, I think it's great; in particular, the chain of thought feels like that of a human.

I tested it out on 6 different problems in total: 3 problems of 1700 rating, 1 problem of 1800, 1 of 1900, and one unrated problem from a recent contest (Div. 2, 996).

This is what happened:

  • Problem 1 — Solved on the first attempt.
  • Problem 2 — Also solved on the first attempt.
  • Problem 3 — Took an extra prompt to solve.
  • Problem 4 — Unable to solve after 5 attempts.
  • Problem 5 — Unable to solve after 4 attempts.
  • Problem 6 — Unable to solve after 4 attempts.

I was curious to hear your thoughts on this model: is this something that could affect contests in the near future?



»
8 hours ago, # |

I notice that the problems it could solve from your data set are a couple of years old, meaning there's a good chance the model has those problems in its training data.

Testing it on some newer problems (single-shot):

  • Photoshoot For Gorillas (o1 solvable): 302277443 | (AC) (1400)

  • Paint a Strip (not o1 solvable): 302278718 | (WA on test 1) (1200): Funnily enough, it admitted in its CoT that it couldn't figure it out.

  • Penchick and Desert Rabbit (idk if o1 can solve it): 302279772 | (WA on test 1) (1700): Looking at its CoT, it was confident in its answer this time.

It looks impressive, and it's nice to see the "behind the scenes" of the CoT, but it also has the same flaws as o1, and since o1 is already readily available (paid through ChatGPT, free but janky through GitHub Models), I'm not sure this will affect the cheating epidemic to a large degree. It's nice that it's more openly available, though, and the ease of access could make it easier for problem setters to test their problems with.

An additional note: for Paint a Strip and Penchick and Desert Rabbit, it took 4 to 5 minutes to respond.

EDIT: Tested it on 2 problems from the USACO December 2024 contest; it could solve It's Mooing Time from Bronze, but not Cake Game from Silver (which was simple enough to be a Bronze problem).

»
6 hours ago, # |

It outputs much shorter code than o1.

»
5 hours ago, # |

It couldn't solve this (it gave a brute-force solution).