Zhtluo's blog

By Zhtluo, history, 3 months ago, In English

For years, I have been dissuading people from using random vector in C++ unless it is necessary (like in adjacency lists, where otherwise it is very annoying to implement). The most common response I get is

BuT iT'S sO eAsY tO uSe!

Recently I finally had a prime example of why all these syntactic sugar will give you a sweet tooth. Witness Collect the Coins, a problem from the recent UCup contest.

The code we had in contest got TLE'ed:

Code

However, just doing these three things is enough for AC:

  • Move all data structure variables to global;
  • Move all closures to functions;
  • Last but not the least, stop using random vectors.
Code

Surely everyone will stop using random vectors instead of...

BuT mY tEaMmAtE wIlL hElP mE wItH tHe CoNsTaNtS!

Ugh for the love of...

Full text and comments »

  • Vote: I like it
  • -86
  • Vote: I do not like it

By Zhtluo, history, 3 months ago, In English

One of the most difficult problems in NAQ 2024 is Balatro with a recorded 5 solves. I attempted and failed to solve the problem during the contest, and found that my solution differed quite a bit from the intended solution (and also got TLE'ed). I spent quite a bit of my time just trying to speed up my solution, which I found to be interesting.

https://zhtluo.com/cp/constant-optimization-and-cheesing-through-naq.html

Full text and comments »

  • Vote: I like it
  • +24
  • Vote: I do not like it

By Zhtluo, history, 8 months ago, In English

Screenshot

It turns out that randomly guessing works in Div. 1 too...

I have reached GM thanks to randomly guessing. Ask me anything!

Since I have already argued in theory how randomly guessing helps CF rating, I figured that I might as well compile a list of problems I guessed on my way to GM, to give some empirical evidence to all esteemed grandmasters before my rating falls off. So let us get started!

1975D — Paint the Tree

Just solve the problem manually and claim it is the best solution. Surely it is.

Spoiler

1967C — Fenwick Tree

Have you ever played the game 'find next number in the sequence?'

Spoiler

1965C — Folding Strip

- Just fold whenever you can!

- Wait. What if I get stuck somewhere...

- Well... You can always randomly guess that you will never get stuck! Surely that works. Source: Trust me bro

Spoiler

1951

By now you have seen many problems that can be randomly guessed. But have you seen an entire contest that can be randomly guessed?

Spoiler

I will probably add more to the list...

Full text and comments »

  • Vote: I like it
  • +163
  • Vote: I do not like it

By Zhtluo, history, 11 months ago, In English

I hope my previous blog has convinced you that the best way to improve at Codeforces is to be more Russian, i.e. to improve your math capability. Unfortunately, humble mortals such as you and I are not gifted with the talent that esteemed Russian grandmasters such as 74TrAkToR had:

Surely, it is beneficial to have a code reference for many algorithms and data structures, but I also think that just superficially knowing the algorithm and maybe having implemented it once or twice before is sufficient to use it without a reference?

— Some other Codeforces grandmaster

Therefore, in this blog I will explore the dark side of Russian-ness — randomly guessing — that is forsaken by every Russian grandmaster of the light side I know. However, it has been very helpful to me, and I hope that it serves you well, too.

Example 1

I will start by using an example to demonstrate my thought process. This problem 1923C was from a recent Educational Round.

Intuition

To solve the problem, I first read the statement. My first impression is that the statement kind of asks if you can fudge a subarray a little bit. So it's intuitive that I first imagine some sort of subarray (represented by a plot) in my head.

1

I then check the conditions — I have to change every element a little bit while leaving the sum intact. Intuitively it is pretty easy to change every element a little bit (up 1 or down 1) without affecting the sum too much.

2

Then it seems very hard to reason from here, so I make a guess: Every subarray is fudgeable.

Now I try to find a counterexample to my guess. There are plenty of them, but intuitively bad things happen when you have a bunch of $$$1$$$ s since you cannot lower them down.

3

It would then seem that we need to consider the potential other elements can be lowered down and the necessity of $$$1$$$ s being raised up.

4

Now I make another guess: As long as there is enough potential to lower down things than the number of $$$1$$$ s, the answer is yes.

I then try to find another counterexample. I am not able to find any, so I believe that this is correct (which as you will see is actually not).

Now I need to figure out how to compute the number of $$$1$$$ s and the potential of every subarray. I am Chinese enough, so I dismiss both as trivially doable by some technique.

Now the problem looks solved, so I move to formalization.

Formalization

In this step I have to actually write down math. Obviously the necessity of raising is dictated by the number of $$$1$$$ s, and trivially maintained by some prefix sum cnt[] on counting $$$1$$$ s. The potential seems to be the sum minus the length, and also trivially maintained by prefix sum sum[].

Now the problem seems to be codeable, so I move to implementation.

Implementation

This step is mostly just coding and debugging. I first code this:

Code

Then it WAs on the sample... Fortunately by looking at the sample I see that I missed a trivial edge case where the length of the subarray is $$$1$$$ . This is quickly fixable:

      if (s - (r - l + 1) >= c && (r - l + 1) > 1)

Then it ACs. The whole process takes no more than 15 minutes.

Review

So what happened in the example? In this blog I will mainly talk about the intuition process. As you can see from the example above, I start my intuition by imagining the problem in my head with something I am familiar with (a plot). Because this all happened in my head, you will see that there is almost no formula in my head. Or, as I would like to put it,

Formalization is the death of intuition; don't use formula in the intuition part if you can.

Then, there are roughly three methods I can use to solve the problem.

  1. Dismiss. I can say I know how to do this with some technique (i.e. Chinese-ness) and throw it out.

  2. Reason. I can take a logical step forward, relatively confident that I am correct.

  3. Guess. I can randomly guess the most convenient thing that helps solve the problem. This is followed by trying to find a counterexample in some amount of time. If none is found, I believe it is correct.

Now I will demonstrate this process on another problem: 1923D, this time without pictures.

Example 2

Now this problem is about slimes, so it makes sense to imagine a column of balls in my head.

Intuition

Now I first reason that if some slime is eaten, it must be eaten either by a left giant ball or a right giant ball. Since it is symmetric I will consider the left case.

Now I reason that the left giant ball surely is made by some interval of slimes, whose sum is larger than this slime.

Then I get stuck, so I guess that any interval greater than this slime is plausible.

I try to find a counterexample. I end up finding one, which happens when everyone is the same size so no one can eat anyone else.

I then guess that that any interval greater than this slime and not all the same is plausible.

I try to find a counterexample. I cannot find anyone, so I believe it is true.

Now for some slime I need to know the shortest left interval that is not all the same and has sum greater than this slime. I dismiss that both are trivially binary-searchable. Now the problem looks solved to me.

Formalization

I mainly need to figure out how to test if an interval is of the same number. There are multiple ways to do it, but intuitively we can run a prefix sum on A[i] == A[i - 1]. Now the problem looks codeable to me.

Implementation

Now just start coding...

Code

Apparently this code got WA on test 2. Debugging this is a pretty interesting American task, but I will skip this part (as it is unrelated to intuition) and just say that the issue is that in binary search does not adequately cover the case where the neighboring one is directly larger.

      if ((i - 1 >= 0 && A[i - 1] > A[i]) || (i + 1 < N && A[i + 1] > A[i])) {
        ans1 = ans2 = 1;
      }

Fixing this is enough to AC.

The Elephant in the Room — Why You Should Guess without Proving

Now that all esteemed Russian grandmasters should have already got very angry, claimed that this is 'too young, too simple,' downvoted this blog, and left, I am going to address the most important issue: why you should learn to guess without proving anything.

A Counterexample: Why Proving is Bad

Let us use the previous slime example. Say we want to actually prove the guess. How should we start?

We can try to construct a eating sequence on the interval. To do that, we need to guess that the some maximum slime with size $$$m$$$ in the interval can eat everyone else. But if two neighbouring slimes are both maximum, then they cannot eat each other. So we need to guess that:

If the slime interval with $$$m$$$ as the maximum element has at least two distinct elements, then there exists some neighbouring slime pairs $$$(a_i,a_{i+1})$$$, such that one of the slime is $$$m$$$ and the other is not $$$m$$$.

It is not very obvious how to prove this. So we need to flip the statement to its contrapositive:

If for the slime interval with $$$m$$$ as the maximum element, there does not exist any neighbouring slime pairs $$$(a_i,a_{i+1})$$$ such that one of the slime is $$$m$$$ and the other is not $$$m$$$, then the interval has only one distinct element.

We can then use induction to prove this statement.

For an interval of length $$$1$$$, this is obviously true.

Assume the statement is true for any interval of length $$$n$$$. For an interval of length $$$n+1$$$, observe that for the prefix of $$$n$$$ elements, there is only one distinct element $$$m$$$. Therefore, since there does not exist any special slime pair, the last element must also be $$$m$$$.

This concludes the induction proof.

If you followed the proof, you should notice that:

  • We spent a lot of effort on guessing things not related to implementation at all.
  • We used proof techniques such as induction which is totally unnecessary if we just want to solve this problem.

I would also assert that it probably takes more effort to think and write this proof than to just code the solution out, which motivates me to develop a theory on guessing.

Theory on Guessing

I begin by claiming that proving is very inefficient:

(Thesis of Proof-by-AC) Consider that for some problem you have a solution that takes $$$a$$$ time to code with probability $$$p$$$ of being correct. Assume you can find a proof for this solution in $$$b$$$ time, you should only do so if $$$b<\frac{1-p}{p}a$$$.

Unfortunately, I cannot say I guessed this thesis and actually have to prove it. So here is the proof.

Let us consider the expected efficiency of problem solving, defined as (expected problems solved) / (time taken). Observe that we want to maximize this efficiency.

If we code this problem, then our expected efficiency is $$$\frac{p}{a}$$$.

If we try to prove it, the best case is that after $$$b$$$ time we have an absolutely correct solution (since finding out our solution is wrong only lowers the efficiency). In this case, our expected efficiency is $$$\frac{1}{a+b}$$$.

A trivial computation gives the thesis, which is left for the Russian grandmasters who have not left yet as a practice.

We can plug in some number to demonstrate the inefficiency of proving. If you have some solution that works 50% of the time, according to the thesis you should only prove it if proving it takes less time than coding it. Now since you have to guess it rather than reason it, chances are you are not going to figure out the proof in $$$a$$$ time, so you probably should just go coding.

I think the main takeaway for the thesis of Proof-by-AC is that we don't need our solution to be correct; we only need a probabilistically correct solution to increase our efficiency (and subsequently rating). We will see in the following sections that we can loosen this statement even further.

Some Theory on Reasoning

Apparently in philosophy some people have already discussed about reasoning vs. guessing. In their theory, logic is categorized into three parts.

  1. Deduction. This is the most rigorous form of logic, the one you will encounter the most in model solutions.

  2. Induction. This means to make a conclusion based on a body of observations. In CP this means to brute-force small cases and do some pattern recognition. Fortunately, esteemed Russian grandmasters (maybe with the exception of 74TrAkToR) does not like this line of reasoning too much, so you will not face a lot of brute-force pattern-finding problems. Just so you know, ChatGPT is also pretty good at writing a brute-force.

  3. Abduction, a.k.a. randomly guessing. This is the randomly guessing we are looking at. It means to seek the simplest and most likely conclusion from a set of observations.

So what I am going to show you next is that abduction is not that bad, especially in a Codeforces setting.

Some Learning Theory

Fortunately, in machine learning people are already dealing with estimating the correctness of a model in real life when it is correct on all training data, which is exactly the same as we are trying to estimate the correctness of a guess in system tests when it is correct on all examples we can think of. Allow me to introduce the concept of PAC learning:

(Claim of PAC Learning, from UPenn): The probability that there exists a hypothesis $$$h \in H$$$ that is consistent with $$$m$$$ examples and satisfies $$$\textrm{Error}(h) > \epsilon$$$ is less than $$$|H|(1 − \epsilon)^m$$$.

Or, in layman's words:

(Thesis of PAC Abduction): The simpler the hypothesis, the more examples we find, the more likely the hypothesis is correct on most tests.

Now we can assume that we draw examples from a somewhat similar distribution as test cases are generated. We can also assume that any problem on Codeforces has no more than 1000 test cases. This seems to imply that as long as we make sure that $$$\textrm{Error}(h) < \frac{1}{1000}$$$ we have a pretty good chance of passing all test cases (in fact, more than $$$\frac{1}{e}$$$). In other words, we only need to make sure that our guess is probabilistically approximately correct (i.e. PAC).

Example 3

I want to add a final example on 1930C, a funny problem I did recently, to demonstrate the full might of PAC abduction, a.k.a. randomly guessing.

Naively if we never take out the same number, we should just greedily take numbers on a right-to-left order, but what happens when we take out numbers that are the same?

Me, after looking through the samples, guessed: Surely we can just get that number minus 1.

My code:

Code

The funny part comes after the contest.

Another Russian Master: I don't understand why your C works.
Me: IDK too.
Another Russian Master: What?
Me: What?

Common Pitfalls

There are some common pitfalls when using the way of thinking. I will list a few that I encountered.

  • Dismissing too quickly. When you dismiss a thing, you have to actually figure out how to do it when you start formalizing. Sometimes I dismiss things too fast and find that I don't actually know how to do it. Fortunately this usually just results in more thinking, but it is important to be aware of your Chinese skill level.

  • Not knowing what to guess. This happens usually because you are not familiar with common things that you can guess. One way to deal with it is to always guess the most convenient thing (like the three examples above). The other way is that when you are reading the tutorial, ask yourself: what guess can I come up with that solves the problem? Or rather, as I would put it: In an upsolve, ask yourself how you could have solve the problem in the contest. If you cannot see a plausible way to solve the problem in contest then you cannot solve it in contest.

  • Guessed a wrong thing and didn't find the counterexample. This is the price we pay for guessing, though you should ask yourself if the counterexample is trivial to find. If it is something easy (e.g. all elements being the same), then you should try to think of more examples in the upcoming contests.

FAQ

  • I guessed the solution and everyone around me called me a cheater!
    The dark side of the Russian-ness is a pathway to many abilities some consider to be unnatural.

  • But guessing randomly is not fun!
    I don't think it's fun, either. Unfortunately, given that Codeforces only consists of problems with high Russian-ness, guessing is the fastest way to improve in my opinion. Maybe go ask Mike to host problems of more Chinese-ness or American-ness so that we aren't forced to guess through the contest.

I will end the blog with the code of randomly guessing:

Proof is a lie; there is only AC.
Through guessing, I gain code.
Through code, I gain AC.
Through AC, my ratings are broken.
The Russian-ness shall free me.

— Darth Luo, probably

Experiment

Question: Is communism science or art?
Answer: Of course it is art. If it was science, they would have experimented on rats first!

— Some Soviet grandmaster, probably

I am actually curious how much randomly guessing can help people in Div. 3 and Div. 2. If you are convinced by me, maybe you can try it out on some problems around your rating and tell me your experience in the comments! This will help me to understand its efficacy. Thanks!

Full text and comments »

  • Vote: I like it
  • +431
  • Vote: I do not like it

By Zhtluo, history, 11 months ago, In English

If you are triggered by this clickbaity blog title, you are probably interested in improving your Codeforces skills. Now, I will share my point of view on the differences between Codeforces and ICPC contests, and how you can, in my humble opinion, maximize your Codeforces rating gain.

I mostly consider that all conceivable competitive programming problems need three aspects of skill:

  1. Observation — the ability to understand the problem and come up with non-trivial properties.

  2. Technique — the ability to apply a well-known algorithm or data structure to the problem.

  3. Implementation — the ability to code fast and debug fast.

Collaterally, I refer to these three skills as Russian-ness, Chinese-ness and American-ness, respectively, for reasons you will soon see below.

Observation (Russian-ness)

Observation means that you stare at some problem for a sufficient amount of time and you are able to reduce it to some easier problem. One good problem that I think is interesting is this 1889D. Despite being rated 3000, the problem is solvable in 30 lines of trivial code. Of course, there are also bad problems, such as the infamous 1916D and 1916H1, but the gist is that the problem needs good mathematical thinking and nothing else.

I consider this aspect of difficulty to be Russian because this type of problem happens more often in Russian ICPC-style contests, specifically OpenCup and ICPC camp, etc. They are also single-handedly the most popular problem style on Codeforces and AtCoder.

Now the interesting part about it is that since the skill level correlates to math skill and nothing else, you can literally do IMO problems instead of Codeforces problems. For instance, if you take this problem list from IMO2022 and look at the combinatorics problems in it, you will see that they translate nicely to Codeforces Div. 1 problems. There is a saying that math majors have an advantage over CS majors in Codeforces, and I think it is justified to some extent.

Technique (Chinese-ness)

Technique is the more common thing you will learn at any competitive programming course, such as binary exponentiation, segment tree, etc. Obviously, it is not fully possible to distinguish techniques from observation, but I draw the line as to whether it is possible to produce a reference code for the thing in question. So dynamic programming and greedy are more Russian, and Fenwick tree and segment tree are more Chinese.

Now I call these problems Chinese because Chinese OI and ICPC have a tendency to produce problems that are solvable only by advanced data structures that are only possible if you have a reference code. For instance, 104857C is solvable only with a palindrome tree, and 104857A is solvable only if you took some graduate course.

I guess this is the most learn-able part of all, and what most learners of competitive programming will engage with. Now we get the intuition why it is so hard to improve at Codeforces: you are trying so hard to become Chinese instead of becoming Russian!

Implementation (American-ness)

Implementation mostly refers to problems that take a lot of time to implement, such as geometry problems that are very popular in US contests, such as 104757C from the recent ECNA. I mostly consider that there are two difficulties in implementation. One is that the code is just long but is not particularly tricky, such as the famous Rubik's cube; the other is that the code consists of many cases and is hard to get right, such as 104869I.

Now I call these problems American because geo problems usually require a code reference and are pretty popular in US regionals.

Summary

Now, where does Codeforces stand in all of this? I will make one bold claim here: for Codeforces now, you only need Leetcode medium + high school math contest to get to blue. I will try to argue for it in these two claims.

Claim 1: You only need to finish Div. 2 A, B, C in one hour to get to blue.

This is trivially true by looking at past contests.

Claim 2: Div. 2 A, B, and C require nothing but basic programming skills and good math skills.

This is more debatable, but I think none of them require any noticeable techniques other that mathematical analysis. Therefore, being good at middle and high school math contests suffices.

Now we can answer the golden question of all time.

How do I improve at Codeforces (for now)?

A lot of people in Codeforces Catalog will just say that you should do more problems. That is obviously true, but it reminds me of this interesting joke:

Why learn swimming? Just jump into the river. It works for everyone who lives after that!

I think, ultimately, this line of thought just loses people out of competitive programming for thinking that they are not clever enough. I hope this blog convinces you that Codeforces for people below Div. 1 has nothing to do with programming but everything to do with math. Therefore, to study for Codeforces you should not study programming but instead study for math contests!

Why haven't I become better at Codeforces after reading through cp-algorithms?

Because Codeforces has nothing to do with competitive programming algorithms for people below Div. 1. Just study math problems instead!

Should Codeforces be so 'Russian'?

Now, obviously, it's not for my humble mortal mind to ever contemplate the masterful design of Russian grandmasters such as 74TrAkToR. But I think, in general, when a problem that has a higher Chinese difficulty is proposed, the response is

This problem seems too standard (translate: is not Russian enough). It would be a good one for an educational round, but not a regular one.

As for American-ness, well...

When was the last time you saw a geo problem in a Codeforces contest?

Jokes aside, it is beyond me to say what should be more important in competitive programming contests, but I do think the community should reflect on this interesting difference between Codeforces and ICPC.

Alright, now, before I got nuked by the three largest countries of the world, canceled by the greatest grandmasters on Codeforces, and downvoted to oblivion, I think it is high time that I bail out. I will end with this sentence.

May the Russian-ness be with you.

Full text and comments »

  • Vote: I like it
  • +772
  • Vote: I do not like it

By Zhtluo, history, 11 months ago, In English

I was trying to put this problem (https://codeforces.net/gym/104976/problem/A) into a mash-up and then override the statement. Unfortunately, Codeforces cannot render the override correctly and produces

No
(begin{example}) or
() sections or more the one such section(s).

instead.

I assume the problem is that the statement is a PDF..

Full text and comments »

  • Vote: I like it
  • +8
  • Vote: I do not like it

By Zhtluo, history, 13 months ago, In English

So I was studying this infamous ICPC ECNA 2023 geo problem (https://codeforces.net/gym/104757/problem/C)...

I tried to read the judge's solutions here (https://raw.githubusercontent.com/icpc/na-ecna-archive/main/2023-2024/convexhullextension.zip) from this site (https://na.icpc.global/2022-23/regionals/ecna/ecna-archive-2022-23/) but they don't really make sense to me.

So I came up with this test case:

5
-999 999
-998 -998
999 -999
998 999
0 1000

And I tried to run it on the solutions. It turns out that the solutions don't agree with each other...

So now I am stumped. What is the correct solution for this problem? Does it even exist?

Full text and comments »

  • Vote: I like it
  • +182
  • Vote: I do not like it

By Zhtluo, history, 13 months ago, In English

Maybe next time we should not make ChatGPT-able problems or OEIS-able problems. :)

Full text and comments »

  • Vote: I like it
  • +364
  • Vote: I do not like it

By Zhtluo, history, 13 months ago, In English

So apparently after ML problems we are getting crypto problems in ICPC...

Link: https://zhtluo.com/cp/lll-yet-another-paper-reading-problem.html

Full text and comments »

  • Vote: I like it
  • +7
  • Vote: I do not like it

By Zhtluo, history, 14 months ago, In English

In hindsight, I probably should not have spent this much time on a wrong algorithm.

Allow me to present Constant Optimization on Binary Exponentiation and Matrix Multiplication.

Full text and comments »

  • Vote: I like it
  • +94
  • Vote: I do not like it

By Zhtluo, history, 18 months ago, In English

Consider this famous necklace counting problem:

Assume a necklace can be made with $$$n \ (1\le n\le 10^5) $$$ beads. Each bead can be painted with one of the $$$ k\ (1\le k\le 10^5) $$$ colors. We consider two necklaces the same if after rotating and flipping they look identical. How many different necklaces are there in total?

The solution to this problem is often represented in terms of group theory jargon. In this post we will try to explain and demystify the process.

Just wrote a short tutorial on Polya's Theorem as I tried to figure out how to explain it to people without background on group theory. I am not exactly a mathematician so there might be errors and inaccuracies. Any feedback and suggestion would be greatly appreciated. Enjoy :)

https://zhtluo.com/cp/from-burnside-to-polya-a-short-introduction-to-group-theory.html

Full text and comments »

  • Vote: I like it
  • +39
  • Vote: I do not like it

By Zhtluo, history, 2 years ago, In English

Are you scared when you hear about all these pesky data structures like Fenwick Tree, RMQ, Segment Tree in competitive programming?

Are you afraid of writing code to solve problems like finding the minimum, maximum, or sum of some range query, especially when there are updates to the data?

Well, fear no more! In this tutorial I will introduce the Swiss knife of all sequence manipulation data structure, one code that can (theoretically) solve every problem of this kind, one tree to rule them all — the Splay Tree!

So this is a tutorial I wrote on the Splay Tree:

https://zhtluo.com/cp/splay-tree-one-tree-to-rule-them-all.html

Feedback and additional practice problems are welcomed. Enjoy :)

Full text and comments »

  • Vote: I like it
  • +251
  • Vote: I do not like it