Pa_sha's blog

By Pa_sha, 4 weeks ago

Motivation

I saw that many beginners don't really understand how to compute complexity and make lots of mistakes in their calculations. Even more skilled programmers can make mistakes in this topic. A more intuitive way to understand time and memory complexity will be the main focus of this blog. There will also be a more formal treatment, but in the spoiler.

What is complexity

In simple words, complexity is just the number of operations that a program makes. For example, if you have a loop from 1 to 1000 with step 1, it makes 1000 operations, and if on each iteration you also make one addition and one multiplication, it is already 3000 operations (since you make these two operations 1000 times). Memory works the same way: if you make an array of 100 elements, it has memory complexity 100.

But in practice it is hard to count operations exactly, and it is also not really needed, since a computer performs 1 operation really fast, and the same goes for 10 or 1000 operations. So when we calculate complexities, we do not look at constants: n operations (where n is some variable with a numeric value) is the same as 2·n operations, and n+100 operations is the same as n operations. It is like saying that after 1647859 operations we made about $$$10^6$$$ operations, which is much easier to work with. This makes complexities easier to calculate, since you do not need to account for every single operation. Also, since programs use variables which have some range, we will describe complexities in terms of variables as well.
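To make the counting concrete, here is a small C++ snippet of such a loop (an illustration, not from the original blog); the rough operation counts are in the comments.

```cpp
#include <cstdio>

int main() {
    long long s = 0;
    for (int i = 1; i <= 1000; i++) { // the loop itself: ~1000 operations
        int t = i * i;                // one multiplication per iteration
        s += t;                       // one addition per iteration
    }                                 // in total: about 3000 operations
    printf("%lld\n", s);
}
```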

Types of complexities

  • $$$O$$$. This notation is used when we want to say that our program makes at most this number of operations. It means that if your program has one loop which makes n operations, it has complexity $$$O(n)$$$, but it also has complexity $$$O(n^2)$$$. So, if the real number of operations the program makes is a and the number of operations the complexity says is b, then $$$a\le b$$$.

  • $$$o$$$. This notation is used when we want to say that our program makes fewer operations than this complexity. It means that if your program has one loop which makes n operations, it can have complexity $$$o(n^2)$$$, but it can also have complexity $$$o(n^9)$$$. So, if the real number of operations the program makes is a and the number of operations the complexity says is b, then $$$a<b$$$.

  • $$$\Theta$$$. This notation is used when we want to say that our program makes the same number of operations as this complexity. It means that if your program has one loop which makes n operations, it has complexity $$$\Theta (n)$$$. So, if the real number of operations the program makes is a and the number of operations the complexity says is b, then $$$a=b$$$.

  • $$$\Omega$$$. This notation is used when we want to say that our program makes at least this number of operations. It means that if your program has one loop which makes n operations, it has complexity $$$\Omega (n)$$$, and it is also $$$\Omega (1)$$$. So, if the real number of operations the program makes is a and the number of operations the complexity says is b, then $$$a\ge b$$$.

  • $$$\omega$$$. This notation is used when we want to say that our program makes more operations than this complexity. It means that if your program has one loop which makes n operations, it has complexity $$$\omega (1)$$$, and also $$$\omega (log(n))$$$, but not $$$\omega (n)$$$. So, if the real number of operations the program makes is a and the number of operations the complexity says is b, then $$$a>b$$$.

Formal definition (not needed for understanding)

In the rest of the blog I will use only big O notation, since it is, in my opinion, the most important and popular one.

Evaluating complexities

Here are some of the rules for evaluating complexities:

  • $$$O(C)=O(1)$$$ for any constant C.

  • $$$O(x)+O(C)=O(x)$$$ for constant C and variable x.

  • $$$O(x)+O(y)=O(x+y)=O(max(x,y))$$$ for variables x and y.

  • $$$O(x)\cdot O(y)=O(x\cdot y)$$$ for variables x and y.

  • $$$O(x)+O(x)=O(x)$$$ for variable x.

Example 1
Example 2
Example 3

As we can see, computing a complexity is much like solving a math inequality. That is why it helps to memorize some special cases, like the harmonic series, so that we do not draw wrong conclusions about the complexity.
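To illustrate one such case (a classic pattern, not one of the spoiler examples above): the snippet below looks quadratic at first glance, but the inner loop runs about $$$\frac{n}{i}$$$ times, so the total is $$$n\cdot (\frac{1}{1}+\frac{1}{2}+\dots+\frac{1}{n})=O(n\cdot log(n))$$$.

```cpp
#include <cstdio>

int main() {
    int n = 1000000;
    long long cnt = 0;
    for (int i = 1; i <= n; i++)
        for (int j = i; j <= n; j += i) // about n/i iterations
            cnt++;
    printf("%lld\n", cnt); // roughly n * ln(n), far from n^2
}
```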

More advanced examples

There is a popular technique called divide and conquer, and it can be hard to analyze the complexity of code that uses it. For example,

Code 1

One can note that it is just the binary search algorithm. But if we did not have the if condition and instead went recursively into both find calls, it might seem that the algorithm would have a really big complexity, while in fact it would still work in $$$O(n)$$$. To analyze such code, we need additional notation: $$$T(n)$$$. It is really similar to $$$O(n)$$$, and in fact it is the same thing; it exists only to avoid confusion when solving recursive time complexities. For example, we can say that the first call of find (from main) costs $$$T(n)$$$, since we are looking for a value among $$$n$$$ values. Then we halve the number of values, spending just one additional operation to know which half to go into. So, $$$T(n)=T(\frac{n}{2})+c$$$. Here, c is some constant; we will use c rather than $$$O(1)$$$, and we will see why in a second. We can try to solve this just by unrolling the recursion: if $$$T(n)=T(\frac{n}{2})+c$$$, then $$$T(\frac{n}{2})=T(\frac{n}{4})+c$$$, so $$$T(n)=T(\frac{n}{4})+2\cdot c$$$. Continuing, we can perform such a step $$$log(n)$$$ times, so $$$T(n)=T(\frac{n}{2^{log(n)}})+log(n)\cdot c$$$, and since $$$T(1)=1$$$, we get $$$T(n)=1+log(n)\cdot c=O(log(n))$$$. Here is why we used a constant and not $$$O(1)$$$: when we unroll the recursion we get terms like $$$2\cdot c$$$, and writing $$$O(1)$$$ for each of them would be the same mistake as in the third example of the previous section.
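For reference, here is a minimal sketch of such a recursive binary search (a hypothetical reconstruction; the actual code in the spoiler may differ):

```cpp
#include <vector>
using namespace std;

// Looks for x in the sorted range a[l..r); returns its index or -1.
int find(const vector<int>& a, int l, int r, int x) {
    if (r - l <= 0) return -1;                  // empty range: base case
    int m = l + (r - l) / 2;                    // constant extra work: the "+ c"
    if (a[m] == x) return m;
    if (a[m] < x) return find(a, m + 1, r, x);  // recurse into one half: T(n/2)
    return find(a, l, m, x);
}
```

Now, let's look at this code: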

Code 2

One can see that it just finds the sum of the whole array, so it works linearly, i.e. in $$$O(n)$$$, but let's derive it. Here $$$T(n)=2\cdot T(\frac{n}{2})+c$$$, since we again make a constant number of operations but go into recursion two times, not once as in the previous example. Also note that $$$T(1)=1$$$, since if there is only 1 element we just return it. So, if we unroll the recursion, $$$T(n)=2\cdot T(\frac{n}{2})+c=4\cdot T(\frac{n}{4})+3\cdot c=\dots=2^x\cdot T(\frac{n}{2^x})+(2^x-1)\cdot c$$$. We end our recursion when $$$n=1$$$, i.e. $$$x=log(n)$$$, so $$$T(n)=n+(n-1)\cdot c=O(n)$$$. So, it works in linear time. We can also say that each element was checked exactly once, and since the number of elements is linear, the code also works in linear time. This intuition is good enough, but sometimes it can mislead.
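Again, a minimal sketch of the kind of divide-and-conquer sum being analyzed (a hypothetical reconstruction; the spoiler's code may differ):

```cpp
#include <vector>
using namespace std;

// Sums the range a[l..r) by splitting it in half: T(n) = 2*T(n/2) + c.
long long sum(const vector<int>& a, int l, int r) {
    if (r - l == 1) return a[l];        // single element: T(1) = 1
    int m = l + (r - l) / 2;            // constant work: the "+ c"
    return sum(a, l, m) + sum(a, m, r); // two recursive calls on halves
}
```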

Here are some tasks for you if you want to practice with it:

Spoiler

Also, there exists the Master theorem, which basically solves some instances of this problem. It does not cover all recurrences, but in practice it is enough. There is also a generalization of it which is almost useless in practice and hard to understand, but cool in some sense. So, here is the Master theorem. It solves all recurrences which look like this: $$$T(n)=a\cdot T(\frac{n}{b})+c\cdot n^d$$$ for any a, b, d. Let's denote $$$p=\frac{a}{b^d}$$$. Then there are three cases:

  • $$$p<1$$$: $$$T(n)=O(n^d)$$$.

  • $$$p=1$$$: $$$T(n)=O(n^d\cdot log(n))$$$.

  • $$$p>1$$$: $$$T(n)=O(n^{log_{b}a})$$$.
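For example (a standard illustration, not from the blog's spoilers): the merge sort recurrence $$$T(n)=2\cdot T(\frac{n}{2})+c\cdot n$$$ has $$$a=2$$$, $$$b=2$$$, $$$d=1$$$, so $$$p=\frac{2}{2^1}=1$$$ and the second case gives $$$T(n)=O(n\cdot log(n))$$$. And Code 2 above has $$$a=2$$$, $$$b=2$$$, $$$d=0$$$, so $$$p=2>1$$$ and the third case gives $$$T(n)=O(n^{log_2 2})=O(n)$$$, matching what we derived by hand.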

Other cases

So far we have talked only about deterministic algorithms, while in CP it is normal to use randomness or tricks that make the time complexity hard to evaluate. For example, what if we write binary search, but choose a random element instead of the middle one?

Code 1

Here talking about big O notation is not quite correct, since the number of operations depends on the randomness. It is true that we can say the code works in $$$O(n)$$$, and we would be right, since $$$log(n)\in O(n)$$$ and $$$n\in O(n)$$$, for example. But in cases when it is optimal to use randomness, we need it to be faster than the trivial bound, and the running time then depends on the random choices rather than on the input alone, so it is better to look at the expected value. Let's denote by $$$T(n)$$$ the complexity on $$$n$$$ elements. Then $$$T(n)=\frac{1}{n}(T(1)+T(2)+\dots+T(n))+1$$$, since the probability of choosing the point $$$l+i$$$, and thereby cutting the segment to length i, is $$$\frac{1}{n}$$$. So, $$$n\cdot T(n)=\sum_{i=1}^n T(i)+n$$$. Also, $$$(n-1)\cdot T(n-1)=\sum_{i=1}^{n-1}T(i)+n-1$$$. Hence, $$$n\cdot T(n)-(n-1)\cdot T(n-1)=T(n)+1$$$, which implies $$$T(n)=T(n-1)+\frac{1}{n-1}=T(n-2)+\frac{1}{n-2}+\frac{1}{n-1}=\dots=T(1)+\sum_{i=1}^{n-1}\frac{1}{i}$$$, which is the harmonic series. So, this algorithm works in $$$O(log(n))$$$ on average, which means that in general it will behave like this, but sometimes, when you are unlucky, it can work longer, and when you are lucky, faster. In practice, when randomness is used in the right way, the code works with the complexity of the expected value.
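For concreteness, a sketch of the randomized variant being analyzed (hypothetical; the spoiler's code may differ):

```cpp
#include <cstdlib>
#include <vector>
using namespace std;

// Binary search on sorted a[l..r) for x, but with a random pivot
// instead of the middle element; the expected depth is O(log n).
int find_random(const vector<int>& a, int l, int r, int x) {
    if (r - l <= 0) return -1;
    int m = l + rand() % (r - l);                      // random pivot in [l, r)
    if (a[m] == x) return m;
    if (a[m] < x) return find_random(a, m + 1, r, x);
    return find_random(a, l, m, x);
}
```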

How to use it

Some GMs can tell how long a code will run on Codeforces just by looking at it, sometimes within a 50 ms error, which is already insane. But how can it be done? Assume that we have a code with complexity $$$O(n)$$$ and $$$n$$$ up to $$$10^7$$$; you can say that it will work in about 1 second. In fact, it can be 2 seconds if the constants are really big, or less than a second in the opposite case, but it is usually enough to assume 1 second. It also depends on the language you are writing in: for example, code in Python will be much slower than in C. But this only partially explains how some GM can estimate the time with such good precision. In fact, the time depends not only on the number of iterations. It is known that + works faster than modulo or / operations, and + on one computer can work faster than on another: an old computer will take much more time than a computer made yesterday. So, it really depends on the machine, on how the code is compiled, and on the code itself. We cannot account for all of this, because it is too much calculation, but we can approximate it intuitively, and the more you practice, the better you get at it. Still, it should be obvious that $$$O(\sqrt{n})$$$ will take longer than 1 second for $$$n$$$ up to $$$10^{18}$$$, while $$$O(log(n))$$$ for the same n will work really fast.

Multitest

There is a tricky part about multitests. It seems like if you solve one test in $$$O(n)$$$ and there are t tests, then the solution works in $$$O(n\cdot t)$$$, which is right, but some tasks include a line like "The sum of $$$n$$$ over all test cases is $$$10^5$$$", and it is important. Assume that you solved the task in $$$O(n)$$$, n is up to $$$10^5$$$, and $$$t$$$ is also up to $$$10^5$$$. Then the $$$O(n\cdot t)$$$ solution makes up to $$$10^{10}$$$ operations, which is too slow in most cases. But if there is a line saying that the sum of all $$$n$$$ is up to $$$10^5$$$, then you can assume that your solution works in $$$O(n)$$$ overall. There can also be the case that $$$n$$$ is up to $$$10^5$$$ but the sum is up to $$$10^6$$$. Here, $$$O(n)$$$ is not quite the right way to put it: in both cases the total complexity is really $$$O(\sum_{i=1}^{t}n_i)$$$, where $$$n_i$$$ is the size of test $$$i$$$. Then, if the sum is up to $$$10^6$$$, your program makes about $$$10^6$$$ operations. So, in such tasks $$$O(n\cdot \sqrt{n})$$$ per test is not necessarily a good choice even when $$$n$$$ is up to $$$10^4$$$ but the sum is up to $$$10^6$$$ (though you can use math and get a better bound for cases like $$$O(n\cdot \sqrt{n})$$$, so it can be worth trying a solution which may get TLE).
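A minimal multitest skeleton (an illustration, not from the blog) showing why the total work is $$$O(\sum_{i=1}^{t}n_i)$$$: each test case only loops over its own $$$n_i$$$ elements.

```cpp
#include <cstdio>

int main() {
    int t;
    scanf("%d", &t);
    while (t--) {
        int n;
        scanf("%d", &n);
        long long s = 0;
        for (int i = 0; i < n; i++) { // O(n_i) work for test case i
            int x;
            scanf("%d", &x);
            s += x;
        }
        printf("%lld\n", s);
    }
}
```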



»
113 minutes ago, # |

I had a random thought about why it's easy to calculate complexity in your head:

Let's say your algorithm has 3 parts with complexities O(f(n)), O(g(n)), O(h(n)), and you know O(f(n)+g(n)+h(n)) = O(max(f(n),g(n),h(n))). So you just need to keep a running max in your head, which takes O(1) space. In your head you think: "the worst part of my algorithm so far is O(max(...))".