Bin search and relative error

Suppose we want to solve a problem by doing binary search on the answer. The answer will then be checked against the jury's answer by absolute or relative error (one of them should be smaller than ε). For simplicity we will assume that our answer is always greater than 1 and smaller than B. Because of that, the relative error is never larger than the absolute error, so we will always use relative error rather than absolute.

Suppose we have made n iterations of our binary search. What information do we have now? I claim that we know the real answer lies in some segment $[x_i, x_{i+1}]$, where $1 = x_0 < x_1 < \dots < x_i < \dots < x_{2^n} = B$. And what is great: we can choose all $x_i$ except for $x_0$ and $x_{2^n}$.

Now, for simplicity, we will also assume that we answer $x_{i+1}$ for the segment $[x_i, x_{i+1}]$ while the real answer was $x_i$; it is the worst case for us. Obviously we will not do that in real life, since any other answer would be better, but you will get the idea.

So, what is our relative error? It is $\frac{x_{i+1} - x_i}{x_i}$. The worst case for us is when the relative error is maximal. It is logical to make them all equal, which is exactly what binary search does with absolute errors: $\frac{x_{i+1} - x_i}{x_i} = C$. We can assume that $\frac{\Delta x_i}{x_i} = \Delta \ln x_i$, so $\Delta \ln x_i = C$. Now we have $\ln x_{2^n} - \ln x_0 = 2^n C$, but $x_0 = 1$ and $x_{2^n} = B$, so $C = \frac{\ln B}{2^n}$. How large should n be to get an error less than ε? $n > \log_2 \frac{\ln B}{\varepsilon}$. Much smaller than $\log_2 \frac{B}{\varepsilon}$.
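To get a feeling for the gap between the two iteration counts, here is a small numeric sketch. The function names and the sample values of B and ε are made up for illustration, and the geometric count relies on the approximation that a log-scale width of ε means a relative error of roughly ε.

```python
import math

def iterations_standard(B, eps):
    """Halvings of [1, B] needed until the absolute width drops below eps."""
    return math.ceil(math.log2((B - 1) / eps))

def iterations_geometric(B, eps):
    """Halvings of [0, ln B] (log scale) needed until the width drops below eps."""
    return math.ceil(math.log2(math.log(B) / eps))

# Sample values chosen only for illustration.
B, eps = 1e9, 1e-6
print(iterations_standard(B, eps), iterations_geometric(B, eps))
```

For these sample values the geometric variant needs about half as many iterations, and the saving grows with B.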

How to write such a binary search? We want to choose m in such a way that $\frac{m - l}{l} = \frac{r - m}{m}$, or simply $m = \sqrt{lr}$.
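A minimal sketch of such a search, assuming a monotone predicate `check(m)` that returns True exactly when m is at least the answer (the name `check`, the function name and the stopping rule are my assumptions, not something fixed by the problem), and assuming ε is well above floating point precision so the loop terminates:

```python
import math

def geometric_binary_search(check, lo, hi, eps):
    """Locate the threshold of a monotone predicate on [lo, hi], lo >= 1.

    check(m) is assumed to be True iff m >= answer (hypothetical interface).
    Stops once the worst-case relative error (hi - lo) / lo is at most eps.
    Returns the final guess and the number of iterations used.
    """
    iterations = 0
    while (hi - lo) / lo > eps:
        m = math.sqrt(lo * hi)  # geometric midpoint: m / lo == hi / m
        if check(m):
            hi = m
        else:
            lo = m
        iterations += 1
    return math.sqrt(lo * hi), iterations

# Usage with a made-up target value.
target = 123456.789
answer, used = geometric_binary_search(lambda m: m >= target, 1.0, 1e9, 1e-9)
print(answer, used)
```

If `lo * hi` could overflow for very large bounds, `math.sqrt(lo) * math.sqrt(hi)` is a possible substitute for the midpoint.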

Now I want to deal with some assumptions I made.

How to choose the answer in the end? Again, $\sqrt{lr}$ (it is basically the same as dividing the segment in binary search).

What to do if the answer can be smaller than 1? Try 1: if the answer is smaller than 1, use standard binary search (because there the absolute error is smaller than the relative one); if the answer is bigger than 1, use the binary search above.
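The case split could look roughly like this. It is only a sketch: `check(m)` is again a hypothetical monotone predicate that is True iff m is at least the answer, and the answer is assumed to lie in (0, B].

```python
import math

def hybrid_search(check, B, eps):
    """Search (0, B] with absolute error below 1 and relative error above 1.

    check(m) is assumed to be True iff m >= answer (hypothetical interface).
    """
    if check(1.0):
        # Answer is at most 1: absolute error is enough, bisect [0, 1] as usual.
        lo, hi = 0.0, 1.0
        while hi - lo > eps:
            m = (lo + hi) / 2
            if check(m):
                hi = m
            else:
                lo = m
        return (lo + hi) / 2
    # Answer is above 1: bisect [1, B] with the geometric midpoint.
    lo, hi = 1.0, B
    while (hi - lo) / lo > eps:
        m = math.sqrt(lo * hi)
        if check(m):
            hi = m
        else:
            lo = m
    return math.sqrt(lo * hi)
```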

P.S. I had never heard about this idea and came up with it while solving 744D - Hongcow Draws a Circle. I'm sorry if it is known to everyone except me.

Comments (12)

mutreta

8 years ago

+51

Even if this idea is known by all top tier competitive programmers, it's always good for the ones that are learning to read something like this.

Nice analysis! Thanks

akumar1503

I think these should be attached as special editorials for the problems, so that noobs like me can easily see which direction to go!!

Huyum_nik

-37

Let's do fewer iterations, but replace (l + r) / 2 (which is a float sum and an exponent decrement) with sqrt(l * r) (which is a float multiplication and a square root)?

Wow, such a cool optimization.

riadwaw

8 years ago

+15

Often a single iteration takes a lot of time (e.g. iterating over a long array), so one square root per iteration may be negligible.

Um_nik

+25

I'm pretty sure he understands it.

-25

If you think everybody oughta know everything you know, congrats, you're a typical snob.

-45

Is it true for small segments?

TimonKnigge

+10

We can assume that $\frac{\Delta x_i}{x_i} = \Delta \ln x_i$, so $\Delta \ln x_i = C$.

I don't really feel good about this step. The $x_i$ are discrete, so Δ is just a difference operator ($\Delta x_i = x_{i+1} - x_i$), whereas $\frac{dx}{x} = d \ln x$ comes from differentiating the continuous integral for $\ln x$. This seems like abuse of notation to me. The whole point of this proof is to determine the correct way of picking the $x_i$, so using information about them that doesn't follow from any assumptions seems wrong.

Maybe I'm oversimplifying things, but the idea is basically to find a formula for the midpoint which minimizes the worst case relative error you might get. As you already said, if you end up with interval [l, r], the worst you can do is guess r when the answer is l, i.e. the worst possible relative error is $\frac{r - l}{l}$. In other words, we want to find the m that minimizes:

$\max\left(\frac{m - l}{l}, \frac{r - m}{m}\right)$

(Since we don't know if we continue with the left interval or with the right interval, we take the maximum of the worst case relative errors of either interval.)

It's pretty clear that $\frac{m - l}{l}$ is increasing in m (it's a linear function) and that $\frac{r - m}{m}$ is a decreasing function in m (rewrite as $\frac{r}{m} - 1$), so their maximum is minimal when they are equal, i.e. we have to solve $\frac{m - l}{l} = \frac{r - m}{m}$. Multiply by ml and add ml to both sides, and you end up with m² = lr, and since we required that l ≤ m ≤ r, only $m = \sqrt{lr}$ is a valid solution (when we assume, as in the blogpost, that l, r > 1).

So clearly following the rule $m = \sqrt{lr}$ optimally minimizes the relative error after a single iteration, but of course the question is, does this hold after multiple iterations? (Or is this an incorrect greedy solution?) So, more generally, in the n-th step of the binary search we want:

$\min_{x_1, \dots, x_{2^n - 1}} \; \max_{0 \le i < 2^n} \frac{x_{i+1} - x_i}{x_i}$

(where l functions as $x_0$ and r as $x_{2^n}$)

If you only vary $x_i$ then by logic similar to the above we trivially find that $x_i = \sqrt{x_{i-1} x_{i+1}}$. It's pretty easy to work out that $x_i = l \left(\frac{r}{l}\right)^{i / 2^n}$ is a solution to this system (this is a direct formula for $x_i$ that you get by repeatedly applying the $m = \sqrt{lr}$ rule). However, I'm not really sure how to prove that this is the optimal (and only?) solution. Can a more competent mathematician finish the proof?

(pointing out that we have $2^n - 1$ equations and $2^n - 1$ unknowns (we already know $x_0$ and $x_{2^n}$: l and r) probably doesn't apply for these kinds of systems?)
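A quick numeric sanity check of the closed form, as a sketch only: the values of l, r and n below are arbitrary, and the function name is made up. It confirms that each interior point is the geometric mean of its neighbours and that all worst-case relative errors coincide, but it does not prove optimality.

```python
import math

def geometric_grid(l, r, n):
    """Points x_i = l * (r / l) ** (i / 2**n) for i = 0 .. 2**n."""
    k = 2 ** n
    return [l * (r / l) ** (i / k) for i in range(k + 1)]

xs = geometric_grid(2.0, 50.0, 3)

# Each interior point is the geometric mean of its two neighbours.
assert all(
    math.isclose(xs[i], math.sqrt(xs[i - 1] * xs[i + 1]))
    for i in range(1, len(xs) - 1)
)

# All worst-case relative errors (x_{i+1} - x_i) / x_i are equal.
errors = [(xs[i + 1] - xs[i]) / xs[i] for i in range(len(xs) - 1)]
assert all(math.isclose(e, errors[0]) for e in errors)
```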

I don't think it's abuse of notation. wiki
Let's prove that

$x_i^{p+q} = x_{i-p}^{q} \cdot x_{i+q}^{p}$

by induction on $p$ and $q$. $\text{[math]}$

aaaaajack

Add 1 to each term. Their product becomes r/l, which is a constant, so the geometric mean is trivially optimal.
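Spelled out, this argument could go as follows, writing $e_i = \frac{x_{i+1} - x_i}{x_i}$ for the i-th relative error (a symbol introduced here only for illustration):

```latex
\prod_{i=0}^{2^n - 1} (1 + e_i)
  = \prod_{i=0}^{2^n - 1} \frac{x_{i+1}}{x_i}
  = \frac{x_{2^n}}{x_0}
  = \frac{r}{l}
\quad\Longrightarrow\quad
1 + \max_i e_i \;\ge\; \left(\frac{r}{l}\right)^{1 / 2^n}
```

The largest of the $2^n$ factors is at least their geometric mean, with equality only when all factors are equal, so under this reading the equal-ratio grid attains the bound.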

Alex7

3 years ago

The real issue here (and it keeps coming up over and over in many contexts) is that optimizing with the form (delta X)/X intrinsically breaks some symmetries based on your chosen basis; while simple binary search doesn't.

The final numeric answer if resolvable will be identical, but if you're computing approximations with floating point accuracy, you might get another terminator lock down :)

GusterGoose27

2 years ago

I feel like I have something to contribute here. If our outputted answer is $x$ and the jury's answer is $y$, then the relative error we are judged on is $|1-\frac{x}{y}|$. Notice that $\ln(\frac{x}{y})$ is just $\ln(x) - \ln(y)$. Since $\ln(\frac{x}{y})$ increases as $\frac{x}{y}$ increases, trying to make $\ln(\frac{x}{y})$ as close to $0$ as possible is the same as trying to make $\frac{x}{y}$ as close to $1$ as possible. Then, if we take the natural log of both of our endpoints, we are trying to find the value of $\ln(x)$ in the range $[0, \ln(B)]$ which is as close to the value of $\ln(y)$ as possible, which we can clearly do by a standard binary search, and which gives the complexity described above.
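In code, the log-space view might look like the sketch below. `check(x)` is a hypothetical monotone predicate that is True iff x is at least the jury's answer, and the answer is assumed to lie in [1, B]. Note that exponentiating the arithmetic midpoint of the logs gives the geometric mean of the endpoints.

```python
import math

def log_space_search(check, B, eps):
    """Ordinary bisection on u = ln(x) over [0, ln B].

    check(x) is assumed to be True iff x >= answer (hypothetical interface).
    """
    lo, hi = 0.0, math.log(B)
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if check(math.exp(mid)):  # exp(mid) is the geometric mean of exp(lo), exp(hi)
            hi = mid
        else:
            lo = mid
    return math.exp((lo + hi) / 2)
```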

Um_nik's blog