How to solve the dual of this famous problem?

#	User	Rating
1	tourist	3856
2	jiangly	3747
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3591
6	gamegame	3477
7	Benq	3468
8	Radewoosh	3462
9	ecnerwala	3451
10	heuristica	3431

#	User	Contrib.
1	cry	167
2	-is-this-fft-	162
3	Dominater069	160
4	Um_nik	158
5	atcoder_official	157
6	Qingyu	155
7	djm03178	151
7	adamant	151
9	luogu_official	150
10	awoo	147

Hello, beautiful people of Codeforces.

Yesterday I spent several hours trying to solve the following problem, which feels that should be an easy dp. I then tried to search it on Google and failed. I feel like it's almost impossible to find solutions to algorithmic problems via search engines, they always return stuff that matches with one or two words from the statement but not the problem itself. Anyway, here's the problem:

You are given $$$n$$$ line segments $$$[l_i, r_i]$$$ ($$$1 \leq l_i \leq r_i \leq 2n$$$) and a number $$$k$$$. You are allowed to select $$$k$$$ points on the number line. What is the maximal number of segments that conitain at least one of the selected points you can obtain?

Example:

The answer is 5. We can select points 4 and 7, hitting all segmets but the last one. Hitting all segments is impossible, as [3, 5], [6, 8] and [10, 12] are disjoint. Note that even though segment [1, 7] is hitted twice, it is only counted towards the solution once.

We are also given that $$$k$$$ is much less than $$$n$$$. I expect the solution to run in $$$\mathcal{O}(nk)$$$ with possibly some $$$\log$$$ s.

Note 1: When k = 1 this is a well know problem and can easily be solved in $$$\mathcal{O}(n \log n)$$$:

Create $$$2n$$$ event of the form $$$(l_i, +1)$$$ and $$$(r_i, -1)$$$;
Sort the events by the first coordinate;
The answer is the maximal sum of the second coordinate over the events with the first coordinate up to some $$$i$$$ (this can be computed with a single loop);

Note 2: I called it a "dual" problem to another well know problem: Given $$$n$$$ line segments $$$[l_i, r_i]$$$ ($$$1 \leq l_i \leq r_i \leq 2n$$$), what is the minimal number of points to hit all the segments?

This "primal" problem is easily solvable in $$$\mathcal{O}(n \log n)$$$ by a greedy, selecting always the first end of not-yet-hit segments.

Comments (8)

Write comment?

Um_nik

2 years ago, # |

← Rev. 2 →

+169

Let's consider a relaxation: Say that there is a variable $$$x_i$$$ for every point ($$$1 \le i \le 2n$$$) which means "how many times we take this point" and a variable $$$y_j$$$ for every segment ($$$1 \le j \le n$$$) which means "how many times we cover this segment". The limitations are:

$$$x_i \ge 0$$$
$$$\sum_{i=1}^{2n} x_i \le k$$$
$$$y_j \le \sum_{i=l_j}^{r_j} x_i$$$
$$$y_j \le 1$$$

and we want to maximize $$$\sum_{j=1}^{n} y_j$$$.

In the original problem $$$x_i$$$ and $$$y_j$$$ should be integers (0 or 1, but it doesn't matter). For relaxation we will drop this requirement, now $$$x_i$$$ and $$$y_j$$$ can be real. This is a linear programming problem (and the original was ILP).
I'm not strictly sure here, but it seems that the answers for LP and for ILP are the same. The reasoning is along the lines of: Let's look at the optimal solution for LP which is not integer, there is a leftmost non-integer $$$x_i$$$, if $$$\sum_{i=1}^{2n} x_i < k$$$ we can just increase it until it either becomes 1 or the sum reaches $$$k$$$, so now we can assume that the sum is $$$k$$$, that means that there should be another non-integer $$$x$$$, let's take the second leftmost, now we can increase one of them by $$$\Delta$$$ and decrease another by $$$\Delta$$$, one of those actions is not making the answer smaller, thus we can eliminate all non-integer variables.

It is easy to see that if we take a valid solution of LP for some value of $$$k$$$ ($$$k_1$$$) and another value of $$$k$$$ ($$$k_2$$$), and choose some $$$0 \le \lambda \le 1$$$, then look at the linear combination of those two solutions with coefficients $$$\lambda$$$ and $$$(1 - \lambda)$$$ we will get a valid solution for value of $$$k$$$ equal to $$$\lambda k_1 + (1 - \lambda) k_2$$$. If we take optimal solutions for $$$k_1$$$ and $$$k_2$$$, let's say that their corresponding values of target function are $$$c_1$$$ and $$$c_2$$$, we will get a valid solution for $$$\lambda k_1 + (1 - \lambda) k_2$$$ with value of target function equal to $$$\lambda c_1 + (1 - \lambda) c_2$$$, which means that the answer for this value of $$$k$$$ is at least that, so answer as a function of $$$k$$$ is concave.

Thus we can apply lambda optimization: binary search on $$$\lambda$$$ and find the optimal solution when the target function is $$$s - \lambda k$$$, where $$$s$$$ is the number of segments we have covered and $$$k$$$ is the number of points we used.

To solve this problem we can use a DP: $$$dp[x]$$$ — best value of target function if the rightmost taken point is $$$x$$$, so further we will only be able to cover segments that has left border strictly to the right of $$$x$$$. It is $$$O(n^2)$$$, but we can optimize it with scanline and segment tree:

We will move current point to the right and in segment tree we will store what will be the value with which we will update DP if we do update from $$$x$$$. When we encounter new segment, we should do +1 for all $$$x$$$ before it. When we encounter an end of a segment, we will do -1 for all $$$x$$$ before its left end. Thus the DP works in $$$O(n \log n)$$$ and the whole solution is $$$O(n \log^{2} n)$$$.

→ Reply

Wielomian

2 years ago, # ^ |

Thank you Um_nik for this solution. I will implement it soon and get back with results :D

bicsi

+51

See this (official solution is $$$O(n \alpha(n) \log n )$$$

Kindly thank you bisci for the reference!

mango_lassi

Could you share the editorial?

← Rev. 4 →

+15

Unfortunately, I never got to write the editorial for the problem, but here's a rough idea:

First, the answer is concave wrt the number of points (bags), so use Aliens trick to convert to the easier problem where you have infinite points (bags) with a given penalty $$$\lambda$$$ per point.

Consider $$$dp(r)$$$ as the best achieveable cost for the subproblem consisting on the subset of ranges that end before $$$r$$$. Naturally, we will process ranges sorted by right endpoint. When considering a new right endpoint, we insert all ranges having right endpoint equal to $$$r$$$. Then, we compute $$$dp(r) = \min_{x} dp(x - 1) + w(x) - \lambda$$$, where $$$w(x)$$$ is the total weight of inserted ranges that have $$$l \leq x$$$.

Formalizing this, we need a data structure that allows for simulating the following operations on a ``vector'' $$$v$$$:

Add $$$x$$$ to a suffix $$$v[i:]$$$
Push back $$$y$$$ to $$$v$$$
Find the maximum value over all $$$v$$$

Let's look at the above operations. More specifically, notice that if $$$v_{i+1} \geq v_i$$$, then $$$v_i$$$ will never be relevant anymore. Then, let's think of a set containing only the relevant positions. This set is always sorted decreasingly.

We can already solve this problem using a set, but this still achieves $$$O(n \log^2 n)$$$ (same as segment tree approach). However, if we keep the deltas $$$v_{i} - v_{i-1}$$$ over this ``set'', then we can instead simulate a list where you remove previous elements and adjust the corresponding deltas. The answer for query 3. is just the first element of this list. This list should support:

Removing an element
Binary search for the first element to the right of some position $$$i$$$ (for operation 1.)
Pushing back an element

We can use DSU to solve this problem in complexity $$$O(\alpha(n))$$$ for all operations.

Let me know if this makes sense, it might be a very loose explanation.

bIeah

Although you don't remember writing an editorial, I know one exists: Link to download some files, among which there is the editorial (in romanian), along the path of Popcorn/Descriere/

mmaxio

+18

https://codeforces.net/blog/entry/51880?#comment-358982

Wielomian's blog