We sort the array in a decreasing order, and calculate the prefix sum p[i]. Our target is to find the minimum index i so that p[i] × 2 is strictly larger than the total sum.
After dividing the given sequence into the first half and second half, we sort both of them in an increasing order, and compare the elements with the same indices, one by one. We should check whether the first half has strictly larger or smaller elements than the second half, at each position.
The general idea is to first sort the given array in an increasing order, denoted as (a1, a2, ..., an). Then, the required sequence should be (a1, a1), (a1, a2), ...(a1, an), (a2, a1), (a2, a2), ...(a2, an), (a3, a1), ...(an, an). For instance, for (1, 2, 3), we have (1, 1), (1, 2), (1, 3),(2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3). However this is not correct if we have elements with the same value. For instance, for (1, 2, 2), it should be (1, 1), (1, 2), (1, 2),(2, 1), (2, 1), (2, 2), (2, 2), (2, 2), (2, 2).
Nevertheless, the above idea still works after some slight modification. Note that the first element can always be determined by calculating ⌈ k / n⌉ (index starts from 0), no matter whether we have same values or not. Suppose that the first element is ai. Then, we find the number of elements that are strictly smaller than ai, denoted as s, and also the total number of elements that are equal to ai, denoted as t. Now, the problem is reduced to finding out the (k - s × n)-th pair in all the t × n pairs, which have ai as the first element. Moreover, the second element of these t × n pairs must be ordered as a1, a1, ...a1, a2, a2, ..., a2, a3, ...an, i.e., any aj has been repeated t times. Thus, it is straightforward to compute the (k - s × n)-th element in this sequence.
This problem involves several classic techniques, such as disjoint set, kruskal's algorithm to find minimum spanning tree (MST), tarjan's algorithm to find bridges in graph, and so on.
Suppose that we have obtained an MST, and we should note that it is possible to replace some edges with other ones which have the same weight, but definitely impossible to use an edge to replace another one with different weight, since we will either obtain a smaller MST (but this is contradictory to our assumption) or a larger MST (this is not an MST!).
Therefore, we can sort the edges in an increasing order of their weight, and adopt the framework of Kruskal's algorithm, but deal with all the edges which have the same weight at the same time. Suppose that edges (ei, ei + 1, ..., ej) have the same weight, and now we start enumerating from ei to ej, and we will have the following two cases.
1) we find that ek connects one of the already connected components (remember that disjoint set is used in kruskal's algorithm), then it is definitely not included in any MST. The reason is that this edge must form a cycle with some other edges which have smaller weight, and if it is included in some MST, we can immediately replace it with any edge from “some other edges” and obtain a smaller MST.
2) we find that ek connects two different connected components. Then, it is possible that this edge must be included in any MST, since it might be the unique edge that connects these two components, which is also referred to as Bridge. On the other hand, it might be only included in some MSTs. The reason is that we may find other edges with the same weight and they also connect these two components.
Thus, we consider the connected components as “nodes” and build a “graph”, and adopt tarjan's algorithm to find out all the bridges. However, this “graph” may contain multiple edges, for instance, several edges connect the same two components. Be careful to handle these multiple edges when using tarjan's algorithm.
We construct a segment tree to solve this problem (do not forget compressing the original data into smaller range so that we can obtain a segment tree with feasible size). Each node “manages” a time interval [l, r], and stores the index of the bus which falls into the current time interval and reaches the farthest distance (also stores this distance).
Next, we sort both buses and passengers all together, in an increasing order of their starting positions (by sorting them together, we can guarantee that whenever we are enumerating a passenger, all the buses that have starting positions ahead of him have been inserted into the segment tree, since only these buses have the potential to send the current passenger to his destination), and enumerate the sorted result one by one. When we meet a bus, we insert it into the segment tree, i.e., we visit the nodes from top to bottom, and update the farthest distance stored at each node (also bus index). When we meet a passenger, we check the segment tree to see whether there exists a bus that can take him to the destination.
I think several issues should be clarified when we try to “query” the segment tree.
1) I think the query complexity is not logN. Instead, it may be quite large. Thus, we might need adopt some “pruning” techniques to reduce the search space.
2) Whenever we find that the farthest distance at the current node is strictly less than the passenger's destination, we should immediately stop the search, since no bus can take the current passenger to his destination; otherwise, whenever we reach some leaf node, we directly return the bus index (since this means that the correct bus has been found); otherwise, we first try to visit the right child node and if it does not return a reasonable bus index, then we continue to visit the left child node (of course the premise condition is that the passenger's time falls into the time interval of the left child node).