There are $$$m$$$ positive integers that add up to $$$n$$$. On each step, take any two numbers, add the minimum of the two to the result, and replace those two numbers with their sum (the size of the number set decreases by one). Repeat until only one number is left. Why is the result at most $$$n \log_2 n$$$? I have a proof below, but I don't know whether it is correct. I also want to understand the intuition behind this bound. Is there somewhere I could read about this?
My idea is to use induction, after checking by hand that this works for small $$$n$$$. Look at the last merge: we have $$$n_1$$$ and $$$n_2$$$ such that $$$n_1+n_2=n$$$ and $$$n_1 \leq n_2$$$. By induction, the result so far is at most $$$n_1 \log n_1 + n_2 \log n_2$$$. We want to prove that $$$n_1 \log n_1 + n_2 \log n_2 + n_1 \leq n \log n$$$. To see this, define $$$k = \frac{n_1}{n} \leq \frac 1 2$$$, so $$$n_1 = kn$$$ and $$$n_2 = (1-k)n$$$. We now want to prove:

$$$kn \log (kn) + (1-k)n \log ((1-k)n) + kn \leq n \log n.$$$

Dividing by $$$n$$$ and expanding the logarithms gives

$$$k \log k + k \log n + (1-k) \log (1-k) + (1-k) \log n + k \leq \log n,$$$

and after cancelling $$$\log n$$$ on both sides this reduces to

$$$k(1 + \log k) + (1-k) \log (1-k) \leq 0.$$$
But we know that $$$k \leq \frac 1 2$$$, meaning $$$\log k \leq -1$$$ and hence $$$1 + \log k \leq 0$$$; and since $$$0 < 1-k < 1$$$, we also have $$$\log(1-k) \leq 0$$$. So both summands in the last line are at most $$$0$$$.
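To build some intuition, here is a small brute-force check of the bound (a sketch: the starting numbers and the random merge order are arbitrary choices of mine, since the claim is that the bound holds for any order):

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    mt19937 rng(12345);
    vector<long long> nums = {3, 1, 4, 1, 5, 9, 2, 6};       // any m positive integers
    long long n = accumulate(nums.begin(), nums.end(), 0LL);  // here n = 31
    long long result = 0;
    while (nums.size() > 1) {
        // pick two distinct positions arbitrarily; the bound should hold for ANY order
        size_t i = rng() % nums.size(), j = rng() % nums.size();
        if (i == j) continue;
        result += min(nums[i], nums[j]);  // pay the smaller of the two
        nums[i] += nums[j];               // replace the pair by their sum
        nums.erase(nums.begin() + j);
    }
    cout << result << " <= " << n * log2((double)n) << '\n';
}
```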
Let's say $$$A$$$ is the small subset and $$$B$$$ is the big subset, with $$$x = |A|$$$ and $$$y = |B|$$$ (so $$$x \leq y$$$).
Suppose we merge them into $$$C = A \cup B$$$, and consider some vertex $$$v$$$ ($$$1 \leq v \leq n$$$) that is moved during the merge.
We have $$$|C| = x + y \geq x + x = 2x$$$, so the set containing $$$v$$$ at least doubles in size each time $$$v$$$ is moved. Since a set holds at most $$$n$$$ vertices, each vertex is moved at most $$$O(\log n)$$$ times, and with $$$n$$$ vertices in total the complexity becomes $$$O(n \log n)$$$. A sketch of this merge is below.
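For concreteness, here is the standard small-to-large merge written with `std::set` (a sketch: the same idea works for maps, hash tables, and so on, and the function name is just for illustration):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Merge set b into set a, always inserting the smaller set into the larger one.
// An element moves only when it sits in the smaller set, so the set containing
// it at least doubles on every move: at most O(log n) moves per element and
// O(n log n) set insertions overall.
void smallToLargeMerge(set<int>& a, set<int>& b) {
    if (a.size() < b.size()) swap(a, b);  // ensure |b| <= |a|; swap is O(1)
    for (int v : b) a.insert(v);          // move only the small side
    b.clear();                            // b's contents now live in a
}
```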
What if we also have to delete the small subset? Some data structures can be "deleted" in $$$O(1)$$$ by swapping or reusing the allocation, and the others cannot delete more elements than were inserted into them, so the total deletion time is also at most $$$O(n \log n)$$$.
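One way to read the "$$$O(1)$$$ by reallocation" remark is the usual index-swapping trick, sketched below (the names `pool`, `id` and `mergeInto` are mine; the point is that we never rebuild or free the big set, only clear the small one):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Keep the sets in a pool and let each node hold only an *index* into it.
// To merge, swap the indices so the destination always refers to the larger
// set (O(1)), move the small side over, and clear it.
vector<set<int>> pool;
vector<int> id;  // id[v] = index of v's set in the pool

void mergeInto(int dst, int src) {
    if (pool[id[dst]].size() < pool[id[src]].size())
        swap(id[dst], id[src]);                        // reuse the bigger allocation
    for (int x : pool[id[src]]) pool[id[dst]].insert(x);
    pool[id[src]].clear();                             // "delete" only the small side
}
```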
I read about this in this blog, the original post, and the comments by Arpa, radoslav11 and tuwuna.
savinov also claims that some problems solved with small-to-large can be optimized to linear time.
Thanks!