What is the proof for optimal merge pattern problem?

#	User	Rating
1	jiangly	3846
2	tourist	3799
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3590
6	Ormlis	3533
7	Benq	3468
8	Radewoosh	3463
9	ecnerwala	3451
9	Um_nik	3451

#	User	Contrib.
1	cry	165
2	-is-this-fft-	161
3	Qingyu	160
4	atcoder_official	157
5	Dominater069	156
6	adamant	154
7	Um_nik	151
7	djm03178	151
9	luogu_official	149
10	awoo	147

Hi everyone. I have been stuck on this problem for quite sometime and would like some help from the community. For those who aren't familiar with the name, let me define the problem quickly.

You are given a set of n numbers. In a single merge operation you can remove any 2 elements from the set and put their sum back inside. The cost of each merge operation is also equal to the sum. Repeat this until there is only 1 element left in the set. The total cost is the sum of cost of individual operations. We need to minimize the total cost.

Example-

Intitially : cost=0, v={1,2,3}

merge 1 & 2, cost+=3, v={3,3}

merge 3 & 3, cost+=6, v={6}

So finally,cost=9

So basically the greedy approach works here. In each iteration just merge the 2 smallest elements. I sort of get the intuition but I can't prove why this works. I can see that this can be converted to the Huffman Coding problem but the math in proofs is beyond me. I would really appreciate if someone could give me a less mathy solution...

Thank you for your time!!

Comments (15)

Show archived | Write comment?

DontLookBack

4 years ago, # |

← Rev. 4 →

This is the same as this for fixed array though.

Idk about why greedy works but you can always try proof by AC :) Sure that there is some exchange argument lurking out there.

Solution

#include <bits/stdc++.h>
using namespace std;
#define fast ios::sync_with_stdio(0); cin.tie(0); cout.tie(0);
#define ll  long long
#define all(v) v.begin(),v.end()
#define F first
#define S second
#define pb push_back
#define mp make_pair
#define pi pair<int,int>
#define pii pair<int,pi>
#define REP(i,n) for(int i=0;i<n;i++)
 
ll dp[405][405];
ll p[405];
bool seen[405][405];
 
ll solve(int i,int j){
	if(seen[i][j])
		return dp[i][j];
	ll ans = 1e18;
	for(int k=i;k<j;k++){
		ans = min(ans,solve(i,k)+solve(k+1,j));
	}
	if(j-i>0)
		ans+=p[j]-(i > 0 ? p[i-1] : 0);
	seen[i][j]=true;
	return dp[i][j]=ans;
}
 
void run_case(){
	int n,i,j,k;
	cin >> n;
	vector<int> a(n);
	memset(dp,0LL,sizeof(dp));
	memset(seen,false,sizeof(seen));
	ll sum=0;
	for(i=0;i<n;i++){
		cin >> a[i];
		sum+=a[i];
		p[i] = (i>0?p[i-1]:0)+a[i];
		dp[i][i]=a[i];
		seen[i][i]=true;
	}
	cout<<solve(0,n-1)-sum<<"\n";	
 
}
 
int main(){
	fast;
	int tt=1;
	while(tt--){
		run_case();
	}
	
	
}

→ Reply

codeMechanix

4 years ago, # ^ |

I don't think these are equivalent problems...in your question, the relative positions of the elements are fixed, however in the problem that I described, it is not.

radoslav11

← Rev. 2 →

+42

wrong

+11

Elegant! Thank you so much.

I really struggle trying to prove correctness of greedy algorithms.

clyring

+14

I don't understand how this idea is meant to work in the case where $$$(a_1+s_1)$$$ is not immediately merged with $$$(a_2+s_2)$$$. Suppose, for example, that $$$(a_1+s_1)$$$ is merged first with $$$x$$$ and only then with $$$(a_2+s_2)$$$. Then, you can't mean to merge $$$s_1$$$ with $$$x$$$, then $$$(s_1 + x)$$$ with $$$s_2$$$, and finally merge $$$a_1$$$ with $$$a_2$$$ and $$$(a_1 + a_2)$$$ with $$$(s_1 + x + s_2)$$$, because that has a greater cost, by $$$x - a_1$$$.

A strategy that does work is to notice that the final cost of an optimal merging strategy is $$$\sum_{i=1}^n c_i a_i$$$, where the coefficient $$$c_i$$$ is the number of merges $$$a_i$$$ plays a role in. We can assume without loss of optimality that $$$a$$$ is sorted ascending and that $$$c$$$ is sorted descending, since we are re-arrange $$$a$$$ before merging at no cost otherwise. Now, suppose that the first merge that involves the smallest element $$$a_1$$$ is $$$a_1$$$ with $$$x$$$, and let $$$j$$$ be the any index such that $$$a_j$$$ was (directly or indirectly) merged into $$$x$$$. Then, the merges $$$a_j$$$ was involved in are a superset of those $$$a_1$$$ was involved in, so $$$c_j \geq c_1$$$. But since $$$c$$$ is assumed to be sorted descending, this means that $$$c_j = c_1$$$ and thus that $$$a_j$$$ is not involved in any merges not involving $$$a_1$$$, i.e. $$$x$$$ is not the result of a merge at all, and $$$a_1$$$ was merged directly with $$$a_j$$$ Moreover, since $$$j > 1$$$, we have that $$$c_1 \geq c_2 \geq c_j = c_1$$$, so that the coefficients $$$c_2$$$ and $$$c_j$$$ are equal. Thus, we can swap the second-smallest element $$$a_2$$$ with $$$a_j$$$ without changing the cost, and re-order the resulting direct merge between $$$a_1$$$ and $$$a_2$$$ to be at the beginning.

By $$$s_1$$$ and $$$s_2$$$ I meant some sets of elements and not necessarily a single element.

+13

Right, I mis-borrowed your notation. But I still don't understand how this can work.

Let me give a concrete example, with the original multiset being $$$\{101, 102, 104, 108, 116, 132\}$$$, which has the following as one optimal merge order: $$$101 + 104 \to 205$$$, $$$205 + 116 \to 321$$$, $$$102 + 108 \to 210$$$, $$$210 + 132 \to 342$$$, and $$$321 + 342 \to 663$$$, with a total cost of $$$205+321+210+342+663 = 1741$$$. Then, $$$a_1 = 101$$$, $$$s_1 = \{a_1, 104, 116\}$$$, $$$a_2 = 102$$$, $$$s_2 = \{a_2, 108, 132\}$$$, right?

If we merge the "something-else" parts together in the same way, we get $$$104 + 116 \to 220$$$, $$$108 + 132 \to 240$$$, and $$$220+240 \to 460$$$. Then, merging $$$(a_1 + a_2)$$$ is $$$101 + 102 \to 203$$$, and finally everything together $$$460 + 203 \to 663$$$. Is this is what you've described? But its total cost is $$$220 + 240 + 460 + 203 + 663 = 1786$$$, which is strictly greater than the original cost.

So, either I'm misunderstanding you, or your reasoning is totally wrong.

Yeah you're right, it seemed simpler than it actually is.

Thanks a lot for sharing your approach!

BurnedChicken

+47

There is another proof of the correctness. We create a node for every number in the set. If we merge two nodes $$$u$$$ and $$$v$$$, create a new node and make $$$u$$$ and $$$v$$$ its children. In the end we have a rooted tree whose leaves are original elements in the set. It is quite obvious that every leaf contribute to the total cost its depth times (with the root having depth 0). So for every possible rooted tree we must put the smallest 2 numbers the deepest in the tree, which means they should be merged first.

Thank you for sharing your approach!

Nezzar

Interesting that noone mentioned this (solution of) problem is exactly Huffman encoding, did I make something wrong? https://en.wikipedia.org/wiki/Arithmetic_coding#Huffman_coding

mattagar6

+21

Mr.Daddy mentions he knows of it in the last paragraph of the blog.

Oh yea, sorry for being blind :(

+10

I made the connection, but got scared off by a dense packing of mathematical symbols in the proofs lol.

codeMechanix's blog