Hello, recently I've come across a problem (not a competition task) and I'm wondering if there is an efficient solution to it. The task follows:
Given a structure like this [ (2,3), (1), (2,3,4) ] you can easily construct all possible combinations: (2,1,2),(2,1,3),(2,1,4),(3,1,2),(3,1,3),(3,1,4)
Now if you're given n tuples of length m, you need to group them in a way that minimizes the ammount of structures that you get as a result.
E.G. n=7,m=3 tuples: (5,3,10),(4,3,2),(0,4,7),(1,4,7),(5,3,2),(4,3,10),(5,2,9) can form groups [ (5,4),(3),(2,10) ], [ (5),(2),(9) ] and [ (0,1),(4),(7) ]
It kind of smells like set cover problem or something like that but I'm not sure.
Could you explain a bit more? I'm not sure if I understand the problem properly.
My understanding is the following:
We have n m-tuples and we must form structures whose union of combinations contains all these tuples and no others. We aim to minimize the amount of structures?
Is that correct?
Additionally, if my understanding is correct, is it okay if a tuple from the input is produced more than once by the combinations from the structures?
Yes, this is correct. Also, a tuple from the input should be produced only once by the combinations
First of all, don't mix up NP with NP-hard. NP is the set of all decision problems that every true instance can be verified in polynomial time.
NP is not really defined on an optimization problem. The decision version of this problem (i.e. given n tuples of length m, can you group them with at most k structures?) is clearly in NP. In other words, if you have a set of structures, you can check whether that set is a valid solution in polynomial time. For each combinations, (e.g. in your case (5, 3, 10), (4, 3, 2), ...), you can try whether that combination matched with at least one structure that you have. This can be done in polynomial time.
The more interesting question is whether this problem is NP-hard. A problem H is NP-hard iff we can reduce every problem in NP to H. If we assume P != NP to prevent the polynomial hierarchy to collapse, this problem is either in P or NP-hard. To prove that this problem is in P, one must find a polynomial time algorithm to solve this. To prove that this problem is in NP-hard, one must be able to reduce another NP-hard problem to this problem.
The optimal solution of the original optimization problem is clearly upper-bounded by n. (since you can just have n groups with each group consisting of each combination). Therefore, the optimization version of this problem is in P if and only if the decision version of this problem is in P.
As I'm typing this, I'm still figuring out whether this problem is in P or NP-hard.
Wait, can't we just create one structure that consist of
[(all possible values for the first tuple), (all possible values for the second tuple), ...]
Seems like I misunderstood the problem then. Shoot.
You can't create such structure because this will create tuples which are not in the input.
e.g. input: (2,3,5),(1,2,4) Output [ (2,1),(3,2),(5,4) ] is wrong because this creates combinations such as (2,2,5), (2,2,4) etc.
Tuples created by the structures should be unique and consist only of tuples from the input.
e.g. structure [ (1,2),(3,4),(5,6) ] creates tuple (1,3,5) so you can't have another structure for example [ (1,7),(8,3),(9,5) ] because this creates (1,3,5) as well. We do not allow duplicates to be created
Ah I see. Now I fully understand the problem.
Fortunately my comment still holds. It's still in NP :p
Haha yeah. Thanks for your help. I ended up using a simple heuristic that basically merges only tuples that differ on a single index ( e.g. (1,3,*) will merge (1,3,5),(1,3,6) etc. ) which seems to be passable for my purposes, but it's good to know that the original problem is a hard one.
I don't think you read his reply thoroughly. The problem being in NP does not imply that it has no quick polynomial solution (which seems that you've assumed). We don't know if it has a fast solution for now.