X_Nitd's blog

By X_Nitd, history, 3 days ago, In English

Q : Given N containers, each container has balls of different colors, and each color can have multiple quantity in a single container. We need to select A quantity of color C1 , B quantity of color C2, C quantity of color C3 etc.

Penalty is equal to sum of all leftover quantity in a container if something is picked from container, else 0. We need to minimise the penalty and find set of containers which can help acheive minimum penalty.

Example

container 1 — [ (c1, 100), (c2, 100), (c3, 50) ] container 2 — [ (c1, 50), (c2, 50), (c3, 50) ] container 3 — [ (c1, 50), (c2, 50), (c3, 80) ]

we need to pick (c1, 100), (c2, 100)

  1. If we select container 1 -> penalty = 50

  2. If we select container (2,3) -> penalty = (50+80) = 130

So answer would be 50 and set of boxes would be [Container 1]

Constraints :

Container Count <= 100, Color Count <= 1000, Max quantity of each color <= 500

Time Limit : 60 seconds

  • Vote: I like it
  • +17
  • Vote: I do not like it

»
3 days ago, # |
  Vote: I like it 0 Vote: I do not like it

tough question

»
3 days ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by X_Nitd (previous revision, new revision, compare).

»
3 days ago, # |
  Vote: I like it +4 Vote: I do not like it

Is 60sec the time limit for one test ? If that's the case I guess the intended solution is backtracking?

»
2 days ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

Might be wrong but I want to give it a go.

Let-
$$$- m =$$$ the number of diff. colours.
$$$- n =$$$ the number of containers.

Let $$$dp[i][c_1][c_2][c_3]...[c_m] =$$$ $$$c_j$$$ of $$$j^{th}$$$ colour remaining to chose from boxes numbered $$$i$$$ to $$$n - 1$$$.
We need to return the state $$$dp[0][C_1][C_2][C_3]...[C_M]$$$

Making an $$$m$$$ dimensional $$$dp$$$ state is kinda.. whacky as, we don't know $$$m$$$ changes from (question to question).

I was thinking of creating a 2D DP state $$$dp[i][j]$$$ where $$$j$$$ some kind of a hash for $$$c_1, c_2, c_3, ..., c_m$$$

Claim: If you choose to pick from a box $$$i$$$, exhaust it as much as possible, as partially choosing from some colour $$$c_j$$$ from $$$i$$$ and further boxes would not decrease the penalty in any way.

Now the state becomes

include:
dp[i][c_1][c_2][c_3] = min(dp[i][c_1][c_2][c_3], dp[i + 1][c_1'][c_2'][c_3'] + penalty);
exclude:
dp[i][c_1][c_2][c_3] = min(dp[i][c_1][c_2][c_3], dp[i + 1][c_1][c_2][c_3]);

Now, regarding hashing for $$$c_1, c_2, c_3, ... c_m$$$.

Cosider $$$dp[i][3][1][0][4]$$$.
4 different colours, I can represent them as a string as $$$"aaabdddd"$$$ (no $$$c$$$)
I can then hash it to some unique value $$$hashval$$$ and represent it as $$$dp[i][hashval]$$$.

Since $$$1 ≤ m ≤ 1000$$$ we should represent them in a vector of numbers, like {1,1,1,2,4,4,4,4} and hash it.

Any corrections/suggestions welcome!

  • »
    »
    2 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    And now calculate the number of states you have in this dp. (It's a big number)

    • »
      »
      »
      2 days ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      I was thinking of a top down approach using std::map.
      Something like: std::map<int, std::map<int, int>> memo

»
2 days ago, # |
  Vote: I like it +5 Vote: I do not like it

I'm almost certain that this is NP-hard, and not in the way where DP actually helps you.

I'm guessing that the answer they are looking for is a brute force that returns early if you already know the current set of containers cannot be part of an optimal solution. (Which might in theory be too slow, but might always be fast enough in practice.)

  • »
    »
    46 hours ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    not sure if its fast enough in practice, if we have 100 containers, returning early / pruning wont speed up the process much, because at the end 2^40 iterations will happen anyways and it wont complete in 60s.

»
2 days ago, # |
  Vote: I like it +8 Vote: I do not like it

Probably the fastest solution is linear programming, but I'm not sure that LP solver can be written from scratch on interview.

  • »
    »
    43 hours ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    How would you guarantee that the LP has integer BFS? I think that's a separate proof to be made, also looking at the time limit and the problem itself, this looks like it is NP hard, currently am trying to find a reduction to bin-packing, if it is NP hard, that would automatically imply the LP cannot have integer BFS, in which case the LP solution definitely would be fractional. So at best we could have an approximate solution in polynomial time using LPs.

    What do you think?

    • »
      »
      »
      42 hours ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I think that in LP form there could be some good heuristics, not relying on "normal" LP methods. Most probably this problem is NP-hard, and imho constraints are too high for deterministic precise solution

»
42 hours ago, # |
  Vote: I like it 0 Vote: I do not like it

Why did we not pick (c1, 100), (c2, 100), or (c3, 50), and the penalty would be just 0?

»
40 hours ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

This looks like a multi-dimensional weight, 0-1 knapsack problem. Consider the given example; those containers we don't use at all should have at most 100+50+50-100 many c1, 100+50+50-100 many c2 and 50+50+80-0 many c3. We are essentially maximizing the total number of items in those containers, subject to that constraint. So one can turn it into a knapsack problem where the items have (w1=100 w2=100 w3=50 value=250), (w1=50 w2=50 w3=50 value=150) and (w1=50 w2=50 w3=80 value=180) respectively, and the weight limits are (W1=100 W2=100 W3=180).