This problem was asked recently in Atlassian OA.
Given an array ; in 1 operation you can add + 1 to any element of the array
This operation costs c[i] if you perform this operation on element at index "i"
Find minimum cost to make all array elements distinct
1<=N<=100000
1<=A[i]<=1000000000
1<=c[i]<=100000
Input —
A — [1 2 2] C — [1 100 500]
O/P — 100 — Do +1 operation on second element making the array — [1 3 2]
Can we solve by processing elements in sorted order and selecting element with the highest cost as the one we don't increment (and increment all other duplicates by one) every time we encounter duplicate elements?
I dont think greedy will work here, we will need dp
Please read this.
even if you do that, the elements you increment may become equal to some other elements already greater than the one you chose to increment
This is not a problem, we can maintain current set of duplicates in priority queue along with sum of their costs and current value of duplicate item. When extracting item from pq we just increment current value and add more duplicates from original array (if present).
what do you think the time complexity of such algorithm be? It'll be $$$O(n^2)$$$ if I'm not wrong.
It is O(n log(n)), every item is added and extracted from pq exactly once
how? if we have an array like 1 1 1 1 then you'll push all these in the queue? then increment the 1s to make the array 1 2 2 2 then you'll have 3 duplicates again? then you'll increment them to make 1 2 3 3? and this will keep going on, how is this $$$O(nlogn)$$$? in the worst case when all the elements are same, it'll take $$$O(n^2)$$$
You don't really have to modify original array, you can just maintain current value of items in pq in a variable. Every time you extract an element just increment this value and see if original array contains items equal to this value. If it does than add those items to pq.
Bro I'm not modifying the array, I'm just showing you the state of array.. That you'll have to increment the same elements more then 1 so it'll not be nlogn
In you example I add all elements to pq and set current_dup_val = 1 (items in pq may have different costs but the same value = 1). After that I extract max cost element from pq and set current_dup_val = 2 (all items in pq have value = 2 now). Repeat the same thing until pq is empty.
can you code it up please? I almost understood your approach, but I'm confused how will you calculate the final cost, will the cost be the sum of all the costs that are left in queue at the end? or will you update the cost everytime you pop out an element by adding the sum of all the costs in the heap when popping the elements?
Sorry, too lazy to code.
Regarding result calculation: maintain sum of all costs currently in pq. When extracting element from pq subtract its cost from that sum and add what is left to you result.
okay, I coded it up, can you check please?
looks good to me
You can do it greedily in one sweep. Iterate over the values (1 to 1000000000), whenever there are multiple entries that have that value push all except the most expensive into a heap and whenever there is a value that doesn't have any index place the most expensive that's currently in the heap on that value and update total cost accordingly. It's in $$$O(n \log{n})$$$ if you don't iterate over all values but jump to the next existing value whenever your heap is empty.
Very hard tbh
no I as a pupul solved it
How about a modified version where you could decrement the values of the array too? That kills your approach I think
What's wrong with sorting and then noticing that only the most expensive smallest element will remain; the rest will be incremented. Then, the only the 2nd most expensive element from all elements <= 2nd smallest will remain, and so on.
For example, if our original array consisted of [1,1,1,2,2,3] then iterate from 1, let the most expensive element remain and all the rest of 1's costs get added and they get moved to 2. Then for 2, only the 2nd highest cost will remain at 2 from all the elements <= 2. Do the same thing and move the rest up to 3. Then for 3, only the third highest cost remains at 3 and so on.
This can be done with a SortedList or heap or whatever. Then, to get the answer, you take the sum of costs in the SortedList and subtract the kth highest cost at element k and accumulate this in an answer variable. This will loop over every element as it must get placed somewhere in O(n log n)?
Nothing, that seems correct.
will it work just sort according to costs and picking up element and inserting in set if already present then find out the next to it which is not present in set then add that cost in O(nlogn)
hey I have a solution
you start from smallest element push its costs into set assign maximum element to this element iterate to next element and push its costs,if no cost exist then assign maximum to it and every time you assign cost add the sum of all elements in your set to your overall cost.when your set is empty just jump to next largest element available to ensure time complexity is within the limit. It works in O(nlogn)
Explanation
Priority queue (pq) is made such that
The element with the smallest value comes first.
If two elements have the same value, the one with the larger cost comes first.
This helps ensure that the most costly operations are performed first when resolving duplicates.
Algorithm
C++ Code
I have personally brainstormed intuition behind this problem and it is correct as per my knowledge. Feel free to point out any shortcomings if you find any.
What is time complexity of your solution?
Why are you guys downvoting this solution. Feel free to raise any doubts if you have any.
The heap/greedy solution is easy to follow. But what if we allow both increment and decrement to the elements in the array? The greedy idea doesn't seem to work anymore, right?