bihariforces's blog

By bihariforces, history, 17 months ago, In English

Given an array $$$arr$$$ of length $$$N$$$, find the sum of count of unique values in a subset for every subset of $$$arr$$$, modulo $$$10^9 + 7$$$.

I can see that the result depends only on frequency of distinct values, and seems like some combinatorial trick, it's a made up question but seems standard, what is the most optimal solution for length $$$N$$$ array?

  • Vote: I like it
  • +4
  • Vote: I do not like it

»
17 months ago, # |
Rev. 9   Vote: I like it +1 Vote: I do not like it

there are two ways to do this one way is dp

assume each number selection as independent process. assume frequency of that number to f so number of ways we can select that number is

fC1 + fC2 + .... fCf = 2^f — 1

now either we chose that number, which will contribute 1 to our unique sum

or we dont chose it which will contribute 0 to our unique sum.

here we can use DP to calculate, but i have better mathematical way

so assume polynomial representation for this (x^0 + (2^{f_i} — 1)x^1)

Function F = PI(x^0 + (2^{f_i} — 1)*x^1) for all i , summation f_i = n obviously

but what we need is summation of coefficient * power — 1, hmm it seems its just derivative and value calculated x = 1. i.e. DF/Dx,x=1

well that's it .

ans = (sum (2^{fi}-1) / ( 1 + 2^{fi}-1) over all fi) * 2^n = (sum((2^{fi} — 1)/2^fi)) over all fi) * 2^n

lets take example i have array [2,2,3] , f1 = 2 ,f2 = 1

our answer is 8 * ( (2^2 — 1) / 4 + (1 / 2)

= 8 * (3/4 + 1/2)

= 10

10 is our answer.

»
17 months ago, # |
  Vote: I like it 0 Vote: I do not like it

What are the constraints?

»
17 months ago, # |
Rev. 7   Vote: I like it +3 Vote: I do not like it

My initial idea was for each element , counting the number of subsets that it does not exist and finding the answer from that. You iterate from left to right in $$$a$$$, let $$$ind_{i,j}$$$ equal to the index of the $$$j$$$th occurence of the element $$$i$$$:

$$$tans_i = \sum_{j=2}^{cnt_i} = (ind_{i,j} - ind_{i,j-1})(ind_{i,j} - ind_{i,j-1} - 1) / 2$$$

Then for each element $$$i$$$ , we know how many subarrays are there such that the subarray doesnt containt $$$i$$$ (which corresponds to $$$tans_i$$$), now what we do is :

$$$ans = \sum_{i=1}^{\max\limits_{i\epsilon [1,n]}{a_i}} = (n * (n+1) / 2) - tans_i$$$

and we print the $$$ans$$$