Any optimzations to this problem ? - Codeforces

→ Pay attention

Before contest
Codeforces Round 1006 (Div. 3)
3 days
Register now »

→ Top rated

#	User	Rating
1	tourist	3856
2	jiangly	3747
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3591
6	gamegame	3477
7	Benq	3468
8	Radewoosh	3462
9	ecnerwala	3451
10	heuristica	3431

Countries | Cities | Organizations

→ Top contributors

#	User	Contrib.
1	cry	167
2	-is-this-fft-	162
3	Dominater069	160
4	Um_nik	158
5	atcoder_official	157
6	Qingyu	155
7	djm03178	151
7	adamant	151
9	luogu_official	150
10	awoo	147

View all →

→ Find user

→ Recent actions

Detailed →

d4rkc0de's blog

Any optimzations to this problem ?

By d4rkc0de, history, 9 years ago, In English

In English

Hello CF's . I was trying to find some optimizations to the this problem

So we have n strings , we are trying to find all rare subset of letters of theses strings.

a rares subset is a subset of letters x of a string s, such that the number of appearances of x in all strings is strictly lower than a number min_appear.

example :

let's say we have 3 strings and min_appear = 2

abcd abc cde

than the answer is e,ad,bd,ce,de,abc,abd,bcd,abcd,cde (because the number of their appearances = 1 < 2)

these are all the possible subset of letters and their number of appearances { a=2 ,b=2 ,c=3 ,d=2 ,e=1 ,ab=2 ,ac=2 ,ad=1 ,bc=2 ,bd=1 ,cd=2 ,ce=1 ,de=1 ,abc=2 ,abd=1,bcd=1,abcd=1,cde=1}

The best solution i come with have a n*2^(length of taller string) , i'm wondering if there is a solution with a better complexity.

UDP : Sorry i forgot about subsets of length 3 and 4 . also each two different letters from each string are distinct.

Tags

optimization

0

d4rkc0de
9 years ago
4

Comments

Comments (4)

Write comment?

»

9 years ago, # |

Vote: I like it

0

Vote: I do not like it

We need some more details about this task. Either you're not asked to report all subsets (which I understood as subsequence?) but count them or it must have some pretty low constraints, such as in string length, number of different characters, or min_appear.

Suppose the case where there's only one string s = "abc...xyz". There are $\text{[math]}$ distinct subsequences of length k. Let's also say min_appear = |s| + 1.

Then reporting all this subsequences would take $\text{[math]}$ characters which for |s| = 26 is about 872 million. A bit too much, don't you think?

→ Reply

»

»

9 years ago, # ^ |

← Rev. 2 →

Vote: I like it

+5

Vote: I do not like it

I think that by subsets, he means a subset of letters. In this case we have 2²⁶ subsets, and every string can be represented simply with a mask indicating whether it contains each letter or not.

A simple solution involving bitmasks would have complexity O(N * 2²⁶), but we would need to know more constraints to see if that's enough (I doubt it).

Anyway, I still don't understand why he didn't mention other possible subsets for that example test case, like "abc", "cde" or "abcd".

→ Reply

»

»

»

9 years ago, # ^ |

← Rev. 4 →

Vote: I like it

0

Vote: I do not like it

I added the missing subsets , thanks for pointing it .

By a subset i mean a subset of letters .

I know about the O(N * 2^26) solution . i'm wondering if there is a better complexity ?

→ Reply

»

9 years ago, # |

← Rev. 4 →

Vote: I like it

+3

Vote: I do not like it

It's obvious that if x is rare than all substrings that contain x are rare ( in your example e is rare so ce,de,cde are also rare ) , this is can be helpful not to generate all possible substrings in such case , so one we have found the first rare we can generate all it's ascendants, right ??

→ Reply