I solved this problem using Mo's algorithm.
Is there any online solution for each query for this problem?
I only know the solution in O((N+Q)·sqrt(N)).
Tried to find a better solution for quite a long time, but no results yet :(
Can you describe it?
Note: I consider all values in the array to be ≤ N. If that is not true, you can use a hashmap instead of an array, or with some precalculation (coordinate compression) you can make it so.
Split your array into blocks of size K. Precalculate the answers for all intervals between all beginnings of these blocks, along with an array cnt[x] which tells you how many occurrences of x are in that interval. You can do that simply in linear time for every interval.
We have spent quite a bit of time and memory by this point; what can we do now?
Let's consider a query [L;R] such that R - L ≥ 2·K (otherwise we can do a linear scan to calculate the number of occurrences of every value on the interval). Now we know for sure that one of the precalculated intervals is completely inside our query; to be exact, it covers exactly the blocks that lie fully inside [L;R]. (Further I'll call these borders [A;B].)
Notice that A - L ≤ K, and the same holds for R - B ≤ K. We can simply use the precalculated array for [A;B] and recalculate the value for [L;R] with a linear scan of the remaining boundary elements, which works in O(K) for each query.
Note: you cannot copy this cnt array, because that can take up to O(N) time; you need to use its values without copying it and backtrack your changes after you're done.
You can solve this task using this idea even when you have update queries (say, set Arr[i] = X); you can try it out for yourself, it isn't that hard.
To sum up: this approach answers each query in O(K) on top of the precalculation, and by choosing K around sqrt(N) you get the time complexity I was talking about.
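Here is a minimal sketch of that block idea, assuming the values are already compressed into [0, N) and taking K = sqrt(N). It is not the exact layout described above: instead of a cnt array per interval, I keep per-block prefix counts plus one scratch counter that gets backtracked, which gives the same O(K) per query.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Sketch of the block idea above. Values are assumed to be in [0, n).
struct RangeMode {
    int n, K, nb;                        // nb = number of blocks
    vector<int> a;
    vector<vector<int>> blockAns;        // blockAns[i][j] = max frequency on blocks i..j
    vector<vector<int>> blockCnt;        // blockCnt[i][x] = occurrences of x in blocks 0..i
    vector<int> cnt;                     // scratch counter, backtracked after every query

    RangeMode(vector<int> arr) : a(move(arr)) {
        n = a.size();
        K = max(1, (int)sqrt((double)n));
        nb = (n + K - 1) / K;
        blockAns.assign(nb, vector<int>(nb, 0));
        blockCnt.assign(nb, vector<int>(n, 0));
        cnt.assign(n, 0);
        for (int i = 0; i < nb; i++) {                       // O(n) per starting block
            vector<int> c(n, 0);
            int best = 0;
            for (int j = i; j < nb; j++) {
                for (int p = j * K; p < min(n, (j + 1) * K); p++)
                    best = max(best, ++c[a[p]]);
                blockAns[i][j] = best;
            }
        }
        vector<int> c(n, 0);                                 // prefix counts over whole blocks
        for (int i = 0; i < nb; i++) {
            for (int p = i * K; p < min(n, (i + 1) * K); p++) c[a[p]]++;
            blockCnt[i] = c;
        }
    }

    int cntBlocks(int bl, int br, int x) {                   // occurrences of x in blocks bl..br
        return blockCnt[br][x] - (bl ? blockCnt[bl - 1][x] : 0);
    }

    int query(int l, int r) {                                // max frequency on [l, r], 0-indexed
        int bl = (l + K - 1) / K, br = (r + 1) / K - 1;      // blocks fully inside [l, r]
        if (bl > br) {                                       // short range: brute force in O(K)
            int best = 0;
            for (int p = l; p <= r; p++) best = max(best, ++cnt[a[p]]);
            for (int p = l; p <= r; p++) cnt[a[p]]--;        // backtrack
            return best;
        }
        int best = blockAns[bl][br];
        vector<int> touched;                                 // at most 2K boundary positions
        for (int p = l; p < bl * K; p++) touched.push_back(p);
        for (int p = (br + 1) * K; p <= r; p++) touched.push_back(p);
        for (int p : touched) {
            int x = a[p];
            if (cnt[x] == 0) cnt[x] = cntBlocks(bl, br, x);  // seed with the block count
            best = max(best, ++cnt[x]);
        }
        for (int p : touched) cnt[a[p]] = 0;                 // backtrack in O(K)
        return best;
    }
};
```

Usage would simply be constructing `RangeMode rm(a)` once and calling `rm.query(l, r)` with 0-indexed inclusive bounds.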
You can improve the memory usage if you use some sort of persistent structure instead of linear memory for every interval, but then:
Most likely you will get an additional log factor in the query processing and the precalculation as well.
This is kinda hard for some cases
The same approach with a different k can be used.
Let's say k = N^z and Q = N^y (y ≤ 1). Total complexity = O(N^(z*z - 2*z + 2) + N^(z + y)). We would want the two powers to be the same; equating, we get z*z - 3*z + (2 - y) = 0,
so z = (3 - sqrt(4*y + 1)) / 2.
For y = 1 (Q = N), z ≈ 0.38, and the total complexity is O(N^1.38).
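For completeness, the quadratic step written out (same notation, k = N^z and Q = N^y):

```latex
N^{z^2 - 2z + 2} = N^{z + y}
\;\Longrightarrow\; z^2 - 3z + (2 - y) = 0
\;\Longrightarrow\; z = \frac{3 - \sqrt{9 - 4(2 - y)}}{2} = \frac{3 - \sqrt{4y + 1}}{2}.
% For y = 1: z = (3 - \sqrt{5})/2 \approx 0.382, so the total is about N^{1.38}.
```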
Is there any offline solution without Mo's?
galen_colin proposed a solution for the problem https://codeforces.net/contest/1514/problem/D in his recent stream on YouTube; it's worth watching.
I thought the problem was interesting, so here is what I came up with; it should be O((N+Q)·sqrt(N)).
We have an array of size N, and Q queries.
Let's start off by choosing some constant K. We will do some heavy precomputing that we'll split in two parts:
First precomputation
Define the following function:
f(i, j) = the minimum index r such that the most frequent value in the segment [j, r] occurs exactly i times.
We want to compute this function for all 1 ≤ i ≤ K and 1 ≤ j ≤ N. This can be done in O(N·K), since we can do something similar to a two-pointer walk for a fixed i. I'll omit details, but feel free to ask.
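If it helps, here is one way that two-pointer precomputation could be written; the name f, the 0-indexed layout and the assumption that values fit in [0, n) are my own choices, since the exact formulas did not survive in the comment:

```cpp
#include <bits/stdc++.h>
using namespace std;

// f[i][j] = minimum index r such that the most frequent value of a[j..r] occurs
//           exactly i times, or n (sentinel) if no such r exists.
// Built for all 1 <= i <= K in O(n * K): for a fixed i the answer is
// non-decreasing in j, so a two-pointer walk over j and r is enough.
vector<vector<int>> buildF(const vector<int>& a, int K) {
    int n = a.size();
    vector<vector<int>> f(K + 1, vector<int>(n, n));
    for (int i = 1; i <= K; i++) {
        vector<int> cnt(n, 0);      // counts inside the current window a[j..r]
        int r = -1, have = 0;       // have = number of values whose count equals i
        for (int j = 0; j < n; j++) {
            while (have == 0 && r + 1 < n)
                if (++cnt[a[++r]] == i) have++;
            f[i][j] = (have > 0 ? r : n);
            if (--cnt[a[j]] == i - 1) have--;   // drop a[j] before j moves forward
        }
    }
    return f;
}
```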
Second precomputation
The first part of our precomputation will help us answer queries whose answer is quite small. So we'll now have to do something about queries with a large answer. Suppose that we create a set that contains all values which occur more than K times in the whole array, and denote it by S. Obviously, this set will have size at most N/K, that is |S| ≤ N/K. Now let's define the function:
Next[x][j] = the minimum index i such that i ≥ j and A[i] = x.
We want to compute this function for all x in S and all 1 ≤ j ≤ N. This can again be done in O(N·|S|) = O(N²/K) by using a DP-like approach and moving from the end to the front of the array for every fixed x.
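A possible sketch of that second precomputation under the same assumptions; S is the list of values occurring more than K times, and the name Next is borrowed from the later comment that mentions the "Next matrix":

```cpp
#include <bits/stdc++.h>
using namespace std;

// Next[t][j] = smallest index i >= j with a[i] == S[t], or n (sentinel) if none.
// Built right to left in O(n * |S|).
vector<vector<int>> buildNext(const vector<int>& a, const vector<int>& S) {
    int n = a.size();
    vector<int> idOf(n, -1);                        // assumes values fit in [0, n)
    for (int t = 0; t < (int)S.size(); t++) idOf[S[t]] = t;
    vector<vector<int>> Next(S.size(), vector<int>(n + 1, n));
    for (int j = n - 1; j >= 0; j--) {
        for (int t = 0; t < (int)S.size(); t++) Next[t][j] = Next[t][j + 1];
        if (idOf[a[j]] != -1) Next[idOf[a[j]]][j] = j;
    }
    return Next;
}
```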
Answering the queries
Now let's start answering queries. Suppose we get a query from L to R, and suppose that we want to check whether there is some value that occurs at least i times. Well, for i ≤ K we can simply check whether f(i, L) ≤ R. If it is, then there is a value that occurs at least i times, and otherwise there isn't one.
In that case, we can straight away check whether f(K, L) ≤ R, and if that's false, then we know that the answer is less than K and we can just iterate over all i < K and find the largest value that works. That would take O(K).
However, if we have f(K, L) ≤ R, then the answer is at least K, but may be larger. Well, in that case we will check each of the numbers in S, as if the answer is larger than K, then surely one of them is the most frequent number.
Using the function Next we can easily find the number of occurrences of any x from S in our segment in O(1) (see footnote 1 below). In that case we can find our answer in O(|S|) = O(N/K) by checking every element of S.
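Putting the pieces together, answering one query could then look roughly like this (f, Next and ID as sketched in the other blocks, 0-indexed; this is my reconstruction, not the author's code, and it already uses the binary-search optimisation mentioned later in the comment):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Occurrences of the t-th frequent value S[t] inside [l, r], via the footnote's ID trick:
// ID[p] = how many indices q >= p have a[q] == a[p], with ID[n] = 0 as a sentinel.
int countFrequent(const vector<vector<int>>& Next, const vector<int>& ID, int t, int l, int r) {
    return ID[Next[t][l]] - ID[Next[t][r + 1]];
}

// Maximum frequency on [l, r] (0-indexed, inclusive).
int answerQuery(int l, int r, int K,
                const vector<vector<int>>& f,      // from buildF
                const vector<vector<int>>& Next,   // from buildNext
                const vector<int>& ID,
                const vector<int>& S) {
    if (f[K][l] > r) {
        // Answer is below K: binary search for the largest i with f[i][l] <= r
        // (f[i][l] is non-decreasing in i).
        int lo = 1, hi = K - 1, best = 0;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            if (f[mid][l] <= r) { best = mid; lo = mid + 1; }
            else hi = mid - 1;
        }
        return best;
    }
    // Answer is at least K; only values that are frequent globally can beat K.
    int best = K;
    for (int t = 0; t < (int)S.size(); t++)
        best = max(best, countFrequent(Next, ID, t, l, r));
    return best;
}
```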
Resulting solution and theoretical complexity
We have a total precompute complexity of O(N·K + N²/K), and each query is answered in either O(K) or O(N/K). The total complexity in the worst case is O(N·K + N²/K + Q·(K + N/K)). It is plain to see that if we set K = sqrt(N) we get a worst-case complexity of O((N+Q)·sqrt(N)).
My experience
My coding and explaining are a bit rusty, so writing the code and this comment took me an hour each. I ended up getting AC on the problem, but with a lot of TLE verdicts prior to that. I had to optimise the code a bit to get it accepted. An example optimisation is to solve the first case of queries in O(log K) by binary search instead of linear search. I also had to pick the constant K in the program depending on the test case in order to make it run quicker, as in practice sqrt(N) may not always be the best.
Obviously, even if the solution is asymptotically as good as the offline Mo's algorithm solution, the constant factor is much higher, hence it's a few times slower, and the SPOJ problem's time limit is quite tight. Another downside of the solution is that it takes a lot of memory, but luckily the SPOJ problem had a very large memory limit.
Feel free to ask any questions and sorry if I've omitted too many details. Notify me about any mistakes too, as this comment took way too much time and I don't have time to proofread.
1 Note: To be able to quickly find the number of occurrences of a value x from S in some segment L to R, we'll have to precompute another array:
ID[i] = the amount of indices j such that j ≥ i and A[j] = A[i].
For example, the sample array in the SPOJ problem, {1, 2, 1, 3, 3}, would yield an ID array of {2, 1, 1, 2, 1}. In a sense, we're just numbering the occurrences of each value backwards. Now, let's set ID[N+1] = 0 and also set Next[x][j] = N+1 if there is no valid index according to the definition of the function.
Now, magic! If we want to find the amount of occurrences of x in the segment L to R, we simply take ID[Next[x][L]] - ID[Next[x][R+1]] and that is our answer.
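A tiny self-contained illustration of that trick on the sample array (0-indexed here, so the sentinel is n rather than N+1; the Next values are written out by hand in the comments):

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    vector<int> a = {1, 2, 1, 3, 3};       // the sample array from the problem
    int n = a.size();
    // ID[p] = number of indices q >= p with a[q] == a[p]; ID[n] = 0 is the sentinel.
    vector<int> ID(n + 1, 0), seen(4, 0);  // values here are at most 3
    for (int p = n - 1; p >= 0; p--) ID[p] = ++seen[a[p]];
    // ID is now {2, 1, 1, 2, 1}, matching the comment above.

    // Occurrences of the value 3 in the 0-indexed segment [1, 3] = {2, 1, 3}:
    // Next[3][1] = 3 (first 3 at or after index 1) and Next[3][4] = 4,
    // so the formula gives ID[3] - ID[4] = 2 - 1 = 1.
    cout << ID[3] - ID[4] << "\n";         // prints 1
    return 0;
}
```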
Well explained. :)
If there is an update of some value, then how do you solve it?
And with linear memory, please.
Seriously, isn't this problem already hard enough? Encho's solution is quite complicated (and btw. I tried to solve it yesterday, spent 40-50 minutes and didn't succeed) and you just casually ask "ok, what if we also change values".
I also tried to implement that idea, but failed :(
Wow, that's really cool :) Liked the magic section a lot.
Thanks a lot, nice idea.
Great stuff! Although note that there is no need for the Next matrix, as some prefix sums would do the trick just fine! I wonder if there is some preprocessing that would let us find out information about the "frequent" elements faster than O(number of elements). I doubt it, but it would surely be interesting to check out!
EDIT: By keeping the prefix matrix as n vectors of size sqrt(n), the solution will be very cache friendly for the big values.
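If I read the suggestion right, the prefix-count replacement for the Next matrix might look like this sketch (my own names; pref is stored row-per-position so a query only touches two short rows):

```cpp
#include <bits/stdc++.h>
using namespace std;

// pref[p][t] = occurrences of the t-th frequent value S[t] among a[0..p-1].
// Stored as n+1 short rows of size |S| (about sqrt(n)), so one query reads
// just two contiguous rows, which is the cache-friendliness being described.
vector<vector<int>> buildPref(const vector<int>& a, const vector<int>& S) {
    int n = a.size(), m = S.size();
    vector<int> idOf(n, -1);                       // assumes values fit in [0, n)
    for (int t = 0; t < m; t++) idOf[S[t]] = t;
    vector<vector<int>> pref(n + 1, vector<int>(m, 0));
    for (int p = 0; p < n; p++) {
        pref[p + 1] = pref[p];
        if (idOf[a[p]] != -1) pref[p + 1][idOf[a[p]]]++;
    }
    return pref;
}

// Occurrences of S[t] on [l, r] are then simply pref[r + 1][t] - pref[l][t].
```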
There is an entry on Wikipedia: Range Mode Query.
That page mentions an O(n)-space method with O(sqrt(n)) time per query.
Theorem 1. Let A and B be any multisets. Then a mode of A ∪ B is either a mode of A or an element of B.
Proof: trivial.
Now assume we have an array A of size n. Split it into blocks, each of size sqrt(n). Precompute the mode and its frequency for every range of consecutive blocks. This takes O(n) space and O(n·sqrt(n)) time.
For each query, we have a prefix, a span of whole blocks, and a suffix. By Theorem 1, the mode must be the mode of the span, an element of the prefix, or an element of the suffix. For each element in the prefix or the suffix, check if it is more frequent than the current mode. With additional preprocessing and analysis, O(sqrt(n)) per query can be achieved.
You can refer to the original page for more details if you can't figure it out yourself.
What if c is a mode of both A and B?
Well, nothing. What do you think is the problem (issue) then?
Would you please share your source? I've tried to solve it with Mo's, O(n + q*sqrt(n)*log(n)), and got TL.
Then I tried sqrt decomposition, O((n+q)*sqrt(n)), with every little optimization I could, and I'm still getting TL. Thanks in advance.
Can't you get rid of the log n factor?
I would need some structure which supports insertion and removal of elements plus a maximum query, all in O(1)... And this is impossible.
Probably there is a completely different way.
UPD: probably it is possible, but only in O(1) on average.
UPD2: Since we need just the frequency and not the element, it is possible to support it in O(1) worst case. Thanks for forcing me to think about it again. AC with Mo's.
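I believe the trick being referred to is the standard one of also counting how many distinct values have each frequency; then both add and remove are O(1) worst case. A sketch (not the commenter's actual code):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Mo's algorithm helper: maintain the maximum frequency in the current window
// with O(1) worst-case add/remove. cnt[x] = occurrences of x in the window,
// cntOfCnt[c] = how many distinct values currently occur exactly c times.
struct Window {
    vector<int> cnt, cntOfCnt;
    int maxFreq = 0;
    Window(int maxVal, int n) : cnt(maxVal + 1, 0), cntOfCnt(n + 1, 0) {}

    void add(int x) {
        cntOfCnt[cnt[x]]--;
        cntOfCnt[++cnt[x]]++;
        maxFreq = max(maxFreq, cnt[x]);
    }
    void remove(int x) {
        // If x was the only value at the current maximum, the maximum drops by one,
        // because frequencies change by exactly 1 per operation.
        if (cnt[x] == maxFreq && cntOfCnt[maxFreq] == 1) maxFreq--;
        cntOfCnt[cnt[x]]--;
        cntOfCnt[--cnt[x]]++;
    }
};
```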
Edit: My bad, it was a different problem.
The LightOJ problem is different. It only deals with numbers with consecutive frequency, so that simplifies the solution.