Curious if anyone knows how to solve the following problem faster than $$$O(n^2)$$$.
Given an array $$$a$$$ of length $$$n$$$ and an integer $$$n$$$, find any two numbers $$$l$$$ and $$$r$$$ ($$$l \le r)$$$ such that:
- Let $$$a'$$$ be the subarray $$$a_l, a_{l+1}, ... a_r$$$. Then for each $$$x$$$ that appears in $$$a'$$$, $$$x$$$ appears in $$$a'$$$ at least $$$k$$$ times (i.e. $$$k$$$ or more array elements are equal to $$$x$$$).
- The value $$$r-l$$$ is maximized.
Source of problem: I misread https://codeforces.net/contest/1676/problem/F
[deleted] sorry i misread the blog
Yeah bro, I solved this problem only and then looked at output format after writing entire code:
PS: By segment, I mean continuous elements.
Observation: If frequency of any element in the array < k, we surely cannot take this. So the array will be divided into segments. You can only take elements from one segment, and this segment is a repeating problem(not a subproblem).
Solution:
Since every element is added once and removed from the cur_freq array atmost once. So time complexity: O(N*logN).
If all array elements occur at least $$$k$$$ times, then the answer is $$$n-1$$$. Otherwise there is an element that occurs less than $$$k$$$ times and we will "split" the array into subarrays that don't contain that element and recursively solve those.
Store all indices of all array elements in a
map<int,vector<int>> loc;
that we can use binary search on to calculate the frequency of an element $$$x$$$ in an interval $$$[l,r]$$$. Let's write a function $$$count(x,l,r)$$$ that does this.We will write a function $$$solve(l,r)$$$ that calculates the maximum answer on the subarray $$$a[l,r]$$$. We will use a 2-pointer trick with divide and conquer to achieve a good time complexity. Move the $$$l$$$ pointer to the right and the $$$r$$$ pointer to the left until we find an element that occurs less that $$$k$$$ times, and recurse if necessary. The code will look something like this
if we recurse on index $$$i$$$, then the recurrence relation is
$$$T(l,r) = T(l,i-1) + T(i+1, r) + O(min(i-l,r-i)*log(n))$$$
The solution to this recurrence is $$$O(n*log^2(n))$$$, which is the overall time complexity.
Yes, this problem can be solved in $$$O(n \log n)$$$.
First, let's assume we have a function that tells us whether a given subarray $$$l \ldots r$$$ is suitable and if not, tells us the position of an offending element, i.e. one that appears in this subarray more than 0 but less than $$$k$$$ times. Later we will discuss how to implement this efficiently, i.e. in time $$$O(\log n)$$$.
To solve the problem:
The complexity is $$$O(n \log n)$$$: every element is crossed off at most once as the subarrays you recurse on are always distinct.
Now let's discuss how to check if a subarray is suitable. First, we calculate values $$$\mathrm{need}[i]$$$: suppose $$$a_i$$$ is the $$$j$$$-th occurrence of $$$a_i$$$ in the array. Then $$$a_{\mathrm{need[i]}}$$$ should be the $$$j-k+1$$$-th occurrence of $$$a_i$$$ in the array. If $$$j < k$$$, then set $$$\mathrm{need}[i] = -\infty$$$. In other words, $$$\mathrm{need}[i]$$$ is the rightmost possible left endpoint of the subarray if $$$a_i$$$ is included and it is the last of its value in the subarray. How to calculate $$$\mathrm{need}[i]$$$ is left as an exercise to the reader but it is straightforward.
Now we build a persistent segment tree that stores pairs, supports range minimums and point updates. Initialize it with everything set to $$$\langle \infty, \infty \rangle$$$.
The $$$i$$$-th revision of the segment tree is built as follows based on the $$$(i - 1)$$$-th revision:
To check if subarray $$$l \ldots r$$$ is suitable, take range minimum on the segment tree on the range $$$l \ldots r$$$ and revision $$$r$$$. If the first element is $$$\ge l$$$, the subarray is suitable. Otherwise, the second element tells you a faulty position.
Why is it that every time I see a blog like this, it has existed for 9 hours, unanswered, last comment in 3 hours. Then when I start writing my response suddenly there are already 2 comments explaining the solution.
Haha, I thought the same! At least the solutions are a little different
Yes you can solve this in $$$O (n log n)$$$.
Use Sweepline + Min with Range add segment tree. Iterate over the left border $$$l$$$, define $$$weights[i] = 1$$$ if position $$$i$$$ is the first occurrence of $$$a_i$$$ on $$$[l,n]$$$, $$$-1$$$ if position $$$i$$$ is the $$$k$$$-th occurrence of $$$a_i$$$ on $$$[l,n]$$$, and 0 in all other cases. The segment tree will store a prefix sum of this weight. So the rightmost value r such that $$$[l,r]$$$ being valid is precisely the the rightmost value 0 on this segment tree. This can be found with segment tree descent (note that prefix sum of weight is non-negative, so we can find 0s by a descent.). It can be seen that every time you move $$$l$$$ to $$$l+1$$$, you just need to do upto 4 range add/subtracts.