atharvd's blog

By atharvd, history, 17 months ago, In English

I was introduced to this problem by someone on another platform; I have been thinking about the problem for a while, but I don't know what to do.

You are given string $$$T$$$ and a function $$$f(s)$$$. $$$f(s)$$$ takes in a string and outputs $$$len(s) * occur(s)$$$ where $$$len(s)$$$ is the length of the string and $$$occur(s)$$$ is the amount of times string $$$s$$$ occurs in $$$T$$$.

Find the maximum value of $$$f(s)$$$ over all substrings of $$$T$$$. The maximum length of $$$T$$$ is $$$10^5$$$

I am considering using the Z-algorithm to solve this; can someone tell me if this is the correct approach?

  • Vote: I like it
  • +14
  • Vote: I do not like it

»
17 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by atharvd (previous revision, new revision, compare).

»
17 months ago, # |
  Vote: I like it +1 Vote: I do not like it

What is the constraint on the length of the string?

  • »
    »
    17 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Sorry, I forgot to mention that, the length of T is 10^5 at most.

»
17 months ago, # |
Rev. 7   Vote: I like it +23 Vote: I do not like it

I have an idea, but you need to understand suffix array first

Your problem can be rephrased as : find the maximum value of $$$i*g(i)$$$ where $$$g(i)$$$ is the maximum number of occurrences of a string of size $$$i$$$ in $$$T$$$

When you construct the suffix array on $$$T$$$, suppose that the optimal string is $$$s$$$ (i.e $$$f(x)$$$ is maximum at $$$x=s$$$), then $$$s$$$ is going to be a prefix of each suffix in some subarray of that suffix array

Now imagine we have a subarray of suffixes, obviously the longest common prefix (LCP) over all these suffixes is the longest possible valid $$$s$$$ for this subarray, so the problem is reduced to : find the maximum value of $$$(r+1-l)\cdot lcp(l,r)$$$, here $$$lcp(l,r)$$$ is equal to the minimum value over the lcp array from $$$l$$$ to $$$r-1$$$ inclusive when $$$l<r$$$, and obviously $$$lcp(l,l)$$$ is equal to the size of the suffix in position $$$l$$$ in the suffix array

This problem is actually very similar to this cses problem , which is solvable by a Sparse table and D&C (or recursion), if you need help solving the cses problem feel free to ask

Final time complexity will be $$$O(n\log n)$$$, where $$$n$$$ is the length of $$$T$$$

  • »
    »
    17 months ago, # ^ |
      Vote: I like it +6 Vote: I do not like it

    Thank you! I am right now reading about the suffix array so I can understand your post.

»
17 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by atharvd (previous revision, new revision, compare).