Hardstone's hard problem

Revision en5, by dfsof, 2023-08-17 06:10:29

Prof. hardstone gives you an integer array $$$A$$$. The length of $$$A$$$ is $$$n$$$ and there are $$$m$$$ distinct numbers in $$$A$$$. Count the number of tuple $$$(l, r)$$$, $$$1 \leq l \leq r \leq n$$$, such that:

Numbers that appear in the interval $$$a[l...r]$$$ appear the same number of times.

For example, $$$A=[1,2,1,2]$$$, then there are $$$8$$$ legal tuples: $$$(1, 1), (2, 2), (3, 3), (4, 4), (1, 2), (2, 3), (3, 4), (1, 4)$$$.

This is an open problem with brain storm. $$$O(n^2m)$$$ brute force using the prefix sum and $$$O(n2^m)$$$ brute force using bitmasks and hashtable are easy to come up with. I am looking for a $$$O(nmlog^k)$$$ solution. Are there any smart data structures?

Note that when $$$A = [1,2,3]$$$, all intervals are legal. For example, $$$[1, 2]$$$ is legal, as both $$$1$$$ and $$$2$$$ appear once. We do not care about $$$3$$$ because $$$3$$$ does not appear.

amenotiomoi proposes a genius randomized idea, which could make my yesterday's idea work: Similar to the Zobrist hashing, we assign a random value to each distinct integer. We record the prefix sum of the hash values in a hashtable (let $$$h_r$$$ be the hash value of $$$a[1...r]$$$). Then, we fix $$$l$$$ and count the number of $$$r$$$ with respect to this $$$l$$$. For each $$$l$$$, we denote $$$p(l, j)$$$ be the first place $$$j$$$ appears after $$$l$$$ (inclusive), somehow like std::string.find(j, l). If $$$j$$$ never appears after $$$l$$$, we let $$$p(l, j) = \infty$$$. For example, if $$$A=[4,1,2,3]$$$, then $$$p(2, 1)=2, p(2,2)=3, p(2,3)=4, p(2,4) = \infty$$$. Note that $$$p(l, j) \neq p(l, k)$$$ if $$$j \neq k$$$. Then, we sort the pair $$${p(l, j), j}$$$ in the ascending order of $$$p(l, j)$$$, and let $$$q$$$ be the sorted list. Therefore, between two adjacent elements of $$$q$$$, the present and absent numbers could be determined. For example, $$$A=[1,2,2,2,3]$$$, $$$r=1$$$, $$$2 \leq r \leq 4$$$, then $$$1, 2$$$ appear and $$$3$$$ is absent. Therefore we need to find the number of $$$r$$$, $$$2 \leq r \leq 4$$$, such that $$$1$$$ and $$$2$$$ appear the same number of times with in $$$a[l...r]$$$. Yesterday I was stuck here. But with the genius hashtable, we only need to count $$$r$$$ that $$$(hashvalue(1) + hashvalue(2)) \mid h_r - h_{l-1}$$$

Tags data structures, string, array

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en18 English dfsof 2023-08-17 10:52:07 7 Tiny change: 'fter $l$, we let $p(l, j) ' -> 'fter $l$, $p(l, j) '
en17 English dfsof 2023-08-17 06:48:21 10
en16 English dfsof 2023-08-17 06:44:25 2 Tiny change: '2,2,3]$, $r=1$, $2 \l' -> '2,2,3]$, $l=1$, $2 \l'
en15 English dfsof 2023-08-17 06:26:34 5 Tiny change: '= O(n(mlogm + mlogn +nlogmlogn' -> '= O(n(mlogn + mlogm+nlogmlogn'
en14 English dfsof 2023-08-17 06:23:56 0 (published)
en13 English dfsof 2023-08-17 06:23:51 5 Tiny change: 'could be find via bina' -> 'could be fould via bina' (saved to drafts)
en12 English dfsof 2023-08-17 06:22:37 0 (published)
en11 English dfsof 2023-08-17 06:22:30 36 (saved to drafts)
en10 English dfsof 2023-08-17 06:19:31 15 Tiny change: 'r$ be the hash value of $a[1..' -> 'r$ be the prefix sum of hash values of $a[1..'
en9 English dfsof 2023-08-17 06:18:58 0 (published)
en8 English dfsof 2023-08-17 06:18:53 1 Tiny change: ' $O(mlogm). For two ' -> ' $O(mlogm)$. For two ' (saved to drafts)
en7 English dfsof 2023-08-17 06:17:52 0 (published)
en6 English dfsof 2023-08-17 06:17:45 691
en5 English dfsof 2023-08-17 06:10:29 739
en4 English dfsof 2023-08-17 06:02:41 86
en3 English dfsof 2023-08-17 06:02:04 542 (saved to drafts)
en2 English dfsof 2023-08-17 05:34:18 185
en1 English dfsof 2023-08-17 05:13:45 695 Initial revision (published)