bihariforces's blog

By bihariforces, history, 14 months ago, In English

If you have not seen already, DQUERY is a problem on SPOJ which asks us to query the number of distinct elements present in a range.

There are ways to solve it online, which might also be possible with what I'll be discussing today, but for sake of explanation we'll solve it offline.

Offline way to solve DQUERY involves use of Fenwick Trees, which is already a well-known technique but we'll try to formulate the solution ourselves in a much more intuitive way.

Let us forget about queries, assume we've a hypothetical $$$N \times N$$$ grid $$$distinct$$$ where $$$distinct[i][j]$$$ is the number of distinct elements present in subarray $$$A[i,j]$$$.

Now we'll use the technique of individual contribution, ie. rather than taking a range and counting distinct, we take an element and find all ranges where this element will contribute an answer to, In order to do that we need a way to resolve duplicates, ie. if there are duplicates only first occurrence should be counted, this makes it very easy to avoid overcounting.

An element $$$A[i]$$$ contributes to all subarrays where it is the first occurrence of itself, ie. any subarray which starts after previous occurrence of $$$A[i]$$$ and contains index $$$i$$$.

If we try to increment contribution of $$$A[i]$$$ in our hypothetical $$$distinct$$$ grid, we see that each cell $$$distinct[x][y]$$$ where $$$prev[arr[i]] + 1 \le x \le i$$$ and $$$i \le y \le N - 1$$$ is getting incremented.

This increment is a $$$2D$$$ range update on rectangle, and the rectangle's right edge always lies on grid's right edge.

Assuming we somehow updated our grid for each $$$A[i]$$$, answer for every possible query would have been precomputed, but unfortunately that isn't feasible.

The problem statement is now, Given an $$$N \times N$$$ 2D grid and $$$N$$$ range 2D increments, find value of given $$$Q$$$ cells after performing all increments.

And this is easy to solve offline, first store all increments as an $$$add$$$ operation on it's beginning row and an $$$end$$$ operation on it's ending row, let us use a Fenwick Tree to maintain only current active row of the grid, then while moving down the grid row-wise, the start of a rectangle is merely a range increment on Fenwick tree and end is range decrement, and we only need to query for points, for each query in current row, we query the column in current active row, Range updates & point Queries, ideal use case for Fenwick Tree.

Thus we were able to perform 2D range updates and point queries in an $$$N \times N$$$ grid without actually storing entire grid, and looking at DQUERY problem this way makes it very intuitive to solve offline using Fenwick tree.

Here's the Accepted Solution implemented using this idea in Python. Code

  • Vote: I like it
  • +7
  • Vote: I do not like it

»
14 months ago, # |
  Vote: I like it +1 Vote: I do not like it

I think we can use MO's algorithm for offline or SQRT decomposition for Online , Im not sure btw ^~^

  • »
    »
    14 months ago, # ^ |
      Vote: I like it +1 Vote: I do not like it

    yeah, we can do that in $$$O(N \sqrt{N})$$$, this solution works in $$$O(N \log N)$$$ and goal was to make it intuitive how Fenwick Tree is helping us.