According to Wikipedia, an RMQ can be built with $$$O(n)$$$ memory and time precomp that can answer queries in $$$O(1)$$$. The idea is to divide the array into $$$log(n)/4$$$ blocks and then build an RMQ/sparse table normally on each of the blocks using $$$O(nBlocks*log(nBlocks))$$$ time and memory. In this case, $$$nBlocks$$$ is roughly $$$n/log(n)$$$ so the memory and time used to precomp everything is linear. By doing this, we can find the min of all completely covered blocks. That part sounds easy.
The hard part is dealing with the blocks that a query starts and ends in. Wikipedia says that I need to map each block to its corresponding Cartesian tree, and then I can precompute all the answers for every Cartesian tree. I see how, if I could do that, it would be easy to answer queries in $$$O(1)$$$. As I understand it, I could just store which index for a given Cartesian tree contains the relevant min for all pairs of queries within that subarray in $$$O(blockSize^2 * nCartesianTrees)$$$.
But how do I go about quickly (and hopefully cleanly) mapping a subarray to its corresponding Cartesian tree?
Side question: Are there even performance gains to be made here, or am I just wasting my time trying to get rid of this log factor?
Auto comment: topic has been updated by SecondThread (previous revision, new revision, compare).
Is it enumerated?
There's a better way to go about it, it's described in this Topcoder article.
The method they describe only applies to arrays where adjacent elements differ by +1 or -1, which is a very severe restriction.
They also show how to reduce the general problem to the restricted version. See the "From RMQ to LCA" section.
Oh, I see now. That’s pretty clean; thanks for the link!
You can detect the Cartesian tree with the algorithm presented here starting at slide 119, and you can represent a tree on $$$N$$$ nodes in $$$2N$$$ bits using a balanced parentheses representation.
Some thoughts on practicality from recent discussion with neal:
Here you can find how to calculate the part for in-block-queries. But there is another way to do this without the "cartesian" tree way. For each position $$$i$$$ inside a block calculate a binary mask with the bits at positions $$$b_0 > b_1 > \ldots > b_k$$$ set to one if $$$a[i] > a[b_0] > a[b_1] > \ldots > a[b_k]$$$ ($$$i > b_0$$$), here $$$b_j$$$ is maximum. Those masks can be represented in an integer if the size of a block is less than the size of a word in c++. Then to get the minimum position in the interval $$$[l, r]$$$ inside a block, you only have to find the smallest active bit $$$b$$$ in $$$mask[r]$$$ such that $$$b\ge l$$$. If there is none active bit satisfying this condition the answer will be $$$r$$$.