peltorator's blog

By peltorator, 4 years ago, translation, In English

When I was in high school I once learned about wavelet tree and I was really impressed. But over time when I learned more tricks I started thinking: is there any essential need in it? Because it seems like merge-sort tree with fractional cascading solves all the same problems and its time complexity is $$$O(\log n)$$$ which is better than $$$O(\log C)$$$.

So, am I right or not? Does anyone know any cases where it's helpful to use wavelet tree?

  • Vote: I like it
  • +82
  • Vote: I do not like it

»
4 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

Personally, I hadn't seen tasks that could be solved only by Wavelet Tree. Some of them could be solved by persistent segment tree, others could be solved by merge-sort tree, others could also be solved by just segment tree. Maybe, in some cases it's usefull when you forgot persistent segment tree / merge-sort tree.

About merge-sort tree: can someone prove me that cascading trick is faster than merge-sort tree without it?

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Without cascading you split your segment into $$$O(\log n)$$$ segments and you run a binary search on every one of them. So it's $$$O(\log^2 n)$$$, but with cascading you just make one binary search and then go down the tree as usual so it's $$$O(\log n)$$$.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Thanks a lot. I thought that in practice merge-sort cascading tree works as well as merge-sort tree without it. Personally, I think that tasks that could be solved by wavelet tree could also be solved by other structures.

»
4 years ago, # |
  Vote: I like it +29 Vote: I do not like it

One of my teachers said, that cascading is useless because in practice it works as fast as $$$O(n\log^2(n))$$$. I was revising wavelet tree not long time ago, and it seems like it's the fastest way to solve nondynamic tasks.

»
4 years ago, # |
  Vote: I like it +11 Vote: I do not like it

With wavelet trees, you can answer queries of the form "find the $$$k$$$-th smallest element from $$$a_l, a_{l+1}, ..., a_r$$$" in $$$O(\log C)$$$ time (or $$$O(\log n)$$$ time if you compress the values beforehand). With merge sort trees, you need to binary search for the answer, so you only get $$$O((\log n)^3)$$$ running time (or $$$O((\log n)^2)$$$ with fractional cascading).

The aforementioned queries can be answered with persistent segment trees in $$$O(\log C)$$$. In fact, in contest where you have to write everything from scratch, persistent segment trees would be my go-to data structure for these types of problems. The main disadvantage of persistent segment trees however is that they use a ton of memory. With wavelet trees, you can use bitmasks to reduce the memory usage by a factor of $$$32$$$. The improved cache locality makes this quite fast in practice. My implementation of this can be found here.

If we allow for $$$O(\log n \cdot \log C)$$$ time, wavelet trees can be made to support "activate / deactivate element" updates by using a Fenwick tree instead of prefix sums. With these, many other types of updates, such as updating individual elements, can be implemented in an offline setting. As the second log factor stems from a Fenwick tree (and not some pointer-based balanced tree), the constant factor in the running time is quite small. My implementation.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +10 Vote: I do not like it

    Instead of using a persistent segment tree, you can also use parallel binary search and compress the values to solve kth smallest number in range in $$$O((n+q)$$$ $$$log$$$ $$$n)$$$ time and $$$O(n+q)$$$ memory. Code: https://pastebin.com/05J9chn0

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      Oh yeah, of course! But I've never thought about it! Thanks for noticing.

  • »
    »
    4 years ago, # ^ |
    Rev. 3   Vote: I like it 0 Vote: I do not like it

    Actually, you can solve this "find the $$$k$$$-th smallest element in range" type query in $$$\mathcal{O}(\log n)$$$ time using merge-sort trees. I won't get into details, but you need to think of merge-sort trees as a data structure to be constructed on 2D points (with arrays, the points are (idx, arr[idx])). Now, if you instead build the merge-sort tree on the points (arr[idx], idx), it is possible to answer this query. Code (comments in portuguese): link.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +18 Vote: I do not like it

      In some sense, a wavelet tree is just a merge sort tree built on (arr[idx], idx) instead of (idx, arr[idx]) that additionally uses the fact that idx goes from $$$0$$$ to $$$n-1$$$ to simplify the implementation, so you're basically using a wavelet tree at that point.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Thanks for your implementation!

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    A bit off topic, but your library is really cool. Thanks for sharing :)