sslotin's blog

By sslotin, history, 3 years ago, In English
  • Vote: I like it
  • +159
  • Vote: I do not like it

| Write comment?
»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

I see you're using SIMD, how does it perform without SIMD?

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    I just tried to replace rank32 with a scalar binary search within a node (although my binary search is a bit unusual): it is worse but still ~5x faster than std::set (and still beats Abseil by ~1.5x).

»
3 years ago, # |
  Vote: I like it +59 Vote: I do not like it

As far as I understand, your structure in fact works with ints only, so it would be more fair to compare it with structures, designed specifically for ints rather than with std::set<int>. So, how does it perform in comparison to vEB tree or X/Y-fast trie or maybe something else?

  • »
    »
    3 years ago, # ^ |
    Rev. 3   Vote: I like it 0 Vote: I do not like it

    I don't know for sure since I have not compared against them, but I think these particular structures won't be faster even for large arrays. They both perform roughly $$$\log \log 2^{32} = 5$$$ iterations — same as B-trees for all reasonable dataset sizes, — but they are probably slower, at least in their basic implementations:

    • The vEB tree uses recursion and branching, which is expensive.
    • The x-fast trie uses hash tables, which is even more expensive.

    On top of that, they are not storing the data that efficiently, which is bad for the cache performance.

    That said, I think that some universe-reducing approaches may be faster when properly optimized — at least for large enough dataset sizes. Implementing an associative array this way is one of the things I'm going to try next.

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Well, that's interesting.

      Yesterday, after I posted a comment, I decided to check out fastest solutions for this problem and it seems like all the best solutions implement some kind of the vEB tree.

      I am not sure whether it is possible to compile your data structure on their server or not, but all solutions there are public anyway, so it should be still easy to compare your solution with these implementations at least on random tests.

      • »
        »
        »
        »
        3 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I can't submit it to that problem because I haven't implemented deletions yet, but I took nor's fastest submission there and plugged it into my benchmark — and I was wrong: it outperforms mine when $$$N$$$ is larger than $$$10^6$$$ or so, but the B-tree is ~3x faster for $$$N = 10^5$$$ and anything under that.

        (That's for the searches: insertions lose much sooner, but they aren't really optimized in the B-tree to begin with.)