https://en.algorithmica.org/hpc/data-structures/b-tree/
As promised, I wrote a new tree data structure.
# | User | Rating |
---|---|---|
1 | tourist | 3993 |
2 | jiangly | 3743 |
3 | orzdevinwang | 3707 |
4 | Radewoosh | 3627 |
5 | jqdai0815 | 3620 |
6 | Benq | 3564 |
7 | Kevin114514 | 3443 |
8 | ksun48 | 3434 |
9 | Rewinding | 3397 |
10 | Um_nik | 3396 |
# | User | Contrib. |
---|---|---|
1 | cry | 167 |
2 | Um_nik | 163 |
3 | maomao90 | 162 |
3 | atcoder_official | 162 |
5 | adamant | 159 |
6 | -is-this-fft- | 158 |
7 | awoo | 156 |
8 | TheScrasse | 154 |
9 | Dominater069 | 153 |
9 | nor | 153 |
https://en.algorithmica.org/hpc/data-structures/b-tree/
As promised, I wrote a new tree data structure.
Name |
---|
I see you're using SIMD, how does it perform without SIMD?
I just tried to replace
rank32
with a scalar binary search within a node (although my binary search is a bit unusual): it is worse but still ~5x faster thanstd::set
(and still beats Abseil by ~1.5x).As far as I understand, your structure in fact works with ints only, so it would be more fair to compare it with structures, designed specifically for ints rather than with
std::set<int>
. So, how does it perform in comparison to vEB tree or X/Y-fast trie or maybe something else?I don't know for sure since I have not compared against them, but I think these particular structures won't be faster even for large arrays. They both perform roughly $$$\log \log 2^{32} = 5$$$ iterations — same as B-trees for all reasonable dataset sizes, — but they are probably slower, at least in their basic implementations:
On top of that, they are not storing the data that efficiently, which is bad for the cache performance.
That said, I think that some universe-reducing approaches may be faster when properly optimized — at least for large enough dataset sizes. Implementing an associative array this way is one of the things I'm going to try next.
Well, that's interesting.
Yesterday, after I posted a comment, I decided to check out fastest solutions for this problem and it seems like all the best solutions implement some kind of the vEB tree.
I am not sure whether it is possible to compile your data structure on their server or not, but all solutions there are public anyway, so it should be still easy to compare your solution with these implementations at least on random tests.
I can't submit it to that problem because I haven't implemented deletions yet, but I took nor's fastest submission there and plugged it into my benchmark — and I was wrong: it outperforms mine when $$$N$$$ is larger than $$$10^6$$$ or so, but the B-tree is ~3x faster for $$$N = 10^5$$$ and anything under that.
(That's for the searches: insertions lose much sooner, but they aren't really optimized in the B-tree to begin with.)