A 15x faster std::set


By sslotin, 3 years ago, in English

https://en.algorithmica.org/hpc/data-structures/b-tree/

As promised, I wrote a new tree data structure.


Comments (6)

Xellos

3 years ago

I see you're using SIMD, how does it perform without SIMD?

sslotin

3 years ago

I just tried to replace rank32 with a scalar binary search within a node (although my binary search is a bit unusual): it is worse but still ~5x faster than std::set (and still beats Abseil by ~1.5x).
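For illustration, a scalar in-node search along these lines might look like the sketch below. It is not the code from the article: the node size `B = 32`, the `INT_MAX` padding of unused slots, and the name `rank_scalar` are all assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <climits>

constexpr int B = 32;  // keys per node (assumed for this sketch)

// Returns the number of keys in `node` that are strictly less than x,
// i.e. the index of the first key >= x, or B if every key is smaller.
// Precondition: `node` is sorted ascending; unused slots hold INT_MAX.
int rank_scalar(const std::array<int, B>& node, int x) {
    int pos = 0;  // invariant: the first `pos` keys are all < x
    // Fixed-step binary search: try to advance by 16, 8, 4, 2, 1.
    for (int step = B / 2; step > 0; step /= 2) {
        if (node[pos + step - 1] < x) {
            pos += step;
        }
    }
    // The steps sum to B - 1, so one more comparison covers the last key.
    assert(pos < B);
    pos += (node[pos] < x) ? 1 : 0;
    return pos;
}
```

For a node holding 0, 2, 4, ..., 62, `rank_scalar(node, 5)` returns 3 and `rank_scalar(node, 63)` returns 32.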

nikgaevoy

3 years ago

As far as I understand, your structure in fact works with ints only, so it would be fairer to compare it with structures designed specifically for ints, rather than with std::set<int>. So, how does it perform in comparison to a vEB tree, an X/Y-fast trie, or maybe something else?

sslotin

3 years ago

I don't know for sure since I have not compared against them, but I think these particular structures won't be faster even for large arrays. They both perform roughly $$$\log \log 2^{32} = 5$$$ iterations, the same as B-trees for all reasonable dataset sizes, but they are probably slower, at least in their basic implementations:

The vEB tree uses recursion and branching, which is expensive.
The x-fast trie uses hash tables, which is even more expensive.

On top of that, they are not storing the data that efficiently, which is bad for the cache performance.
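The iteration-count estimate above can be checked with a small C++ sketch. It assumes a vEB-style recursion that halves the key width at every level, and a B-tree fanout of 32, which is an illustrative value rather than a figure stated in this thread; `veb_levels` and `btree_height` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Number of levels in a vEB-style recursion over `bits`-bit keys:
// each level halves the key width until a single bit remains.
int veb_levels(int bits) {
    assert(bits >= 1);
    int levels = 0;
    while (bits > 1) {
        bits = (bits + 1) / 2;  // round up for odd widths
        ++levels;
    }
    return levels;
}

// Smallest height h such that fanout^h >= n, i.e. the number of
// node visits needed to index n keys with the given (assumed) fanout.
int btree_height(std::uint64_t n, std::uint64_t fanout) {
    assert(fanout >= 2);
    int height = 0;
    std::uint64_t capacity = 1;
    while (capacity < n) {
        capacity *= fanout;
        ++height;
    }
    return height;
}
```

Under these assumptions `veb_levels(32)` gives 5, while `btree_height(n, 32)` gives 4 for a million keys and stays at 5 up to about 33 million keys.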

That said, I think that some universe-reducing approaches may be faster when properly optimized — at least for large enough dataset sizes. Implementing an associative array this way is one of the things I'm going to try next.

nikgaevoy

3 years ago

Well, that's interesting.

Yesterday, after I posted a comment, I decided to check out the fastest solutions for this problem, and it seems that all the best solutions implement some kind of vEB tree.

I am not sure whether it is possible to compile your data structure on their server, but all solutions there are public anyway, so it should still be easy to compare your solution with these implementations, at least on random tests.

sslotin

3 years ago

I can't submit it to that problem because I haven't implemented deletions yet, but I took nor's fastest submission there and plugged it into my benchmark — and I was wrong: it outperforms mine when $$$N$$$ is larger than $$$10^6$$$ or so, but the B-tree is ~3x faster for $$$N = 10^5$$$ and anything under that.

(That's for the searches: insertions lose much sooner, but they aren't really optimized in the B-tree to begin with.)
