
harryzhengipsum's blog

By harryzhengipsum, history, 23 months ago, In English

On 704A - Thor, the policy-based hash table version easily passes under the time limit, but the unordered_set version TLEs on test case 56.

With gp_hash_table: 181117819

With unordered_set: 181133673

Both submissions use a custom hash function, so it can't be due to collisions. On test case 56 the unordered_set version suddenly TLEs, yet it's only slightly slower than the gp_hash_table on all the other test cases. Why is this?
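
For reference, a minimal sketch of the gp_hash_table setup with a custom hash that the submissions refer to; the splitmix64-style hash and the names below are a common pattern and an assumption here, not code taken from the submissions:

#include <ext/pb_ds/assoc_container.hpp>
#include <chrono>
#include <cstddef>
#include <cstdint>
using namespace __gnu_pbds;

// Randomized splitmix64-style hash; assumed, not copied from the submissions.
struct custom_hash {
    static uint64_t splitmix64(uint64_t x) {
        x += 0x9e3779b97f4a7c15ULL;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }
    std::size_t operator()(uint64_t x) const {
        static const uint64_t FIXED_RANDOM =
            std::chrono::steady_clock::now().time_since_epoch().count();
        return splitmix64(x + FIXED_RANDOM);
    }
};

// Set-like usage: null_type as the mapped type.
gp_hash_table<long long, null_type, custom_hash> seen;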

  • Score: -3

»
23 months ago, # | +8

It is getting TLE because erasing from unordered_set is slow; a normal set should get AC.

  • »
    23 months ago, # ^ | 0

    Is it because unordered_set returns the iterator to the next element when calling .erase()? I didn't know that could result in an almost 10x slowdown. Are there any ways to prevent it, or should we simply resort to using gp_hash_table/set?

    • »
      23 months ago, # ^ | Rev. 4 | +12

      It's because when an unordered_set is cleared, the entries are removed, but the underlying allocation keeps its size. When you then iterate over it, it has to scan the entire allocation to find the elements. In other words, iteration is $$$O(N)$$$ where $$$N$$$ is the hash table's current capacity, not the number of elements actually in it.

      So if you add a lot of elements to a set, then clear it or iterate over it many times, it'll be slow.

      One way to avoid this is to construct a new set instead of clearing the current set.
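
      A minimal sketch of that workaround, with illustrative names (not from the submissions), assuming libstdc++'s std::unordered_set:

      std::unordered_set<int> s;
      // ... insert many elements, so the bucket array grows large ...
      // s.clear();                  // removes the elements but keeps the big bucket array
      s = std::unordered_set<int>(); // move-assign a fresh set; the old allocation should be released

      After the move assignment, subsequent iteration and clearing are proportional to the new, small capacity again.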

      Vectors / dynamic arrays do not have this problem: even if the allocation is much larger than the number of actual elements, the vector knows how many elements it has and where they are (at the beginning), so it only has to iterate over or clear those. But a hash map/set can have elements scattered anywhere in its allocation, and it would typically need extra metadata and implementation work to iterate and clear in time proportional to the number of elements rather than the capacity.

      • »
        23 months ago, # ^ | Rev. 2 | +3

        But .erase() and .find() should both be $$$O(1)$$$ regardless of the hash set's size. Plus, it TLEs even when the hash sets are cleared with = {}.

        Without using .clear(): 181154447

      • »
        23 months ago, # ^ | Rev. 2 | 0

        Actually, I got it to work by clearing it via swapping with a temporary empty set. I'm still not seeing where the actual $$$O(N)$$$ bottleneck is, though (is it because .clear() is $$$O(N)$$$ each time?), since .erase() and .find() are both $$$O(1)$$$ on a hash set with few collisions.

        With swapping: 181154447
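
        For concreteness, the swap idiom described above looks roughly like this (names are illustrative, not from the submission):

        std::unordered_set<int> s;
        // ... s has grown large ...
        std::unordered_set<int>().swap(s); // s takes the empty temporary's small bucket array;
                                           // the oversized one is freed when the temporary dies

        A plausible reason = {} didn't help is that it selects the initializer-list assignment operator, which (at least in libstdc++) appears to reuse the existing bucket array rather than shrinking it, whereas swapping with or move-assigning a fresh set replaces the allocation outright.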

        • »
          23 months ago, # ^ | +6

          Yes, clearing is also proportional to the size of the allocation / the capacity, not the number of elements (because the elements could be anywhere in the allocation).

»
23 months ago, # | 0

Do this instead of unreadItems[b].clear()

Spoiler
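
The spoiler's code is not reproduced in this text. In the spirit of the thread's suggestion, the replacement is presumably the swap-with-empty idiom, along these lines (a sketch assuming unreadItems[b] is a std::unordered_set, not the author's hidden code):

std::unordered_set<int>().swap(unreadItems[b]); // instead of unreadItems[b].clear()
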
»
23 months ago, # | 0

Everyone hates unordered_set