Hacker News

What he didn't show in the main article, only in subsequent code, is that std::unordered_map is far too slow to be useful here. With a properly fast hash map (which he calls custom_map), the hash-based approach outperforms sort-unique until the data set exceeds the L3 cache size.


Right, well, it's clear that, at least in Java, he doesn't pre-allocate the size of the HashMap. Thus, when the HashMap reaches its load-factor threshold, the resize operation has to copy all of the data into a new table, which would make this no longer O(N) but something like O(N log N) or O(N^2), depending.


If you do O(N) work every time the hashmap is "full" and double the size of the hashmap, it's still O(N).

Hand-wavy proof:

If your last insert caused a rehash, then you have done O(N) work for that rehash, plus O(N/2) for the one before it, plus O(N/4)..., a geometric series whose sum is bounded by O(2N), and thus only a constant factor more.


No, it's still O(N) amortized. Imagine the hashmap doubles in capacity whenever it is full or hits its max load factor. Then you end up doing O(N) copy operations in total, and O(2N) = O(N).


While your objection stands, repeated rehashings of a hash table can have a large impact on the actual wall-clock time, and it's a flawed benchmark that doesn't include the pre-allocation optimization.



