Capped Hashmap #25
Changes from 9 commits
I don't think we should return the item that was removed, as this might be surprising behavior for something that is supposed to closely mimic a hashmap. If we want this API, it should be named something else (`insert_and_get_removed_item` or something).

This is actually quite handy, and it can help us with things like this: #2 (comment).
It's similar to what the core HashMap does with its `insert` fn: it returns the value of the replaced item, and they don't call it something like `insert_and_get_replaced_item`. I believe we can call this `add_entry` to avoid any confusion with the `HashMap::insert` fn.

I'm not saying this is not a useful function, just that it should have a different name so that it does not have surprising behavior, since `HashMap::insert` does not behave like this. `HashMap::insert` returns the element that was replaced at the point of insertion, which is different!

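For concreteness, the API being debated might look like the sketch below. The type name `CappedHashMap`, the `inner` and `last_items` fields, and `add_entry` are taken from the thread; everything else is an assumption, and handling of re-inserted duplicate keys is elided for brevity.

```rust
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

/// A map holding at most `capacity` entries; inserting into a full map
/// evicts and returns the oldest entry (a sketch, not the PR's actual code).
struct CappedHashMap<K: Eq + Hash + Clone, V> {
    capacity: usize,
    inner: HashMap<K, V>,
    last_items: VecDeque<K>, // insertion order, oldest at the front
}

impl<K: Eq + Hash + Clone, V> CappedHashMap<K, V> {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            // `capacity + 1` leaves room for the transient extra entry
            // that exists between the insert and the eviction check.
            inner: HashMap::with_capacity(capacity + 1),
            last_items: VecDeque::with_capacity(capacity + 1),
        }
    }

    /// Unlike `HashMap::insert` (which returns the old value stored under
    /// the *same* key), this returns the evicted *oldest* entry, if any.
    fn add_entry(&mut self, k: K, v: V) -> Option<(K, V)> {
        self.inner.insert(k.clone(), v);
        self.last_items.push_back(k);
        if self.last_items.len() > self.capacity {
            let oldest = self.last_items.pop_front()?;
            let old_v = self.inner.remove(&oldest)?;
            return Some((oldest, old_v));
        }
        None
    }
}
```

The behavioral difference in the return value is exactly what motivates a distinct name like `add_entry` rather than `insert`.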
I think we should have logs when we reach a quarter of the capacity, half of the capacity, 90% of the capacity, or something like that (so that we know something is happening and unfinished signatures are piling up).

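One way to implement those threshold logs: the 25/50/90% levels come from the comment above, while the helper name and the use of `eprintln!` are placeholders for whatever logging facility the project actually uses.

```rust
/// Return the highest fill-level threshold (in percent) that `len` has
/// reached, emitting a warning. Thresholds follow the review suggestion;
/// `eprintln!` stands in for a real logger.
fn warn_on_fill_level(len: usize, capacity: usize) -> Option<u32> {
    let pct = (len * 100 / capacity) as u32;
    // Check the highest threshold first so we report the worst level crossed.
    let crossed = [90u32, 50, 25].iter().copied().find(|&t| pct >= t)?;
    eprintln!(
        "capped map at {}% of capacity ({}/{}); unfinished signatures may be piling up",
        pct, len, capacity
    );
    Some(crossed)
}
```

A real implementation would also remember the last threshold it logged, so repeated inserts at the same fill level don't spam the log on every call.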
Also imagine `capacity` is set to 5: this means that the hashmap and vecdeque both have capacity 4. So when the insertion above makes `inner` and `last_items` full, the check here will be `4 > 4 = false` and do nothing. This means that the next insert will resize both the hashmap and vecdeque. I think you made a mistake in the initialization: you should keep the capacity, but have the two structures use `capacity + 1` instead. Maybe a test checking for the array capacity would help with making sure that the logic works :)

Oh shit, yes, that's what I intended to do in the first place. Good point!

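The capacity test suggested above could look something like this sketch. It relies only on the documented guarantee that `VecDeque::push_back` reallocates solely when the length would exceed the current capacity; the helper name is made up.

```rust
use std::collections::VecDeque;

/// Create a deque with one slot of slack (`capacity + 1`), run
/// `capacity + 1` insertions through the eviction check from the thread,
/// and report whether the initial allocation was ever outgrown.
fn deque_never_grows(capacity: usize) -> bool {
    let mut dq: VecDeque<usize> = VecDeque::with_capacity(capacity + 1);
    let initial = dq.capacity();
    for i in 0..=capacity {
        dq.push_back(i); // transiently `capacity + 1` items long
        if dq.len() > capacity {
            dq.pop_front(); // eviction brings it back under the cap
        }
    }
    dq.capacity() == initial
}
```

With `with_capacity(capacity)` instead of `capacity + 1`, the transient extra element can force exactly the resize described above.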
I've read about the capacity and how it changes; there is a good explanation here.
I also tested it, and it looks like it can double even though the length remains the same.
I guess the more keys we insert, the higher the chance of a collision (even though some keys are removed), so it looks like it follows a conservative approach by doubling the capacity.
I believe this is neither a big deal nor a huge performance implication, given that we're not going to be storing several thousand or millions of entries in the hash table. And besides, a capacity change does not mean entries are moved to a new memory location.

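For anyone who wants to reproduce that observation, a churn loop like the following keeps the length constant while cycling keys. Exact capacity values are implementation-defined, so the sketch only reports them rather than asserting growth; the function name and the choice of 8 entries are arbitrary.

```rust
use std::collections::HashMap;

/// Insert a fresh key, then remove the oldest one, keeping the length
/// constant at 8. Returns (final_len, initial_capacity, final_capacity)
/// so capacity behavior under churn can be inspected.
fn churn(rounds: usize) -> (usize, usize, usize) {
    let mut m: HashMap<usize, usize> = HashMap::with_capacity(8);
    let initial = m.capacity();
    for i in 0..8 {
        m.insert(i, i);
    }
    for i in 8..8 + rounds {
        m.insert(i, i);     // new key
        m.remove(&(i - 8)); // drop the oldest; length stays at 8
    }
    (m.len(), initial, m.capacity())
}
```

Depending on the standard-library version, the slots left behind by removals may trigger a rehash or a capacity bump even at constant length, which would match the doubling described above.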
It kinda sucks that we have to go through the whole thing here :D I agree that we should assume that the key to remove is one of the last ones appended (and that the oldest stuff is probably just stale at this point). Maybe a LinkedList is better, as removing something at any index is easier? (so a find followed by a remove). In any case I think this would be better:

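(The reviewer's actual snippet did not survive the page scrape. As an illustration only, the "search from the back" idea might look like this sketch; `remove_key` is a made-up name.)

```rust
use std::collections::VecDeque;

/// Remove `key` from the deque, scanning from the back, since the key
/// being removed is most likely among the most recently appended items.
fn remove_key(deque: &mut VecDeque<u32>, key: u32) -> Option<u32> {
    // `rposition` searches from the back; `VecDeque`'s iterator is both
    // double-ended and exact-size, which `rposition` requires.
    let idx = deque.iter().rposition(|k| *k == key)?;
    deque.remove(idx)
}
```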
Yes, this looks much better 👍

A LinkedList isn't faster on removals.
Also, this is what the official docs suggest:

Ah yeah, you'd need some sort of linked list + hashmap to know which two nodes to update to remove a node :D
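That combination is essentially the LRU-style layout: the hashmap maps each key to a list node, and the node stores its two neighbours, so removal is O(1). A minimal index-based sketch (all names are made up; a production version would also recycle freed slots):

```rust
use std::collections::HashMap;

// Each key lives in a slot of `nodes`; `prev`/`next` are slot indices,
// so unlinking a node only touches its two neighbours.
struct Node {
    key: u32,
    prev: Option<usize>,
    next: Option<usize>,
}

struct LinkedKeys {
    nodes: Vec<Node>,
    index: HashMap<u32, usize>, // key -> slot in `nodes`
    head: Option<usize>,
    tail: Option<usize>,
}

impl LinkedKeys {
    fn new() -> Self {
        Self { nodes: Vec::new(), index: HashMap::new(), head: None, tail: None }
    }

    fn push_back(&mut self, key: u32) {
        let slot = self.nodes.len();
        self.nodes.push(Node { key, prev: self.tail, next: None });
        match self.tail {
            Some(t) => self.nodes[t].next = Some(slot),
            None => self.head = Some(slot),
        }
        self.tail = Some(slot);
        self.index.insert(key, slot);
    }

    /// O(1): the map tells us which node to unlink and its neighbours.
    fn remove(&mut self, key: u32) -> bool {
        let slot = match self.index.remove(&key) {
            Some(s) => s,
            None => return false,
        };
        let (prev, next) = (self.nodes[slot].prev, self.nodes[slot].next);
        match prev {
            Some(p) => self.nodes[p].next = next,
            None => self.head = next,
        }
        match next {
            Some(n) => self.nodes[n].prev = prev,
            None => self.tail = prev,
        }
        true
    }

    fn keys_in_order(&self) -> Vec<u32> {
        let mut out = Vec::new();
        let mut cur = self.head;
        while let Some(i) = cur {
            out.push(self.nodes[i].key);
            cur = self.nodes[i].next;
        }
        out
    }
}
```

This trades the O(n) scan of the `VecDeque` approach for extra bookkeeping on every insert, which is only worth it if arbitrary-key removals are frequent.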