Skip to content

Newtype hashbrown #18694

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Apr 6, 2025
Merged

Conversation

bushrat011899
Copy link
Contributor

@bushrat011899 bushrat011899 commented Apr 3, 2025

Objective

The changes to the Hash made in #15801 to the BuildHasher resulted in serious migration problems and downgraded UX for users of Bevy's re-exported hashmaps. Once merged, we need to go in and remove the migration guide added as part of #15801.

Solution

  • Newtype HashMap and HashSet instead of type aliases
  • Added Deref/Mut to allow accessing future hashbrown methods without maintenance from Bevy
  • Added bidirectional From implementations to provide escape hatch for API incompatibility
  • Added inlinable re-exports of all methods directly to Bevy's types. This ensures HashMap::new() works (since the Deref implementation wont cover these kinds of invocations).

Testing

  • CI

Migration Guide

  • If you relied on Bevy's HashMap and/or HashSet types to be identical to hashbrown, consider using From and Into to convert between the hashbrown and Bevy types as required.
  • If you relied on hashbrown/serde or hashbrown/rayon features, you may need to enable bevy_platform_support/serialize and/or bevy_platform_support/rayon respectively.

Notes

  • Did not replicate the Rayon traits, users will need to rely on the Deref/Mut or From implementations for those methods.
  • Did not re-expose the unsafe methods from hashbrown. In most cases users will still have access via Deref/Mut anyway.
  • I have added inline to all methods as they are trivial wrappings of existing methods.
  • I chose to make HashMap::new and HashSet::new const, which is different to hashbrown. We can do this because we default to a fixed-state build-hasher. Mild ergonomic win over using HashMap::with_hasher(FixedHasher).

@alice-i-cecile alice-i-cecile added this to the 0.16 milestone Apr 3, 2025
@bushrat011899 bushrat011899 added C-Code-Quality A section of code that is hard to understand or change P-Regression Functionality that used to work but no longer does. Add a test for this! M-Needs-Migration-Guide A breaking change to Bevy's public API that needs to be noted in a migration guide A-Utils Utility functions and types X-Controversial There is active debate or serious implications around merging this PR D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Apr 3, 2025
Copy link
Contributor

github-actions bot commented Apr 3, 2025

It looks like your PR is a breaking change, but you didn't provide a migration guide.

Please review the instructions for writing migration guides, then expand or revise the content in the migration guides directory to reflect your changes.

@Victoronz
Copy link
Contributor

No need to copy over the inherent methods! We can just implement Deref/DerefMut/into_inner to make the inner methods accessible, like how I did with EntityHashMap.
The offending methods/associated functions we can then shadow of course.

@bushrat011899
Copy link
Contributor Author

No need to copy over the inherent methods!

The issue is cases where a user wants to use the typename to access a method:

let map = HashMap::new();

Since I have to add forwarding implementations for some methods, I just grabbed the entire public API.

let layer_map = pointer_over_map
.entry(pointer)
.or_insert_with(BTreeMap::new);
let layer_map = pointer_over_map.entry(pointer).or_default();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated CI failure

let spawned = self
.spawned_dynamic_scenes
.entry(handle.id())
.or_insert_with(HashSet::default);
let spawned = self.spawned_dynamic_scenes.entry(handle.id()).or_default();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated CI failure

@alice-i-cecile alice-i-cecile added S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Apr 3, 2025
@alice-i-cecile
Copy link
Member

Good implementation and I agree with your choices but needs better docs :)

It's annoying to have to do this but this is effectively a one time cost.

@bushrat011899
Copy link
Contributor Author

Good implementation and I agree with your choices but needs better docs :)

Yeah since this is "Bevy's" hashmap/set implementation it does deserve better documentation. I've gone through and added at a minimum links to the underlying hashbrown documentation, and examples to most methods.

It's annoying to have to do this but this is effectively a one time cost.

Agreed. Plus we insulate ourselves from further API changes pretty strongly this way.

@bushrat011899
Copy link
Contributor Author

It looks like your PR is a breaking change, but you didn't provide a migration guide.

I think because this PR is intended for 0.16 I should omit the migration guide in the repo? Happy to add it there if anyone disagrees tho!

@Victoronz
Copy link
Contributor

Victoronz commented Apr 4, 2025

To mitigate the breaking changes fully, there should also be this impl for the wrapped HashSet.

And for completeness, it is nice to have into_inner methods for the wrappers, which are an explicit, non-inference dependent convention to get the owned, unwrapped type. (It can also be constified, but here that doesn't matter much)

Copy link
Contributor

@djeedai djeedai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving on the basis that this looks to do the right thing and fix the migration issue. Thanks for the (tedious) work!

Just to note, I can't help but having that feeling that if the Rust std team didn't make HashMap::new() use the hasher from the generic parameter, and instead force it to RandomState, there's a reason for this choice. Is that a language limitation? Is there an edge case we're not seeing? I'd feel more comfortable with this change if we had an answer to that question first and understood the original design choice.

}
}

impl<K, V, S> Deref for HashMap<K, V, S> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious here, can't we use a derive? Why do we have to manually impl this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Purely because this crate doesn't take a dependency on any crates that provide that derive macro. Simple enough to write so I'd rather avoid possible compile time changes.

@janhohenheim janhohenheim added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Apr 4, 2025
@bushrat011899
Copy link
Contributor Author

Yeah, those are the associated functions we have to copy (specifically the functions that do not have a self parameter). However, for the methods (= associated functions that have a self parameter), the deref impl covers them entirely, and they're superfluous! It is important to omit the inherent methods that do not have any significance, because otherwise it is more difficult to find the ones that do. They will appear twice in the generated doc page, once normally, and once under the deref, which is confusing when there is no difference to be found! Lastly, less lines, less maintenance/reviewer load!

I attempted to remove these methods (leaving only the associated functions) but there were several compilation issues I was unable to resolve. Some were due to using Self::len in bevy_reflect, others were caused by drain not being picked up correctly, and I'm sure there are more. Because of that I'm leaving the methods as-is, but would encourage a follow-up to remove these methods where possible if that's still desirable.

To mitigate the breaking changes fully, there should also be this impl for the wrapped HashSet.

And for completeness, it is nice to have into_inner methods for the wrappers, which are an explicit, non-inference dependent convention to get the owned, unwrapped type. (It can also be constified, but here that doesn't matter much)

Agreed! I've added both of these.

@Victoronz
Copy link
Contributor

Victoronz commented Apr 5, 2025

Yeah, those are the associated functions we have to copy (specifically the functions that do not have a self parameter). However, for the methods (= associated functions that have a self parameter), the deref impl covers them entirely, and they're superfluous! It is important to omit the inherent methods that do not have any significance, because otherwise it is more difficult to find the ones that do. They will appear twice in the generated doc page, once normally, and once under the deref, which is confusing when there is no difference to be found! Lastly, less lines, less maintenance/reviewer load!

I attempted to remove these methods (leaving only the associated functions) but there were several compilation issues I was unable to resolve. Some were due to using Self::len in bevy_reflect, others were caused by drain not being picked up correctly, and I'm sure there are more. Because of that I'm leaving the methods as-is, but would encourage a follow-up to remove these methods where possible if that's still desirable.

I see, but these cases are actually useful to know! Because they should appear in the migration guide for #16912.
From your example, "Uses of Self::method in trait impls should be replaced with a deref and a method call" would be my guess.
I'm guessing that this failure is in impl_reflect_for_hash_set/map where Self::method would see use.
The direct migration would then look like this: self.deref().len() or (*self).len()

It'd be nice to know how exactly other things broke.

@bushrat011899
Copy link
Contributor Author

I see, but these cases are actually useful to know! Because they should appear in the migration guide for #16912.
From your example, "Uses of Self::method in trait impls should be replaced with a deref and a method call" would be my guess.

Yeah that's definitely the solution, but the goal here is to minimise the work in migrating to 0.16. Remembering that the status quo without this PR is to "just" replace new with either default or with_hasher (and likewise for with_capacity). I'd argue this PR would be a net negative if we made the new syntax nicer but then broke a while heap of other places (with Bevy's own codebase being a canary for the kinds of issues).

I'm guessing that this failure is in impl_reflect_for_hash_set/map where Self::method would see use.
The direct migration would then look like this: self.deref().len() or (*self).len()

Yeah I reckon that'd be the fix.

It'd be nice to know how exactly other things broke.

Yeah I'm still unsure how drain got broken, or if that was some spurious error caused by other issues like Self::len. Regardless, the goal of this PR is to minimise migration issues, so whatever path makes the migration guide smallest would be my preference.

@Victoronz
Copy link
Contributor

Victoronz commented Apr 5, 2025

I see, but these cases are actually useful to know! Because they should appear in the migration guide for #16912.
From your example, "Uses of Self::method in trait impls should be replaced with a deref and a method call" would be my guess.

Yeah that's definitely the solution, but the goal here is to minimise the work in migrating to 0.16. Remembering that the status quo without this PR is to "just" replace new with either default or with_hasher (and likewise for with_capacity). I'd argue this PR would be a net negative if we made the new syntax nicer but then broke a while heap of other places (with Bevy's own codebase being a canary for the kinds of issues).

I'm guessing that this failure is in impl_reflect_for_hash_set/map where Self::method would see use.
The direct migration would then look like this: self.deref().len() or (*self).len()

Yeah I reckon that'd be the fix.

It'd be nice to know how exactly other things broke.

Yeah I'm still unsure how drain got broken, or if that was some spurious error caused by other issues like Self::len. Regardless, the goal of this PR is to minimise migration issues, so whatever path makes the migration guide smallest would be my preference.

We might disagree here! I see the breaking changes to construction concerning new/From as a magnitude worse than needing to rewrite Self::method.
Inserting a manual deref for a method call is something that rarely needs to be done, here this is the case because we're implementing a trait method with a colliding name, which seldomly happens downstream.
In contrast, every person reading the docs of HashMap will come across the duplication, which I'd deem to be a net-worse experience.

Additionally, unless we replicate every feature-specific impl, f.e. impls gated under the rayon feature, it is still a breaking change in this respect + the missing gated trait impls.
(There are nightly feature impls too, but whether that counts as a breaking change I'd need to read up on).

If that's fine, then instead of removing these methods, we could also make a note of why the duplication exists/where the line was drawn and leave it at that.
If we leave them be, then we should also add in the missing ones:

  • HashSet::insert_unique_unchecked
  • HashMap::{insert_unique_unchecked, get_many_unchecked_mut, get_many_key_value_unchecked_mut}

@bushrat011899
Copy link
Contributor Author

We might disagree here! I see the breaking changes to construction concerning new/From as a magnitude worse than needing to rewrite Self::method. Inserting a manual deref for a method call is something that rarely needs to be done, here this is the case because we're implementing a trait method with a colliding name, which seldomly happens downstream.

Yeah I agree! I totally appreciate the position you're taking here, and I suspect our disagreement does just come down to which compromises each of us personally prefers.

In contrast, every person reading the docs of HashMap will come across the duplication, which I'd deem to be a net-worse experience.

This is unfortunate, but I checked the docs page with this PR, and the deref methods are separated into their own section (in the sidebar and the content of the page). Within Visual Studio Code, the duplicate methods aren't suggested by rust-analyser, so the only time you'll see these methods are when you have the proper context on docs.rs.

Additionally, unless we replicate every feature-specific impl, f.e. impls gated under the rayon feature, it is still a breaking change in this respect + the missing gated trait impls.

Yeah when you put it that way I'm convinced. I'll add rayon as a feature (unused by Bevy so no changes really), so the breaking change will just be requiring users to enable bevy_platform_support/rayon if they were using hashbrown/rayon.

There are nightly feature impls too, but whether that counts as a breaking change I'd need to read up on.

Yeah I drew a line here since Bevy otherwise has no support for these APIs (the existing type aliases had no allocator parameter, forcing users to rely on hashbrown directly anyway).

If that's fine, then instead of removing these methods, we could also make a note of why the duplication exists/where the line was drawn and leave it at that.

If we leave them be, then we should also add in the missing ones:

* `HashSet::insert_unique_unchecked`

* `HashMap::{insert_unique_unchecked, get_many_unchecked_mut, get_many_key_value_unchecked_mut}`

Agreed, I've added those methods for completeness.

@alice-i-cecile alice-i-cecile added X-Contentious There are nontrivial implications that should be thought through and removed X-Controversial There is active debate or serious implications around merging this PR labels Apr 6, 2025
@alice-i-cecile
Copy link
Member

I'm convinced we've found a solution with few enough trade-offs that I'm comfortable moving forward as-is. Merging :)

@alice-i-cecile alice-i-cecile added this pull request to the merge queue Apr 6, 2025
Merged via the queue into bevyengine:main with commit 7892f9e Apr 6, 2025
38 checks passed
mockersf pushed a commit that referenced this pull request Apr 8, 2025
# Objective

- Fixes #18690
- Closes [#2065](bevyengine/bevy-website#2065)
- Alternative to #18691

The changes to the Hash made in #15801 to the
[BuildHasher](https://doc.rust-lang.org/std/hash/trait.BuildHasher.html)
resulted in serious migration problems and downgraded UX for users of
Bevy's re-exported hashmaps. Once merged, we need to go in and remove
the migration guide added as part of #15801.

## Solution

- Newtype `HashMap` and `HashSet` instead of type aliases
- Added `Deref/Mut` to allow accessing future `hashbrown` methods
without maintenance from Bevy
- Added bidirectional `From` implementations to provide escape hatch for
API incompatibility
- Added inlinable re-exports of all methods directly to Bevy's types.
This ensures `HashMap::new()` works (since the `Deref` implementation
wont cover these kinds of invocations).

## Testing

- CI

---

## Migration Guide

- If you relied on Bevy's `HashMap` and/or `HashSet` types to be
identical to `hashbrown`, consider using `From` and `Into` to convert
between the `hashbrown` and Bevy types as required.
- If you relied on `hashbrown/serde` or `hashbrown/rayon` features, you
may need to enable `bevy_platform_support/serialize` and/or
`bevy_platform_support/rayon` respectively.

---

## Notes

- Did not replicate the Rayon traits, users will need to rely on the
`Deref/Mut` or `From` implementations for those methods.
- Did not re-expose the `unsafe` methods from `hashbrown`. In most cases
users will still have access via `Deref/Mut` anyway.
- I have added `inline` to all methods as they are trivial wrappings of
existing methods.
- I chose to make `HashMap::new` and `HashSet::new` const, which is
different to `hashbrown`. We can do this because we default to a
fixed-state build-hasher. Mild ergonomic win over using
`HashMap::with_hasher(FixedHasher)`.
Shatur added a commit to projectharmonia/bevy_replicon that referenced this pull request Apr 19, 2025
During the migration to 0.16, I added a few workarounds related to `hashbrown` to make the code compile.
These issues have been fixed in the latest Bevy RC: bevyengine/bevy#18694
So we can now remove those workarounds.
Shatur added a commit to projectharmonia/bevy_replicon that referenced this pull request Apr 22, 2025
During the migration to 0.16, I added a few workarounds related to `hashbrown` to make the code compile.
These issues have been fixed in the latest Bevy RC: bevyengine/bevy#18694
So we can now remove those workarounds.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Utils Utility functions and types C-Code-Quality A section of code that is hard to understand or change D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes M-Needs-Migration-Guide A breaking change to Bevy's public API that needs to be noted in a migration guide P-Regression Functionality that used to work but no longer does. Add a test for this! S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it X-Contentious There are nontrivial implications that should be thought through
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mismatched types and inaccessible methods for HashMap and HashSet in 0.16-rc
5 participants