Skip to content

Mismatched types and inaccessible methods for HashMap and HashSet in 0.16-rc #18690

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
alice-i-cecile opened this issue Apr 2, 2025 · 10 comments · Fixed by #18694
Closed

Mismatched types and inaccessible methods for HashMap and HashSet in 0.16-rc #18690

alice-i-cecile opened this issue Apr 2, 2025 · 10 comments · Fixed by #18694
Labels
A-Utils Utility functions and types C-Dependencies A change to the crates that Bevy depends on C-Usability A targeted quality-of-life change that makes Bevy easier to use P-Regression Functionality that used to work but no longer does. Add a test for this! X-Controversial There is active debate or serious implications around merging this PR
Milestone

Comments

@alice-i-cecile
Copy link
Member

Background

rust-lang/hashbrown#563 changed the default hasher for hashbrown to one that doesn't meet our needs. #15801 updated us to that version of hashbrown, following our dependencies. We were originally using hashbrown::HashMap over std::HashMap due to the use of a faster, more vulnerable hashing algorithm, as Bevy apps are generally not exposed to malicious input in this way.

#17460 shuffled our types around, moving HashMap and HashSet into bevy_platform_support, as they replace functionality provided by Rust's std`.

As part of #15801, we also changed hashers.

#18350 attempts to explain this problem somewhat in the docs.

Problems

As discussed in bevyengine/bevy-website#2065, the migration for this is terrible: it's more explicit in confusing ways, there are prominent methods like new and with_capacity that are simply unusable, despite showing up in the docs!

Potential solutions

Use hashbrown's default hasher

This might or might not be slower than whatever we're using. It fixes the migration / UX issues, and avoids duplicate dependencies though!

The really critical stuff is already generally using EntityHashMap etc, which is very optimized. See #17078 for an example.

Downgrade to hashbrown 0.14

This gets us parity with Bevy 0.15, but duplicates the dependency in our tree.

It also doesn't allow us to use the deterministic FixedHash (or whatever else we may choose) by default.

Petition for a better API upstream that can handle non-standard hashers with generics

This would be great, but due to the need to specify generic type defaults when calling methods, it may not be readily accepted.

This is made much worse by the fact that hashbrown is used in and aligns with the std API for these types, which has very strict stability guarantees.

Fork hashbrown and change the default hasher

This also duplicates the code, but allows us to properly fix the issue and pull in new fixes and improvements from upstream.

This is strictly better than relying on 0.14 forever, but adds maintenance burden for a complex utilities crate.

@alice-i-cecile alice-i-cecile added C-Usability A targeted quality-of-life change that makes Bevy easier to use P-Regression Functionality that used to work but no longer does. Add a test for this! A-Utils Utility functions and types labels Apr 2, 2025
@alice-i-cecile alice-i-cecile added this to the 0.16 milestone Apr 2, 2025
@bushrat011899
Copy link
Contributor

Slight variant on forking hashbrown would be to wrap hashbrown in a new-type, forwarding trait implementations to hashbrown and offering From for conversion and Deref/Mut to avoid recreating all methods from scratch.

@alice-i-cecile
Copy link
Member Author

alice-i-cecile commented Apr 2, 2025

I'm in favor of "use hashbrown's default hash" as the Bevy 0.16 solution, but would accept "just revert to hashbrown 0.14", reverting #15801. I feel strongly that we shouldn't ship in the current state.

Long-term, the ideal solution is clearly "hasbrown / std make it easier to use alternate hashers", but I would grudgingly tolerate forking if that seems impossible. Newtyping hashbrown's types would work, but seems similarly miserable to forking, and I'm not convinced that all of the possible functionality can be reproduced using that approach.

@Amanieu, do you have opinions / insight here? You're much better acquainted with the various algorithms and the upstream maintenance considerations here.

@alice-i-cecile
Copy link
Member Author

As an aside: I didn't realize the magnitude of UX issues here, and shouldn't have merged #15801 so readily. Dependency bumps are usually much less frustrating than this!

@alice-i-cecile alice-i-cecile added C-Dependencies A change to the crates that Bevy depends on X-Controversial There is active debate or serious implications around merging this PR labels Apr 2, 2025
@SpecificProtagonist
Copy link
Contributor

Use hashbrown's default hasher

We do use hashbrown's default hasher, just not its default buildhasher.

@Amanieu
Copy link

Amanieu commented Apr 2, 2025

Hashbrown switched from ahash to foldhash because the latter is faster overall (see benchmarks in https://github.com/orlp/foldhash/). The HashMap::new method is only implemented when using hashbrown's default hasher and this has always been the case. My recommendation is that you stick to hashbrown's default hasher instead of explicitly specifying AHasher if you want to keep using HashMap::new.

@alice-i-cecile
Copy link
Member Author

alice-i-cecile commented Apr 2, 2025

Can you explain the difference between a hasher and a buildhasher for me? Why shouldn't we use the default buildhasher?

Oh I see: that's the random vs fixed initialization. Bah, okay. We can sacrifice determinism here in favor of ergonomics by default IMO.

I really appreciate you chiming in here Amaneiu.

@Amanieu
Copy link

Amanieu commented Apr 2, 2025

Also keep in mind that hashbrown may change the default hasher again in the future. You should not assume a specific hasher is used and should instead always use hashbrown::DefaultHashBuilder.

@Victoronz
Copy link
Contributor

Was there public discussion where this restriction for HashMap::new was decided on? I haven't been able to find any so far.
The Rust API Guidelines recommend that the Default impl and the new constructor should match, so I'm curious on why a different decision was made here!

(Returning to DefaultHashBuilder for 0.16 seems sensible to me as well)

@Amanieu
Copy link

Amanieu commented Apr 2, 2025

This restriction also exists in the standard library (since Rust 1.0). If HashMap::new were implemented for all hashers then the compiler would often fail to infer H. It was deemed more ergonomic to only implement HashMap::new for the default hasher and require other hashers to use HashMap::with_hasher.

@Victoronz
Copy link
Contributor

Makes sense!

github-merge-queue bot pushed a commit that referenced this issue Apr 6, 2025
# Objective

- Fixes #18690
- Closes [#2065](bevyengine/bevy-website#2065)
- Alternative to #18691

The changes to the Hash made in #15801 to the
[BuildHasher](https://doc.rust-lang.org/std/hash/trait.BuildHasher.html)
resulted in serious migration problems and downgraded UX for users of
Bevy's re-exported hashmaps. Once merged, we need to go in and remove
the migration guide added as part of #15801.

## Solution

- Newtype `HashMap` and `HashSet` instead of type aliases
- Added `Deref/Mut` to allow accessing future `hashbrown` methods
without maintenance from Bevy
- Added bidirectional `From` implementations to provide escape hatch for
API incompatibility
- Added inlinable re-exports of all methods directly to Bevy's types.
This ensures `HashMap::new()` works (since the `Deref` implementation
wont cover these kinds of invocations).

## Testing

- CI

---

## Migration Guide

- If you relied on Bevy's `HashMap` and/or `HashSet` types to be
identical to `hashbrown`, consider using `From` and `Into` to convert
between the `hashbrown` and Bevy types as required.
- If you relied on `hashbrown/serde` or `hashbrown/rayon` features, you
may need to enable `bevy_platform_support/serialize` and/or
`bevy_platform_support/rayon` respectively.

---

## Notes

- Did not replicate the Rayon traits, users will need to rely on the
`Deref/Mut` or `From` implementations for those methods.
- Did not re-expose the `unsafe` methods from `hashbrown`. In most cases
users will still have access via `Deref/Mut` anyway.
- I have added `inline` to all methods as they are trivial wrappings of
existing methods.
- I chose to make `HashMap::new` and `HashSet::new` const, which is
different to `hashbrown`. We can do this because we default to a
fixed-state build-hasher. Mild ergonomic win over using
`HashMap::with_hasher(FixedHasher)`.
mockersf pushed a commit that referenced this issue Apr 8, 2025
# Objective

- Fixes #18690
- Closes [#2065](bevyengine/bevy-website#2065)
- Alternative to #18691

The changes to the Hash made in #15801 to the
[BuildHasher](https://doc.rust-lang.org/std/hash/trait.BuildHasher.html)
resulted in serious migration problems and downgraded UX for users of
Bevy's re-exported hashmaps. Once merged, we need to go in and remove
the migration guide added as part of #15801.

## Solution

- Newtype `HashMap` and `HashSet` instead of type aliases
- Added `Deref/Mut` to allow accessing future `hashbrown` methods
without maintenance from Bevy
- Added bidirectional `From` implementations to provide escape hatch for
API incompatibility
- Added inlinable re-exports of all methods directly to Bevy's types.
This ensures `HashMap::new()` works (since the `Deref` implementation
wont cover these kinds of invocations).

## Testing

- CI

---

## Migration Guide

- If you relied on Bevy's `HashMap` and/or `HashSet` types to be
identical to `hashbrown`, consider using `From` and `Into` to convert
between the `hashbrown` and Bevy types as required.
- If you relied on `hashbrown/serde` or `hashbrown/rayon` features, you
may need to enable `bevy_platform_support/serialize` and/or
`bevy_platform_support/rayon` respectively.

---

## Notes

- Did not replicate the Rayon traits, users will need to rely on the
`Deref/Mut` or `From` implementations for those methods.
- Did not re-expose the `unsafe` methods from `hashbrown`. In most cases
users will still have access via `Deref/Mut` anyway.
- I have added `inline` to all methods as they are trivial wrappings of
existing methods.
- I chose to make `HashMap::new` and `HashSet::new` const, which is
different to `hashbrown`. We can do this because we default to a
fixed-state build-hasher. Mild ergonomic win over using
`HashMap::with_hasher(FixedHasher)`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Utils Utility functions and types C-Dependencies A change to the crates that Bevy depends on C-Usability A targeted quality-of-life change that makes Bevy easier to use P-Regression Functionality that used to work but no longer does. Add a test for this! X-Controversial There is active debate or serious implications around merging this PR
Projects
None yet
5 participants