Skip to content

Derive AggregateUDFImpl equality, hash from Eq, Hash traits #17130

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 12, 2025

Conversation

findepi
Copy link
Member

@findepi findepi commented Aug 11, 2025

Follows similar change for WindowUDFImpl, i.e. the 8494a39 commit.

Previously, the AggregateUDFImpl trait contained equals and hash_value methods with contracts following the Eq and Hash traits. However, the existence of default implementations of these methods made it error-prone, with many functions (scalar, aggregate, window) missing to customize the equals even though they ought to. There is no fix to this that's not an API breaking change, so a breaking change is warranted.

Removing the default implementations would be enough of a solution, but at the cost of a lot of boilerplate needed in implementations.

Instead, this removes the methods from the trait, and reuses DynEq, DynHash traits used previously only for physical expressions. This allows for functions to provide their implementations using no more than #[derive(PartialEq, Eq, Hash)] in a typical case.

@github-actions github-actions bot added documentation Improvements or additions to documentation logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate proto Related to proto crate functions Changes to functions implementation ffi Changes to the ffi crate labels Aug 11, 2025
@findepi findepi force-pushed the findepi/udaf-perfect-eq branch from fb676be to 8bf3c5d Compare August 11, 2025 19:51
Follows similar change for `WindowUDFImpl`, i.e. the
8494a39 commit.

Previously, the `AggregateUDFImpl` trait contained `equals` and
`hash_value` methods with contracts following the `Eq` and `Hash`
traits.  However, the existence of default implementations of these
methods made it error-prone, with many functions (scalar, aggregate,
window) missing to customize the equals even though they ought to.
There is no fix to this that's not an API breaking change, so a breaking
change is warranted.

Removing the default implementations would be enough of a solution, but
at the cost of a lot of boilerplate needed in implementations.

Instead, this removes the methods from the trait, and reuses `DynEq`,
`DynHash` traits used previously only for physical expressions. This
allows for functions to provide their implementations using no more than
`#[derive(PartialEq, Eq, Hash)]` in a typical case.
@findepi findepi force-pushed the findepi/udaf-perfect-eq branch from 8bf3c5d to 0671ef6 Compare August 11, 2025 19:59
@github-actions github-actions bot added the physical-expr Changes to the physical-expr crates label Aug 11, 2025
Copy link
Contributor

@timsaucer timsaucer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all of the previous work has already merged, this PR is very simple and easy to follow. Thank you for all the work on this!

@alamb
Copy link
Contributor

alamb commented Aug 12, 2025

Looks good to me too -- thanks @findepi and @timsaucer

@findepi findepi merged commit b8bf7c5 into apache:main Aug 12, 2025
28 checks passed
@findepi findepi deleted the findepi/udaf-perfect-eq branch August 12, 2025 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate documentation Improvements or additions to documentation ffi Changes to the ffi crate functions Changes to functions implementation logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Changes to the physical-expr crates proto Related to proto crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace AggregateUDFImpl::{equals,hash_value} with UdfHash, UdfEq traits Implement PartialEq, Hash for all UDAFs (AggregateUDFImpl)
3 participants