Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: binary_numeric is now correct (+ tests!) #1721

Merged
merged 12 commits into from
Jan 6, 2025
Merged

Conversation

danking
Copy link
Member

@danking danking commented Dec 18, 2024

Lessons learned: do not merge without at least basic tests.

Four major changes:

  1. Fix arithmetic operations to be correct (vis-a-vis lengths and commuting).
  2. Basic test for binary_numeric, parameterized by an array (following the approach described in Compute function test harness #1719).
  3. Add FlippedSub and FlippedDiv to support checking if the RHS supports the given operation.
  4. Ensure the binary numeric implementation is always in binary_numeric.rs

We check for a constant array but do not create a new constant array with the appropriate length for the child on which we want to delegate the binary numeric operation. That is now fixed.

I also added a very basic test suite for applying the (now six) numeric operations on pairs of arrays where one of the two are constant.

In order to properly support constant arrays on the left-hand-side, I added FlippedSub and FlippedDiv which allow us to commute/flip the operator so that we need not teach ConstantArray to check if its right-hand-side supports an operation on constants.

Lessons learned: do not merge without at least basic tests.

We check for a constant array but do not create a new constant array
with the appropriate length for the child on which we want to delegate
the binary numeric operation. That is now fixed.

I also added a very basic test suite for applying the (now six)
numeric operations on pairs of arrays where one of the two
are constant.

In order to properly support constant arrays on the left-
hand-side, I added FlippedSub and FlippedDiv which allow
us to commute/flip the operator so that we need not teach
ConstantArray to check if its right-hand-side supports
an operation on constants.
@danking danking requested review from lwwmanning and gatesn December 18, 2024 21:51
@danking danking marked this pull request as ready for review December 18, 2024 21:51
@danking danking enabled auto-merge (squash) December 19, 2024 02:33
@danking danking requested a review from gatesn January 2, 2025 17:52
Copy link
Contributor

@gatesn gatesn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small changes, then let's ship it

@@ -30,6 +30,7 @@ workspace = true
[dev-dependencies]
criterion = { workspace = true }
rand = { workspace = true }
vortex-array = { workspace = true, features = ["test_util"] }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I liked the test-harness! "util" is forbidden!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed the extant test_util to test_harness and adjusted all of this PR's features to test_harness.

@@ -138,7 +142,7 @@ pub fn binary_numeric(

// Check if RHS supports the operation directly.
if let Some(fun) = rhs.encoding().binary_numeric_fn() {
if let Some(result) = fun.binary_numeric(rhs, lhs, op)? {
if let Some(result) = fun.binary_numeric(rhs, lhs, op.flip_parameters())? {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be consistent with naming for CompareFn, in this case op.swap()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -186,13 +190,114 @@ fn arrow_numeric(
let array = match operator {
BinaryNumericOperator::Add => arrow_arith::numeric::add(&lhs, &rhs)?,
BinaryNumericOperator::Sub => arrow_arith::numeric::sub(&lhs, &rhs)?,
BinaryNumericOperator::Div => arrow_arith::numeric::div(&lhs, &rhs)?,
BinaryNumericOperator::FlippedSub => arrow_arith::numeric::sub(&rhs, &lhs)?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm convinced enough that we should keep these. But can we call it RSub RDiv (following Python)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -25,7 +27,7 @@ pub use take::{take, TakeFn};

use crate::ArrayData;

mod binary_numeric;
pub mod binary_numeric;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not make this pub

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I added a pub mod test_harness to this file and pub used test_binary_numeric in there instead of conditionally pub useing it in the regular module.

@@ -30,6 +30,7 @@ workspace = true
[dev-dependencies]
criterion = { workspace = true }
rand = { workspace = true }
vortex-array = { workspace = true, features = ["test_harness"] }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think features are typically kebab case? Might be wrong, can change this later though, don't block the PR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

(Seemed slightly shitty that test-harness gave you a module named test_harness 🤷. @gatesn Should I fix canonical_counter in a separate PR?)

}
}

pub fn math_symbol(&self) -> String {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for? If it's for display, it should just be the Display trait. But I don't think we should do this, since it feels like a foot-gun when printing, e.g. format!("{lhs} {op} {rhs}", ...) would be wrong for RDiv and Sub.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Too clever. Nixed.

match self {
BinaryNumericOperator::Add => "+",
BinaryNumericOperator::Sub => "-",
BinaryNumericOperator::RSub => "+",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this seems not quite right

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gone.

Copy link
Member Author

@danking danking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gatesn FWIW, this PR now has quite a few renames of test_util or test_utils to test-harness or test_harness (as appropriate for feature vs module). Can you verify this was what you had in mind?

I also fixed all the dev-dependencies to use workspace=true rather than path.

@@ -30,6 +30,7 @@ workspace = true
[dev-dependencies]
criterion = { workspace = true }
rand = { workspace = true }
vortex-array = { workspace = true, features = ["test_harness"] }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

(Seemed slightly shitty that test-harness gave you a module named test_harness 🤷. @gatesn Should I fix canonical_counter in a separate PR?)

}
}

pub fn math_symbol(&self) -> String {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Too clever. Nixed.

match self {
BinaryNumericOperator::Add => "+",
BinaryNumericOperator::Sub => "-",
BinaryNumericOperator::RSub => "+",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gone.

@danking danking merged commit 536a43f into develop Jan 6, 2025
21 checks passed
@danking danking deleted the dk/fix-arithmetic branch January 6, 2025 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants