Allow reducers to be overridden in v2 #1423

Closed
jpivarski opened this issue Apr 15, 2022 · 1 comment · Fixed by #2458

Comments

@jpivarski
Member

Description of new feature

This was the second half of #1375; I'm splitting it because it's really two issues.

Now #1375 is just for applying the policy in v1, which will prevent bugs.

This issue is for extending the capability to what users really want: allowing reducers to be overridden so that ak.sum on Vectors adds the Vectors (accounting for coordinate systems). It can only be done in v2.

@jpivarski jpivarski added the feature New feature or request label Apr 15, 2022
@ioanaif ioanaif self-assigned this Aug 18, 2022
@jpivarski
Member Author

Motivation:

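import awkward as ak
import vector

vector.register_awkward()

# array of lists of vectors in pt, phi, eta coordinates
v = ak.Array([[{"pt": 1.0, "phi": 0.0, "eta": 0.0}, {"pt": 2.0, "phi": 1.5, "eta": 0.5}]], with_name="Momentum3D")

ak.sum(v, axis=1)   # desired: CORRECTLY adds vectors in each list to produce one vector per list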

Vector needs to be able to overload record-sum in such a way as to correctly handle non-Cartesian coordinates. On our side, that means that reducers over records (such as vectors) need to be overloadable.
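To make the non-Cartesian problem concrete, here is a minimal pure-Python sketch (no Awkward or Vector involved) contrasting a field-by-field sum of (pt, phi, eta) with a sum done through Cartesian components. The helper names `to_cartesian`, `from_cartesian`, and `sum_vectors` are hypothetical and only illustrate the arithmetic; this is not how Vector implements it.

```python
import math

def to_cartesian(pt, phi, eta):
    """Convert (pt, phi, eta) to Cartesian (px, py, pz)."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))

def from_cartesian(px, py, pz):
    """Convert Cartesian (px, py, pz) back to (pt, phi, eta)."""
    pt = math.hypot(px, py)
    phi = math.atan2(py, px)
    # eta is undefined for pt == 0; returning 0.0 is an arbitrary choice for this sketch
    eta = math.asinh(pz / pt) if pt > 0 else 0.0
    return (pt, phi, eta)

def sum_vectors(vectors):
    """Add (pt, phi, eta) vectors by summing their Cartesian components."""
    px = py = pz = 0.0
    for v in vectors:
        x, y, z = to_cartesian(*v)
        px += x
        py += y
        pz += z
    return from_cartesian(px, py, pz)

# Two unit vectors in the transverse plane, 90 degrees apart
vectors = [(1.0, 0.0, 0.0), (1.0, math.pi / 2, 0.0)]

# Field-by-field sum, which is what a generic record reducer would do
naive = tuple(sum(c) for c in zip(*vectors))

# Sum through Cartesian components, which is what a vector sum means
correct = sum_vectors(vectors)
```

The field-by-field result has pt = 2 and phi = pi/2, while the Cartesian sum has pt = sqrt(2) and phi = pi/4, which is why a generic per-field reducer cannot be used for records in these coordinates.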

Overloading syntax?

ak.behavior[np.sum, "Momentum3D"] = special_function   # implemented in vector, not awkward

There's already a syntax for overloading NumPy ufuncs:

ak.behavior[np.add, "Momentum3D", "Momentum3D"] = special_function

Ufuncs are "mapper" functions; the new capability would be to overload "reducer" functions. The only ones likely ever to be overloaded are np.sum (very likely) and np.prod (not very likely). Others, like np.any or np.all, probably not, because they usually act on booleans. Maybe np.min and np.max would want overloads for records so that orderings can be given to complex objects. (If someone wants to overload np.min and np.max, they'd want to connect them somehow with the ufuncs np.less, np.greater, np.minimum, and np.maximum...)
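The lookup implied by the proposed syntax could be sketched as follows. This is a toy in plain Python, assuming a dict keyed by (reducer, record name); the names `behavior`, `add_xy`, `reduce_records`, and the record name "Vec2D" are all hypothetical, the built-in `sum` stands in for np.sum, and none of this reflects how Awkward actually dispatches.

```python
# Hypothetical registry standing in for a behavior dict; not Awkward's implementation.
behavior = {}

def add_xy(vectors):
    """Custom reducer for records named "Vec2D": sum each Cartesian component."""
    return {"x": sum(v["x"] for v in vectors), "y": sum(v["y"] for v in vectors)}

# Key is (reducer, record name), mirroring the proposed overloading syntax
behavior[sum, "Vec2D"] = add_xy

def reduce_records(reducer, record_name, lists):
    """Apply the registered override to each list; fail if none is registered."""
    try:
        override = behavior[reducer, record_name]
    except KeyError:
        raise TypeError(
            f"no {reducer.__name__} override for records named {record_name!r}"
        ) from None
    return [override(lst) for lst in lists]

# One reduced record per inner list, like ak.sum(..., axis=1) would produce
data = [[{"x": 1, "y": 2}, {"x": 3, "y": 4}], [{"x": 5, "y": 6}]]
result = reduce_records(sum, "Vec2D", data)
```

Raising an error when no override is registered, rather than falling back to a field-by-field reduction, is one possible design choice; it avoids silently producing wrong answers for non-Cartesian records.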

Should it just be records, or also arrays? (__record__: "Momentum3D" and also __array__: "string"?)
