Implement nanpercentile #2187

Open
ashwinvis opened this issue Mar 9, 2024 · 3 comments

Comments

@ashwinvis
Contributor

ashwinvis commented Mar 9, 2024

NumPy's nanpercentile can be awfully slow with axis=0 when the number of columns is huge, or vice versa. This is noted here:

And I have a use-case for this from work.

There are faster JIT versions of this now in numbagg, jax, etc., but it would be easier to ship something statically compiled à la Pythran. Any pointers to get started?

Possible follow-up easy quick-wins

  • nanquantile(a, q, ...) = nanpercentile(a, q*100, ...)
  • nanmedian(a, ...) = nanpercentile(a, 50, ...)
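
As a quick sanity check of the two identities above, here is a minimal sketch using NumPy's existing functions. The sample array and the value of q are made up for illustration; q is a fraction in [0, 1].

```python
import numpy as np

# Made-up sample data with NaNs scattered across the columns.
a = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, np.nan],
              [7.0, 8.0, 9.0]])
q = 0.25  # quantile given as a fraction

# nanquantile expressed through nanpercentile (percent = fraction * 100).
quantile_via_percentile = np.nanpercentile(a, q * 100, axis=0)

# nanmedian expressed as the 50th nanpercentile.
median_via_percentile = np.nanpercentile(a, 50, axis=0)

same_quantile = np.allclose(quantile_via_percentile, np.nanquantile(a, q, axis=0))
same_median = np.allclose(median_via_percentile, np.nanmedian(a, axis=0))
```

If both flags are true, the follow-ups would reduce to thin wrappers once nanpercentile itself exists.
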
@serge-sans-paille
Owner

Hey @ashwinvis: I'd go with implementing np.percentile with the default method, and see if we can match numpy's speed. If we can, implementing the nan version would be the next step. That's the high-level view.

For the details, numpy's algorithms seem to be based on https://github.com/numpy/numpy/blob/3b246c6488cf246d488bbe5726ca58dc26b6ea74/numpy/lib/_function_base_impl.py#L4830

@ashwinvis
Contributor Author

I agree, all the functions seem to be based on _quantile. The nan* variants simply require weeding out NaNs before running the regular variant. A close study is needed before I can decide whether I (or someone else) can do this.
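
To make the "weed out NaNs, then run the regular variant" idea concrete, here is a rough 1-D sketch in plain Python. It is not NumPy's _quantile code. The helper names percentile_linear and nanpercentile_1d are hypothetical, and the interpolation follows the usual linear rule (virtual index (n - 1) * q / 100), which is what NumPy's default "linear" method computes.

```python
import math

def percentile_linear(values, q):
    """q-th percentile (0..100) of a NaN-free sequence, linear interpolation."""
    data = sorted(values)
    if not data:
        return math.nan  # nothing left to rank
    pos = (len(data) - 1) * q / 100.0   # fractional index into the sorted data
    lo = math.floor(pos)
    hi = min(lo + 1, len(data) - 1)     # clamp at the last element
    frac = pos - lo
    return data[lo] + (data[hi] - data[lo]) * frac

def nanpercentile_1d(values, q):
    """Hypothetical nan-aware wrapper: drop NaNs, then call the regular variant."""
    kept = [v for v in values if not math.isnan(v)]
    return percentile_linear(kept, q)

median = nanpercentile_1d([3.0, math.nan, 1.0, 2.0], 50)
```

A real implementation would select rather than fully sort and would have to handle the axis argument, which is where the cost discussed in this issue comes from.
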

@ashwinvis
Contributor Author

@serge-sans-paille I saw that xtensor made a C++ port of quantile and its many variants. Do you think it can be ported using Pythran's pythonic?

https://github.com/xtensor-stack/xtensor/blob/a17f3de23f9bd08ad009539ff9876636ff7612c6/include/xtensor/xsort.hpp#L771

I know that you have commented in the past (#1476) that xtensor's approach is incompatible. I wonder whether that is also true for this function.
