Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closes-4096, adding range parameter to histogram functions #4078

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

vasslitvinov
Copy link
Contributor

@vasslitvinov vasslitvinov commented Feb 5, 2025

Closes #4096. Required by Bears-R-Us/arkouda-contrib#179

Adds numpy-style optional range parameter to client histogram functions:

  • histogram()
  • histogram2d()
  • histogramdd()

Switches the histograms produced by histogramdd() to contain float counts instead of integer counts, to match numpy.

While there:

  • Makes histogramdDMsg() more efficient.
  • Factors out some histogram server code.
  • Removes the dependence of histogramdd() on cumprod().
  • Cleans up Makefile to remove an extraneous / and trailing whitespace.

@@ -1996,17 +2064,31 @@ def histogramdd(
if any(b < 1 for b in bins):
raise ValueError("bins must be 1 or greater")

if not range:
range = [None for pda in sample]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right here you allow range to contain the value None.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I resolved this by introducing a different variable, range2, instead of reusing range for different things.

BTW it turns out the same trick works in the other case with xMin et al. so I applied it there as well by introducing xMin0 et al.

bin_boundaries = [linspace(a.min(), a.max(), b + 1) for a, b in zip(sample, bins)]
bins_pda = array(bins)[::-1]
dim_prod = (cumprod(bins_pda) // bins_pda)[::-1]
bin_boundaries = [linspace(r[0], r[1], b + 1) for r, b in zip(range, bins)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But down here, the value None is not indexable as r[0] and r[1].

@vasslitvinov vasslitvinov added the numpy alignment Align Arkouda API to the Numpy API label Feb 12, 2025
@vasslitvinov vasslitvinov changed the title Add range parameter to histogram functions Closes-4096, adding range parameter to histogram functions Feb 12, 2025
Signed-off-by: Vassily Litvinov <[email protected]>
Signed-off-by: Vassily Litvinov <[email protected]>
Signed-off-by: Vassily Litvinov <[email protected]>
Signed-off-by: Vassily Litvinov <[email protected]>
Signed-off-by: Vassily Litvinov <[email protected]>
Signed-off-by: Vassily Litvinov <[email protected]>
Signed-off-by: Vassily Litvinov <[email protected]>
Signed-off-by: Vassily Litvinov <[email protected]>
Signed-off-by: Vassily Litvinov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
numpy alignment Align Arkouda API to the Numpy API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add range parameter to histogram functions
2 participants