Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hyperchunk2d #310

Merged
merged 6 commits into from
Feb 16, 2024
Merged

Hyperchunk2d #310

merged 6 commits into from
Feb 16, 2024

Conversation

jreadey
Copy link
Member

@jreadey jreadey commented Feb 14, 2024

This PR adds hyperchunking support for multi-dimensional datasets.

hsds/dset_lib.py Outdated
arr_points[idx] = pt
else:
kwargs = {"dim": dim + 1, "chunk_index": chunk_index, "factors": factors}
next_index = arr_index + (i * np.prod(factors[1:]))
Copy link
Contributor

@mattjala mattjala Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this use np.prod(factors[i:]) to account for all remaining dimensions? This might not matter right now since hyperchunking is limited to two dimensions at most, but it would be easier to understand and future-proof

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is a bug that shows up with more than two dimensions. Fixed this and add a 3D test. Should work for any number of dimensions now

@@ -20,6 +21,37 @@ def _chunk_start(c):
return start


def getHyperChunkFactors(chunk_dims, hyper_dims):
""" return list of rations betwen chunk and hyperchunkdims """
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rations -> ratios

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -66,6 +67,41 @@ def getFillValue(dset_json):
return arr


def _get_arr_pts(arr_points, arr_index, pt, dim=0, chunk_index=None, factors=None):
""" recursive function to fill in chunk locations for hyperchunk selection.
arr_points: numpy array of shape (num_chunks, rank)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be more clear to reword this to something like Recursive function which fills arr_points with the coordinates to locate a hyperchunk selection's corresponding chunks in a chunk table.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds better. Updated

hsds/chunk_dn.py Outdated
else:
hyper_index = param_hyper_index
log.debug(f"hyper_index: {hyper_index}")
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

mattjala
mattjala previously approved these changes Feb 15, 2024
Copy link
Contributor

@mattjala mattjala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The integration test failures are due to #311

@jreadey jreadey merged commit 6c85261 into master Feb 16, 2024
9 of 10 checks passed
@jreadey jreadey deleted the hyperchunk2d branch April 29, 2024 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants