Skip to content

slicing a slice with an array without expanding the slice #10580

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

keewis
Copy link
Collaborator

@keewis keewis commented Jul 28, 2025

The lazy indexing machinery currently uses _index_indexer_1d to keep indexing lazy (for orthogonal indexing only), which tries very hard to combine multiple sequential indexers into a single indexer.

However, it currently uses _expand_slice when indexing a slice with an array, which has the potential to run out of memory for very large dimension sizes of the indexed array.

Instead of materializing, we can just combine the slice (s) with the array (idx) by:

new_idx = idx * s.step + s.start

Interestingly, this works regardless of whether step is negative or positive, but we do have to raise an IndexError if there's any value >= size.

  • Tests added

# shortcut for the usual case
return old_indexer
if isinstance(old_indexer, slice):
if isinstance(applied_indexer, slice):
indexer = slice_slice(old_indexer, applied_indexer, size)
elif isinstance(applied_indexer, integer_types):
indexer = range(*old_indexer.indices(size))[applied_indexer] # type: ignore[assignment]
elif is_full_slice(old_indexer):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this fix point (3) in #10311 (comment)

This is "combined" with the arrayized version of BasicIndexer(slice(None), slice(None), slice(None))

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, because for vectorized indexing we use _combine_indexers, not _index_indexer_1d (but see also my comment in #10311 about whether we can frame that particular case – fancy indexing along a single dimension – as an outer indexer)

Copy link
Contributor

@dcherian dcherian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! A peakmem benchmark would be nice but I think our infra is broken at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants