
_merge_chunk_array could allocate memory more carefully #2038

Open
rabernat opened this issue Jul 14, 2024 · 5 comments


rabernat commented Jul 14, 2024

I was reviewing the Pipeline and have some observations. A key function is _merge_chunk_array:

def _merge_chunk_array(
    self,
    existing_chunk_array: NDBuffer | None,
    value: NDBuffer,
    out_selection: SelectorTuple,
    chunk_spec: ArraySpec,
    chunk_selection: SelectorTuple,
    drop_axes: tuple[int, ...],
) -> NDBuffer:
    if is_total_slice(chunk_selection, chunk_spec.shape) and value.shape == chunk_spec.shape:
        return value
    if existing_chunk_array is None:
        chunk_array = chunk_spec.prototype.nd_buffer.create(
            shape=chunk_spec.shape,
            dtype=chunk_spec.dtype,
            order=chunk_spec.order,
            fill_value=chunk_spec.fill_value,
        )
    else:
        chunk_array = existing_chunk_array.copy()  # make a writable copy
    if chunk_selection == () or is_scalar(value.as_ndarray_like(), chunk_spec.dtype):
        chunk_value = value
    else:
        chunk_value = value[out_selection]
        # handle missing singleton dimensions
        if drop_axes != ():
            item = tuple(
                None  # equivalent to np.newaxis
                if idx in drop_axes
                else slice(None)
                for idx in range(chunk_spec.ndim)
            )
            chunk_value = chunk_value[item]
    chunk_array[chunk_selection] = chunk_value
    return chunk_array

There are two issues with this implementation:

  1. The if existing_chunk_array is None: code path always allocates a new empty array and then copies data from the original selection into it. Why? This seems like a great opportunity to avoid an extra memory copy. Can't we just use a view of the original data (value) when appropriate? This happens e.g. when writing chunks which are smaller than the selection.
  2. The same code path always allocates a buffer of shape chunk_spec.shape, even for the last chunk of an array, which might be much smaller than the chunk shape.

An example which highlights both of these inefficiencies is as follows:

import numpy as np
import zarr
zarr.array(np.ones(101), chunks=(100,))
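To make the second issue concrete, here is a plain-numpy sketch (not zarr internals; the variable names are illustrative) of what the code above effectively does for this example's last chunk:

```python
import numpy as np

# The 101-element array is split into chunks of 100, so the last chunk
# holds a single element -- yet a full chunk-sized buffer is allocated.
chunk_shape = (100,)              # chunk_spec.shape
value = np.ones(1)                # data destined for the last, partial chunk
chunk_selection = (slice(0, 1),)

# existing_chunk_array is None, so a full-size buffer is created...
chunk_array = np.full(chunk_shape, fill_value=0.0, dtype=value.dtype)
# ...and the single element is copied into it.
chunk_array[chunk_selection] = value

print(chunk_array.nbytes)  # 800 bytes allocated to hold 8 bytes of data
```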
@rabernat

Related: this is_total_slice check will generally fail for the last chunk because the last chunk selection will not be the same size as chunk_spec.shape.

None if is_total_slice(chunk_selection, chunk_spec.shape) else byte_setter,
chunk_spec.prototype,

When writing, this will trigger unnecessary trips to the store to read chunks, even though the data will be completely overwritten.
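For illustration, here is a minimal standalone total-slice check (a sketch, not zarr's actual is_total_slice) showing why the final, partial chunk fails the test:

```python
# A selection is "total" if every slice spans the full chunk extent.
# Sketch only; zarr's real is_total_slice handles more selection types.
def is_total_slice(selection: tuple, shape: tuple) -> bool:
    return all(
        sl == slice(None) or (sl.start in (0, None) and sl.stop == size)
        for sl, size in zip(selection, shape)
    )

chunk_shape = (100,)
# A full chunk: the selection spans all 100 elements.
assert is_total_slice((slice(0, 100),), chunk_shape)
# The last chunk of a length-101 array: only 1 element is selected, so
# the check fails even though every stored byte will be overwritten.
assert not is_total_slice((slice(0, 1),), chunk_shape)
```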


d-v-b commented Jul 15, 2024

Good catch @rabernat! I don't know this code very well, but I have a few high-level additions:

  • is_total_slice has been a source of problems / confusion in the past. I suspect we need aggressive testing of this routine.
  • I wish we had performance tooling that could show the memory / IO load associated with your examples. There must be something in the Python space we could use for this.

@rabernat

Tracemalloc should allow some tracing of allocations.

A good target would be the following: if I create a MemoryStore with uncompressed data, I should be able to get this data all the way to a numpy array without copying any bytes.
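As a stdlib-only sketch of how such a check could work, tracemalloc can distinguish an explicit copy from a zero-copy view (the buffer here is a stand-in for an uncompressed chunk, not zarr's actual store machinery):

```python
import tracemalloc

buf = bytes(1_000_000)  # stand-in for an uncompressed chunk in a store

tracemalloc.start()
view = memoryview(buf)   # zero-copy: no new allocation of the payload
copied = bytes(view)     # explicit copy: roughly 1 MB gets allocated
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Only the explicit copy shows up in the traced peak (~1 MB); the view
# costs just a small wrapper object.
print(peak)
```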

@dcherian

I've used the memory_profiler IPython extension to good effect. More recently, scalene has been getting a lot of attention.


d-v-b commented Jul 15, 2024

Maybe it would make sense to create a dedicated hatch environment with useful benchmarking / performance monitoring tools, so we could do stuff like:

hatch run perf:<invoke memory monitoring tool>
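A hypothetical hatch environment along those lines (the env name, dependencies, and script are illustrative, not an agreed-upon config) might look like:

```toml
# Sketch of a pyproject.toml addition; names are placeholders
[tool.hatch.envs.perf]
extra-dependencies = [
  "memory-profiler",
  "scalene",
]

[tool.hatch.envs.perf.scripts]
profile = "python -m scalene {args}"
```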
