apacheGH-39984: [Python] Add ChunkedArray import/export to/from C (apache#39985)

### Rationale for this change

ChunkedArrays have an unambiguous representation as a stream of arrays. apache#39455 added the ability to import/export in C++...this PR wires up the new functions in pyarrow.

### What changes are included in this PR?

- Added `__arrow_c_stream__()` and `_import_from_c_capsule()` to the `ChunkedArray`

### Are these changes tested?

Yes! Tests were added.

### Are there any user-facing changes?

Yes! But I'm not sure where the protocol methods are documented.

```python
import pyarrow as pa
import nanoarrow as na

chunked = pa.chunked_array([pa.array([0, 1, 2]), pa.array([3, 4, 5])])
[na.c_array_view(item) for item in na.c_array_stream(chunked)]
```

    [<nanoarrow.c_lib.CArrayView>
     - storage_type: 'int64'
     - length: 3
     - offset: 0
     - null_count: 0
     - buffers[2]:
       - <bool validity[0 b] >
       - <int64 data[24 b] 0 1 2>
     - dictionary: NULL
     - children[0]:,
     <nanoarrow.c_lib.CArrayView>
     - storage_type: 'int64'
     - length: 3
     - offset: 0
     - null_count: 0
     - buffers[2]:
       - <bool validity[0 b] >
       - <int64 data[24 b] 3 4 5>
     - dictionary: NULL
     - children[0]:]

```python
stream_capsule = chunked.__arrow_c_stream__()
chunked2 = chunked._import_from_c_capsule(stream_capsule)
chunked2
```

    <pyarrow.lib.ChunkedArray object at 0x105bb70b0>
    [
      [
        0,
        1,
        2
      ],
      [
        3,
        4,
        5
      ]
    ]

* Closes: apache#39984

Lead-authored-by: Dewey Dunnington <[email protected]>
Co-authored-by: Dewey Dunnington <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>