Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unsafe/unchecked slice functions #6901

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions arrow-buffer/src/buffer/immutable.rs
Original file line number Diff line number Diff line change
Expand Up @@ -241,6 +241,18 @@ impl Buffer {
"the offset of the new Buffer cannot exceed the existing length: slice offset={offset} length={length} selflen={}",
self.length
);
// Safety:
// offset + length <= self.length
unsafe { self.slice_with_length_unchecked(offset, length) }
}

/// Returns a new [Buffer] that is a slice of this buffer starting at `offset`,
/// with `length` bytes.
/// Doing so allows the same memory region to be shared between buffers.
///
/// # Safety
/// `(offset + length)` MUST be `<=` [`Self::len`].
pub unsafe fn slice_with_length_unchecked(&self, offset: usize, length: usize) -> Self {
// Safety:
// offset + length <= self.length
let ptr = unsafe { self.ptr.add(offset) };
Expand Down
37 changes: 36 additions & 1 deletion arrow-buffer/src/buffer/scalar.rs
Original file line number Diff line number Diff line change
Expand Up @@ -64,14 +64,41 @@ impl<T: ArrowNativeType> ScalarBuffer<T> {
///
/// * `offset` or `len` would result in overflow
/// * `buffer` is not aligned to a multiple of `std::mem::align_of::<T>`
/// * `bytes` is not large enough for the requested slice
/// * `buffer` is not large enough for the requested slice
pub fn new(buffer: Buffer, offset: usize, len: usize) -> Self {
let size = std::mem::size_of::<T>();
let byte_offset = offset.checked_mul(size).expect("offset overflow");
let byte_len = len.checked_mul(size).expect("length overflow");
buffer.slice_with_length(byte_offset, byte_len).into()
}

/// Create a new [`ScalarBuffer`] from a [`Buffer`], and an `offset`
/// and `length` in units of `T`
///
/// # Safety
///
/// This method will be safe UNLESS any of the following:
///
/// * `offset` or `len` would result in overflow
/// * `buffer` is not aligned to a multiple of `std::mem::align_of::<T>`
/// * `buffer` is not large enough for the requested slice
pub unsafe fn new_unchecked(buffer: Buffer, offset: usize, len: usize) -> Self {
let size = std::mem::size_of::<T>();
let byte_offset = offset * size;
let byte_len = len * size;
unsafe {
buffer
.slice_with_length_unchecked(byte_offset, byte_len)
.into()
}
}

/// The length of the scalar buffer, in units of `T`.
#[inline]
pub fn len_in_elements(&self) -> usize {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so the user can do their own range checking easily, and also so that the docstring for slice_unchecked has something to refer to.

We should maybe call this function len, but I went with the more explicit, since len usually means "number of bytes" in other contexts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deref impl already means that x.len() returns this

self.buffer.len() / std::mem::size_of::<T>()
}

/// Free up unused memory.
pub fn shrink_to_fit(&mut self) {
self.buffer.shrink_to_fit();
Expand All @@ -82,6 +109,14 @@ impl<T: ArrowNativeType> ScalarBuffer<T> {
Self::new(self.buffer.clone(), offset, len)
}

/// Returns a zero-copy slice of this buffer with length `len` and starting at `offset`
///
/// # Safety
/// `(offset + len)` <= [`Self::len_in_elements`]
pub unsafe fn slice_unchecked(&self, offset: usize, len: usize) -> Self {
unsafe { Self::new_unchecked(self.buffer.clone(), offset, len) }
}

/// Returns the inner [`Buffer`]
pub fn inner(&self) -> &Buffer {
&self.buffer
Expand Down
Loading