Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup of deprecated functions + Behavioral Clarifications #307

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

roypat
Copy link
Collaborator

@roypat roypat commented Nov 12, 2024

Summary of the PR

This PR aim to clean up the API surface of the vm-memory crate in a way that is as unintrusive as possible. It does this in two ways

  • Remove the myriad of deprecated functions we have accumulated over the years, and
  • Improve documentation to capture some of the more quirky edge cases some functions have (completely ignoring the question of whether some of these should be changed for now)

Additionally, it fixes up a goof I made when I introduced the ReadVolatile and WriteVolatile traits in #247, where I put the read_volatile_from and friends funcions into the wrong trait. When it comes to how to implement these traits for GuestMemoryMmap and VolatileSlice, I've deferred to how the old Read and Write based functions were implemented for them.

I've compiled the vm-virtio workspace and linux-loader against this. All tests pass in both, only linux-loader needs a tiny change to import the Bytes trait now.

If one squints at the code a bit, one might ask whether all the read/read_slice functions in the Bytes trait could be default functions, implemented in terms of the volatile functions (since &mut [u8] implements the volatile access traits). I played around with this, but there are subtle functional and performance changes involved in that (although it would be a great simplification), which is why I'm holding off on doing anything there until a later PR.

Requirements

Before submitting your PR, please make sure you addressed the following
requirements:

  • All commits in this PR have Signed-Off-By trailers (with
    git commit -s), and the commit message has max 60 characters for the
    summary and max 75 characters for each description line.
  • All added/changed functionality has a corresponding unit/integration
    test.
  • All added/changed public-facing functionality has entries in the "Upcoming
    Release" section of CHANGELOG.md (if no such section exists, please create one).
  • Any newly added unsafe code is properly documented.

@TimePrinciple
Copy link
Contributor

With this change, the next release should be 0.17.0 I guess 🤔

@roypat roypat force-pushed the deprecation-cleanup branch from 8d0f62f to 559c3c6 Compare November 13, 2024 07:21
This fixes a cargo warning printed in recent toolchains. Since we do not
support toolchains older than 1.38, there is no need to symlink to the
new file to the old one.

Signed-off-by: Patrick Roy <[email protected]>
@roypat roypat force-pushed the deprecation-cleanup branch from 559c3c6 to becfd0e Compare November 13, 2024 07:23
@roypat
Copy link
Collaborator Author

roypat commented Nov 13, 2024

With this change, the next release should be 0.17.0 I guess 🤔

Yup!

Copy link
Contributor

@TimePrinciple TimePrinciple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really good work, some thoughts 😉

Comment on lines 1013 to -1015
fn get_slice(&self, offset: usize, count: usize) -> Result<VolatileSlice<B>> {
let _ = self.compute_end_offset(offset, count)?;
Ok(
// SAFETY: This is safe because the pointer is range-checked by compute_end_offset, and
// the lifetime is the same as self.
unsafe {
VolatileSlice::with_bitmap(
self.addr.add(offset),
count,
self.bitmap.slice_at(offset),
self.mmap,
)
},
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, would it be better to use a macro to expand code here instead of calling one another 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer keeping the duplicated code over a macro if we think that we don't want to have another layer of indirection here (which is a fair point, the indirection in this crate are already fairly complex as-is). But the compiler should be clever enough to figure out that it can inline the call to subslice, so no actual subroutine call should happen.

Comment on lines -684 to -687
self.try_access(count, addr, |offset, len, caddr, region| -> Result<usize> {
// Check if something bad happened before doing unsafe things.
assert!(offset <= count);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about completely remove the offset parameter, and change try_access accordingly.

In my view the only place might need the assertion is before the first execution of f(total, len as usize, start, region) in try_access, but since total defaults to 0 and len is an usize, the assertion there is pointless. The rest cases are well guarded in Ok(len) match arm inside try_access.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The offset parameter is used in the read and write implementations in Bytes<GuestAddress> for T: GuestMemory :/

Well, technically, we cannot rely on the actual behavior of try_access, since it's a trait function, and some arbitrary implementation of GuestMemory could overwrite it with some absolute nonsense. The main point for me here is that even in that case, it will not cause unsoundness, because there's no unsafe code here, and all bounds are properly checked further down the stack.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The offset parameter is used in the read and write implementations in Bytes<GuestAddress> for T: GuestMemory :/

Oh I didn't read that far, let's just remove the assertion here then 😉

@roypat roypat force-pushed the deprecation-cleanup branch from 777d39e to 7be0764 Compare November 13, 2024 10:33
The call to `.subslice()` ensures that the range `[0, total]` is a valid
subslice of the `buf: VolatileSlice` passed in. But `total =
min(buf.len(), self.len()) <= buf.len()`, and thus `[0, total]` is
definitely a valid subslice of `buf`.

The `copy_{from,to}_volatile_slice` functions do not actually care about
the length of the `VolatileSlice` passed in - it relies on the safety
invariant to ensure that the passed slice has length at least `total`.
Thus it doesn't matter if we pass a slice of length `total`, or of a
length greater than `total`. It will simply access the first `total`
bytes in the slice.

Also clarify the safety comment, as some slightly mistakes seemingly
snuck in when copying them.

Signed-off-by: Patrick Roy <[email protected]>
@roypat roypat force-pushed the deprecation-cleanup branch from 7be0764 to 8fd12ef Compare November 13, 2024 11:53
src/volatile_memory.rs Outdated Show resolved Hide resolved
)
},
)
self.subslice(offset, count)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we deprecate subslice and remove it in a further release?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhh, we could. But tbh, slice.get_slice(start, count) reads a bit weirdly (and conversely, so does region.subslice(start, count)) :/

/// assert_eq!(slice.truncate(3).len(), 3);
/// assert_eq!(slice.truncate(10).len(), 5);
/// ```
pub fn truncate(&self, count: usize) -> VolatileSlice<'a, B> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Truncate somehow implies (to me) that we're updating self because of how truncate on Vec works: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.truncate.

Shouldn't we be able to obtain the same result by calling subslice with offset = 0 & count as the size we want to "truncate" to?

Copy link
Collaborator Author

@roypat roypat Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh, true. Naming is hard. Maybe .truncated() instead (e.g. with an extra 'd')?

Shouldn't we be able to obtain the same result by calling subslice with offset = 0 & count as the size we want to "truncate" to?

It'd need to be count = min(size we want to be, length). I didn't like it because subslice does a million bounds checks that are all unnecessary for these arguments. But I should probably actually benchmark it/see if the compiler manages to optimize it away

EDIT: Ah yeah, and because truncate doesn't return a Result, which makes it a bit nicer to use imo.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn, criterion indeed says there's no difference between them. Guess rustc is better than I thought at optimizing xP

So my only argument is "it avoids .unwrap()-ing something that we know can never fail"

Fix the doc comment, and add a unit test for this functions.
Particularly the "allow length 0 accesses at the end of the slice"
behavior was undocumented before.

Signed-off-by: Patrick Roy <[email protected]>
The comment on the assertion was wrong, as we are not even doing
anything unsafe in the first place. Additionally, the `offset` variable
is unused by these functions, so the assertion is at best a sanity check
that the `try_access` implementation is correct, although I don't
particularly see the value of that. Remove the assertion to prevent
confusion.

Signed-off-by: Patrick Roy <[email protected]>
The two functions contain exactly the same body, so just have one call
the other.

No functional change intended.

Signed-off-by: Patrick Roy <[email protected]>
Add a macro that automatically retries a given I/O operation if EINTR is
returned. This is a fairly common pattern in vm-memory.

Signed-off-by: Patrick Roy <[email protected]>
This function reduces the length of a volatile slice. Many of the I/O
primitives rely on volatile slice lengths to determine how many bytes
should be read/written, and in cases where we want to read/write "up to
X bytes", calling `.truncate` is needed.

Signed-off-by: Patrick Roy <[email protected]>
The old `Read`/`Write` based APIs for operating on guest memory have
been deprecated for a fairly long time now, and are superseceded by
their `ReadVolatile`/`WriteVolatile` equivalents.

The `fold` and friends functions have been deprecated for way longer.

Various `.as_ptr` APIs in volatile_memory.rs are deprecated in favor of
pointer guards.

Let's clean up the crate and remove all of these.

Signed-off-by: Patrick Roy <[email protected]>
The `read_[exact_]_volatile_from` and `write_[all_]volatile_to`
functions were intended to be replacements for their `Read` and `Write`
based counterparts. However, those used to live in the `Bytes` trait,
not the `GuestMemory` trait. Fix up this goof on my part by moving them
to the `Bytes` trait.

Signed-off-by: Patrick Roy <[email protected]>
Capture the actual behavior of various edge cases around empty buffers
and containers in the doc comment.

Signed-off-by: Patrick Roy <[email protected]>
Signed-off-by: Patrick Roy <[email protected]>
@roypat roypat force-pushed the deprecation-cleanup branch from 8fd12ef to e4f291d Compare December 4, 2024 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants