Enable accessing written data in a BorrowedCursor #367

Closed
a1phyr opened this issue Apr 8, 2024 · 1 comment
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api


a1phyr commented Apr 8, 2024

Proposal

Problem statement

Quoting documentation of BorrowedCursor:

Once data is written to the cursor, it becomes part of the filled portion of the underlying BorrowedBuf and can no longer be accessed or re-written by the cursor.

However, such access can be very useful, for example in Read wrappers that need to inspect the data the inner reader has just read. With the current API, read_buf can only be implemented either by initializing the whole buffer and forwarding to read, or by using unsafe code to craft a new BorrowedCursor.

Motivating examples or use cases

A crc32 checker example simplified from zip crate (original source):

pub struct Crc32Reader<R> {
    inner: R,
    hasher: Hasher,
    check: u32,
}

impl<R> Crc32Reader<R> {
    fn check_matches(&self) -> bool {
        self.check == self.hasher.clone().finalize()
    }
}

impl<R: Read> Read for Crc32Reader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let count = self.inner.read(buf)?;
        if count == 0 && !buf.is_empty() && !self.check_matches() {
            return Err(io::Error::new(io::ErrorKind::Other, "Invalid checksum"))
        }
        self.hasher.update(&buf[..count]);
        Ok(count)
    }

    fn read_buf(&mut self, mut cursor: BorrowedCursor<'_>) -> io::Result<()> {
        let written = cursor.written();
        self.inner.read_buf(cursor.reborrow())?;
        if cursor.written() == written && cursor.capacity() != 0 && !self.check_matches() {
            return Err(io::Error::new(io::ErrorKind::Other, "Invalid checksum"))
        }
        // We can't write this line
        // self.hasher.update(cursor.written_data());
        Ok(())
    }
}

In this code, a specialized read_buf implementation that forwards to self.inner.read_buf() is desirable, but not really possible without unsafe code.
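For reference, here is a minimal sketch of the `read` half of this wrapper that runs on stable Rust. `ToyHasher` is a hypothetical stand-in for the `Hasher` type above (a small bitwise CRC-32, so the sketch needs no external crate), and `Crc32Reader::new` is an assumed constructor that the example above does not show. The point is only that `read` can hash `&buf[..count]` after the inner read, which is exactly the step `read_buf` cannot express today.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for `Hasher`: bitwise CRC-32 (reflected polynomial 0xEDB88320).
#[derive(Clone)]
struct ToyHasher {
    state: u32,
}

impl ToyHasher {
    fn new() -> Self {
        ToyHasher { state: 0xFFFF_FFFF }
    }

    fn update(&mut self, data: &[u8]) {
        for &byte in data {
            self.state ^= u32::from(byte);
            for _ in 0..8 {
                // All-ones mask when the low bit is set, zero otherwise.
                let mask = (self.state & 1).wrapping_neg();
                self.state = (self.state >> 1) ^ (0xEDB8_8320 & mask);
            }
        }
    }

    fn finalize(self) -> u32 {
        !self.state
    }
}

struct Crc32Reader<R> {
    inner: R,
    hasher: ToyHasher,
    check: u32,
}

impl<R> Crc32Reader<R> {
    /// Assumed constructor, not part of the original example.
    fn new(inner: R, check: u32) -> Self {
        Crc32Reader { inner, hasher: ToyHasher::new(), check }
    }

    fn check_matches(&self) -> bool {
        self.check == self.hasher.clone().finalize()
    }
}

impl<R: Read> Read for Crc32Reader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let count = self.inner.read(buf)?;
        if count == 0 && !buf.is_empty() && !self.check_matches() {
            return Err(io::Error::new(io::ErrorKind::Other, "Invalid checksum"));
        }
        // Read back the bytes the inner reader just wrote into `buf`.
        self.hasher.update(&buf[..count]);
        Ok(count)
    }
}

fn main() {
    // 0xCBF43926 is the standard CRC-32 check value for b"123456789".
    let mut good = Crc32Reader::new(&b"123456789"[..], 0xCBF4_3926);
    let mut out = Vec::new();
    good.read_to_end(&mut out).expect("checksum should match");
    assert_eq!(out, b"123456789");

    // A wrong expected checksum is reported once the inner reader hits EOF.
    let mut bad = Crc32Reader::new(&b"123456789"[..], 0);
    assert!(bad.read_to_end(&mut Vec::new()).is_err());
}
```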

Solution sketch

Add a new method to BorrowedCursor that creates a BorrowedBuf from it, which would allow reading back the written data (not tested):

impl BorrowedCursor<'_> {
    fn unfilled_buf(&mut self) -> BorrowedBuf<'_> {
        // Note: this function can already be written using only public (unsafe) APIs.
        let init = self.buf.init - self.buf.filled;
 
        BorrowedBuf {
            buf: unsafe { self.as_mut() },
            filled: 0,
            init,
        }
    }
}

With this, the read_buf function from the previous example could be written as:

impl<R: Read> Read for Crc32Reader<R> {
    fn read_buf(&mut self, mut cursor: BorrowedCursor<'_>) -> io::Result<()> {
        let mut buf = cursor.unfilled_buf();
        self.inner.read_buf(buf.unfilled())?;

        if buf.len() == 0 && buf.capacity() != 0 && !self.check_matches() {
            return Err(io::Error::new(io::ErrorKind::Other, "Invalid checksum"))
        }
        self.hasher.update(buf.filled());
        // Number of bytes the inner reader wrote into the temporary buffer.
        let filled = buf.len();
        cursor.advance(filled);
        Ok(())
    }
}
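The bookkeeping behind this can be modeled in safe, stable Rust. The sketch below is not the real API: `ToyBuf` is a hypothetical simplification that merges `BorrowedBuf` and `BorrowedCursor` into one type and uses plain `u8` storage instead of `MaybeUninit<u8>`. It only illustrates the idea that a fresh buffer over the unfilled tail starts with `filled == 0`, inherits `init - filled` initialized bytes, and lets the caller read back what was written before advancing the outer buffer.

```rust
/// Hypothetical, simplified model of `BorrowedBuf` (not the std type).
struct ToyBuf<'a> {
    buf: &'a mut [u8],
    filled: usize, // bytes written so far
    init: usize,   // bytes known to be initialized, always >= filled
}

impl<'a> ToyBuf<'a> {
    fn new(buf: &'a mut [u8]) -> Self {
        ToyBuf { buf, filled: 0, init: 0 }
    }

    fn filled(&self) -> &[u8] {
        &self.buf[..self.filled]
    }

    fn append(&mut self, data: &[u8]) {
        let end = self.filled + data.len();
        self.buf[self.filled..end].copy_from_slice(data);
        self.filled = end;
        self.init = self.init.max(end);
    }

    /// Model of the proposed `unfilled_buf`: a new buffer over the unfilled tail.
    fn unfilled_buf(&mut self) -> ToyBuf<'_> {
        let init = self.init - self.filled;
        let start = self.filled;
        ToyBuf { buf: &mut self.buf[start..], filled: 0, init }
    }

    /// Model of `advance`: panics if `n` bytes do not fit.
    fn advance(&mut self, n: usize) {
        assert!(self.filled + n <= self.buf.len());
        self.filled += n;
        self.init = self.init.max(self.filled);
    }
}

/// Returns the bytes read back from the temporary buffer and the outer
/// buffer's filled region afterwards.
fn demo() -> (Vec<u8>, Vec<u8>) {
    let mut storage = [0u8; 8];
    let mut outer = ToyBuf::new(&mut storage);
    outer.append(b"ab");

    let mut inner = outer.unfilled_buf();
    inner.append(b"cde"); // stands in for `self.inner.read_buf(buf.unfilled())`
    let seen = inner.filled().to_vec(); // freshly written bytes can be read back
    let n = seen.len();
    outer.advance(n); // the caller must not forget this step

    (seen, outer.filled().to_vec())
}

fn main() {
    let (seen, all) = demo();
    assert_eq!(seen, b"cde");
    assert_eq!(all, b"abcde");
}
```

Note that the temporary buffer only ever exposes bytes written through it, so the earlier `b"ab"` stays out of reach, as in the proposal.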

Alternatives

  • Do nothing and say that the current state is fine.
  • Provide the function via From<&'data mut BorrowedCursor<'_>> for BorrowedBuf<'data> to make it consistent with other ways to create a BorrowedBuf.
  • As is, using unfilled_buf and advance makes a panic branch (in advance) unavoidable and carries the risk of forgetting to advance (especially in error branches). There could be a method that takes a closure and does everything right:
    impl BorrowedCursor<'_> {
        fn with_unfilled_buf<T>(&mut self, f: impl FnOnce(&mut BorrowedBuf<'_>) -> T) -> T {
            let mut buf = self.unfilled_buf();
            let result = f(&mut buf);
    
            let filled = buf.len();
            // SAFETY: `filled` bytes were written to the cursor
            unsafe { self.advance_unchecked(filled) };
            result
        }
    }
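To illustrate why the closure form is harder to misuse, here is a small stable-Rust model of it. As before, `ToyBuf` is a hypothetical simplification (plain `u8` storage, buffer and cursor merged, no `init` tracking), not the std type. The sketch shows that the outer buffer is advanced by whatever the closure wrote, even when the closure returns an error.

```rust
/// Hypothetical, simplified model of `BorrowedBuf` (not the std type).
struct ToyBuf<'a> {
    buf: &'a mut [u8],
    filled: usize, // bytes written so far
}

impl<'a> ToyBuf<'a> {
    fn new(buf: &'a mut [u8]) -> Self {
        ToyBuf { buf, filled: 0 }
    }

    fn filled(&self) -> &[u8] {
        &self.buf[..self.filled]
    }

    fn append(&mut self, data: &[u8]) {
        let end = self.filled + data.len();
        self.buf[self.filled..end].copy_from_slice(data);
        self.filled = end;
    }

    /// Model of `with_unfilled_buf`: run `f` on a buffer over the unfilled
    /// tail, then advance by however much `f` wrote, whatever it returns.
    fn with_unfilled_buf<T>(&mut self, f: impl FnOnce(&mut ToyBuf<'_>) -> T) -> T {
        let start = self.filled;
        let mut sub = ToyBuf { buf: &mut self.buf[start..], filled: 0 };
        let result = f(&mut sub);
        let written = sub.filled;
        // `written` cannot exceed the tail's length, so no panic branch is needed.
        self.filled += written;
        result
    }
}

/// Returns the closure's result and the outer buffer's filled region afterwards.
fn demo() -> (Result<(), &'static str>, Vec<u8>) {
    let mut storage = [0u8; 8];
    let mut outer = ToyBuf::new(&mut storage);
    outer.append(b"ab");

    let result: Result<(), &'static str> = outer.with_unfilled_buf(|buf| {
        buf.append(b"cd");
        // Simulate an error branch taken after some bytes were written.
        Err("failed after writing two bytes")
    });

    (result, outer.filled().to_vec())
}

fn main() {
    let (result, all) = demo();
    assert!(result.is_err());
    assert_eq!(all, b"abcd"); // advanced despite the error
}
```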

Links and related work

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.
@a1phyr a1phyr added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Apr 8, 2024
@joshtriplett
Member

The approach of turning a BorrowedCursor into a new BorrowedBuf seems like a good one. We discussed this in this week's libs-api meeting and agreed that we want to approve this.

I also think with_unfilled_buf is a good helper method that will be less error-prone. The documentation for unfilled_buf should point to that as the preferred alternative.

@joshtriplett joshtriplett added the ACP-accepted API Change Proposal is accepted (seconded with no objections) label Apr 23, 2024