Adds lazy reader support for blobs #629

Merged: zslayton merged 36 commits into main from lazy-blobs on Sep 1, 2023

Conversation

zslayton
Contributor

Builds on outstanding PRs #612, #613, #614, #616, #617, #619, #620, #621, #622, #623, #627, and #628.

Adds lazy reader support for blobs.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@zslayton zslayton left a comment

🗺️ PR tour

Comment on lines +7 to +9
pub struct BytesRef<'data> {
    data: Cow<'data, [u8]>,
}

🗺️ When there was only a binary reader, methods that read blobs could always return a &[u8], a slice of the input buffer. Now that we also have a text reader, we need to accommodate base64-encoded blobs, which always require allocating a new Vec to hold the decoded data. BytesRef can hold either a borrowed &[u8] or an owned Vec<u8>, so it works in either situation.

This type is analogous to StrRef and SymbolRef but for blobs.
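As a rough illustration of the borrowed-or-owned idea described above, here is a minimal sketch built on the `data: Cow<'data, [u8]>` field shown in the diff. The `data()` and `is_borrowed()` methods and the two `From` impls are assumptions made for this example and are not taken from the PR.

```rust
use std::borrow::Cow;

/// Sketch of a blob view that is either a slice of the input or an owned buffer.
/// Only the `data` field mirrors the diff; everything else is hypothetical.
#[derive(Debug, Clone)]
pub struct BytesRef<'data> {
    data: Cow<'data, [u8]>,
}

impl<'data> BytesRef<'data> {
    /// Returns the bytes, whether they are borrowed or owned (hypothetical accessor).
    pub fn data(&self) -> &[u8] {
        &self.data
    }

    /// True when the bytes are a slice of the input buffer (hypothetical helper).
    pub fn is_borrowed(&self) -> bool {
        matches!(self.data, Cow::Borrowed(_))
    }
}

/// Binary-reader case: wrap a slice of the input without allocating.
impl<'data> From<&'data [u8]> for BytesRef<'data> {
    fn from(bytes: &'data [u8]) -> Self {
        BytesRef { data: Cow::Borrowed(bytes) }
    }
}

/// Text-reader case: take ownership of a freshly decoded buffer.
impl<'data> From<Vec<u8>> for BytesRef<'data> {
    fn from(bytes: Vec<u8>) -> Self {
        BytesRef { data: Cow::Owned(bytes) }
    }
}
```

Callers read the bytes through the same accessor in both cases, so only the code that constructs the value needs to know whether an allocation happened.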

@@ -18,7 +19,7 @@ pub enum RawValueRef<'data, D: LazyDecoder<'data>> {
     Timestamp(Timestamp),
     String(StrRef<'data>),
     Symbol(RawSymbolTokenRef<'data>),
-    Blob(&'data [u8]),
+    Blob(BytesRef<'data>),

🗺️ The Blob variant of RawValueRef now holds a BytesRef instead of a &[u8], so the text reader can allocate a Vec<u8> when the input is base64-encoded text. The binary reader can still return a slice of the input buffer.

@@ -23,7 +24,7 @@ pub enum ValueRef<'top, 'data, D: LazyDecoder<'data>> {
     Timestamp(Timestamp),
     String(StrRef<'data>),
     Symbol(SymbolRef<'top>),
-    Blob(&'data [u8]),
+    Blob(BytesRef<'data>),

🗺️ As with RawValueRef, the Blob variant of ValueRef now holds a BytesRef instead of a &[u8], so the text reader can allocate a Vec<u8> when the input is base64-encoded text. The binary reader can still return a slice of the input buffer.
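To make the two paths concrete, the sketch below pairs a borrowing binary path with an allocating text path behind one `Blob` variant. It is a simplified illustration only: the two-variant `ValueRef`, the `read_binary_blob` and `read_text_blob` functions, and the toy `decode_base64` helper (standard alphabet, no strict padding validation) are all hypothetical and do not reflect the crate's actual reader or decoder.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for the blob payload type.
pub struct BytesRef<'data> {
    data: Cow<'data, [u8]>,
}

impl<'data> BytesRef<'data> {
    pub fn data(&self) -> &[u8] {
        &self.data
    }
}

/// Hypothetical, heavily reduced value enum; the real enums have many more variants.
pub enum ValueRef<'data> {
    Null,
    Blob(BytesRef<'data>),
}

/// Binary path: the body is already raw bytes, so borrow it from the input.
pub fn read_binary_blob(body: &[u8]) -> ValueRef<'_> {
    ValueRef::Blob(BytesRef { data: Cow::Borrowed(body) })
}

/// Text path: base64 must be decoded, so the result owns a new Vec.
pub fn read_text_blob(base64_body: &str) -> Option<ValueRef<'static>> {
    let bytes = decode_base64(base64_body)?;
    Some(ValueRef::Blob(BytesRef { data: Cow::Owned(bytes) }))
}

/// Maps one base64 character to its 6-bit value.
fn sextet(c: u8) -> Option<u32> {
    match c {
        b'A'..=b'Z' => Some((c - b'A') as u32),
        b'a'..=b'z' => Some((c - b'a' + 26) as u32),
        b'0'..=b'9' => Some((c - b'0' + 52) as u32),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

/// Toy decoder for illustration: strips trailing '=' and rejects any other
/// character outside the standard alphabet (including whitespace).
fn decode_base64(text: &str) -> Option<Vec<u8>> {
    let trimmed = text.trim_end_matches('=');
    let mut out = Vec::with_capacity(trimmed.len() * 3 / 4);
    let mut acc: u32 = 0; // bits collected so far
    let mut bits: u32 = 0; // number of valid bits in `acc`
    for &c in trimmed.as_bytes() {
        acc = (acc << 6) | sextet(c)?;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((acc >> bits) as u8);
            acc &= (1u32 << bits) - 1; // keep only the leftover low bits
        }
    }
    Some(out)
}
```

Code that matches on `ValueRef::Blob` can call `data()` without caring which reader produced the value, which is the flexibility the comment above describes.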

@codecov
codecov bot commented Aug 20, 2023

Codecov Report

Patch coverage is 80.95% of modified lines.

Files Changed                      Coverage
src/lazy/text/encoded_value.rs       0.00%
src/lazy/text/value.rs               0.00%
src/lazy/bytes_ref.rs               50.00%
src/lazy/text/matched.rs            87.50%
src/lazy/binary/raw/value.rs       100.00%
src/lazy/raw_value_ref.rs          100.00%
src/lazy/text/buffer.rs            100.00%
src/lazy/text/raw/reader.rs        100.00%
src/lazy/value_ref.rs              100.00%

@zslayton zslayton marked this pull request as ready for review August 20, 2023 22:13

@popematt popematt left a comment

It looks like there are some edge cases that might need to be fixed, but I might be wrong.

src/lazy/text/buffer.rs (resolved)
src/lazy/text/buffer.rs (outdated, resolved)
src/lazy/text/buffer.rs (resolved)
@zslayton zslayton self-assigned this Aug 29, 2023
Base automatically changed from lazy-decimals to main September 1, 2023 15:21
@zslayton zslayton merged commit f728e08 into main Sep 1, 2023
18 checks passed
@zslayton zslayton deleted the lazy-blobs branch September 1, 2023 15:34