-
Notifications
You must be signed in to change notification settings - Fork 28
Optimise SSZ encoding and decoding #55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
michaelsproul
wants to merge
13
commits into
main
Choose a base branch
from
optimise-decode
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
13 commits
Select commit
Hold shift + click to select a range
62f785b
Optimise FixedVector Decode
michaelsproul b729e99
Add encoding benchmarks
michaelsproul 2f52ad0
u8 encoding benchmark
michaelsproul 56fd2f8
Update benchmarks to include encoding, optimise VariableList
michaelsproul 31aee18
Remove path patch
michaelsproul b042216
Remove unnecessary pub
michaelsproul 37e9308
Fix trailing bytes bug and add more tests
michaelsproul 2eecf7a
Remove junk
michaelsproul 902121a
Merge remote-tracking branch 'origin/main' into optimise-decode
michaelsproul 6bca389
More tests for oversize FixedVector
michaelsproul 7f2505b
Remove temp ByteVector from benches
michaelsproul da45efa
Fix tests
michaelsproul d87e180
Merge remote-tracking branch 'origin/main' into optimise-decode
michaelsproul File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,6 +4,7 @@ use derivative::Derivative; | |
use serde::Deserialize; | ||
use serde_derive::Serialize; | ||
use std::marker::PhantomData; | ||
use std::mem; | ||
use std::ops::{Deref, DerefMut, Index, IndexMut}; | ||
use std::slice::SliceIndex; | ||
use tree_hash::Hash256; | ||
|
@@ -302,6 +303,25 @@ where | |
len: 0, | ||
expected: 1, | ||
}) | ||
} else if mem::size_of::<T>() == 1 && mem::align_of::<T>() == 1 { | ||
if bytes.len() != fixed_len { | ||
return Err(ssz::DecodeError::BytesInvalid(format!( | ||
"FixedVector of {} items has {} items", | ||
fixed_len, | ||
bytes.len(), | ||
))); | ||
} | ||
|
||
// Safety: We've verified T is u8, so Vec<T> is Vec<u8> | ||
// and bytes.to_vec() produces Vec<u8> | ||
let vec_u8 = bytes.to_vec(); | ||
let vec_t = unsafe { std::mem::transmute::<Vec<u8>, Vec<T>>(vec_u8) }; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should add a test for a type that is not |
||
Self::new(vec_t).map_err(|e| { | ||
ssz::DecodeError::BytesInvalid(format!( | ||
"Wrong number of FixedVector elements: {:?}", | ||
e | ||
)) | ||
}) | ||
} else if T::is_ssz_fixed_len() { | ||
let num_items = bytes | ||
.len() | ||
|
@@ -311,17 +331,24 @@ where | |
if num_items != fixed_len { | ||
return Err(ssz::DecodeError::BytesInvalid(format!( | ||
"FixedVector of {} items has {} items", | ||
num_items, fixed_len | ||
fixed_len, num_items | ||
))); | ||
} | ||
|
||
// Check that we have a whole number of items and that it is safe to use chunks_exact | ||
if bytes.len() % T::ssz_fixed_len() != 0 { | ||
return Err(ssz::DecodeError::BytesInvalid(format!( | ||
"FixedVector of {} items has {} bytes", | ||
num_items, | ||
bytes.len() | ||
))); | ||
} | ||
|
||
let vec = bytes.chunks(T::ssz_fixed_len()).try_fold( | ||
Vec::with_capacity(num_items), | ||
|mut vec, chunk| { | ||
vec.push(T::from_ssz_bytes(chunk)?); | ||
Ok(vec) | ||
}, | ||
)?; | ||
let mut vec = Vec::with_capacity(num_items); | ||
for chunk in bytes.chunks_exact(T::ssz_fixed_len()) { | ||
vec.push(T::from_ssz_bytes(chunk)?); | ||
} | ||
|
||
Self::new(vec).map_err(|e| { | ||
ssz::DecodeError::BytesInvalid(format!( | ||
"Wrong number of FixedVector elements: {:?}", | ||
|
@@ -476,6 +503,39 @@ mod test { | |
ssz_round_trip::<FixedVector<u16, U8>>(vec![0; 8].try_into().unwrap()); | ||
} | ||
|
||
// Test byte decoding (we have a specialised code path with unsafe code that NEEDS coverage). | ||
#[test] | ||
fn ssz_round_trip_u8_len_1024() { | ||
ssz_round_trip::<FixedVector<u8, U1024>>(vec![42; 1024].try_into().unwrap()); | ||
ssz_round_trip::<FixedVector<u8, U1024>>(vec![0; 1024].try_into().unwrap()); | ||
} | ||
|
||
#[test] | ||
fn ssz_u8_len_1024_too_long() { | ||
assert_eq!( | ||
FixedVector::<u8, U1024>::from_ssz_bytes(&vec![42; 1025]).unwrap_err(), | ||
ssz::DecodeError::BytesInvalid("FixedVector of 1024 items has 1025 items".into()) | ||
); | ||
} | ||
|
||
#[test] | ||
fn ssz_u64_len_1024_too_long() { | ||
assert_eq!( | ||
FixedVector::<u64, U1024>::from_ssz_bytes(&vec![42; 8 * 1025]).unwrap_err(), | ||
ssz::DecodeError::BytesInvalid("FixedVector of 1024 items has 1025 items".into()) | ||
); | ||
} | ||
|
||
// Decoding an input with invalid trailing bytes MUST fail. | ||
#[test] | ||
fn ssz_bytes_u64_trailing() { | ||
let bytes = [1, 0, 0, 0, 2, 0, 0, 0, 1]; | ||
assert_eq!( | ||
FixedVector::<u32, U2>::from_ssz_bytes(&bytes).unwrap_err(), | ||
ssz::DecodeError::BytesInvalid("FixedVector of 2 items has 9 bytes".into()) | ||
); | ||
} | ||
|
||
#[test] | ||
fn tree_hash_u8() { | ||
let fixed: FixedVector<u8, U0> = FixedVector::try_from(vec![]).unwrap(); | ||
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using
size_of
andalign_of
removes the need to add any bounds onT
. Other crates likebytemuck
would have required us to constrainT
quite a lot, which sort of defeats the point of a generic impl.Even using
TypeId
would have required us to add'static
.The other advantage of this is that it works for types that encode as
u8
, like theParticipationFlag
s in theBeaconState
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couldn't
T
be abool
as well?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I guess it could be, but we never use bools in any consensus data structures, do we?
Would be a good test case