
Declarative RLP Encoding/Decoding #7975

Draft · wants to merge 104 commits into base: master
Conversation


@emlautarom1 emlautarom1 commented Dec 26, 2024

Changes

  • Introduce an alternative approach to RLP encoding and decoding, based on a declarative API with support for code generation through Source Generators

Types of changes

What types of changes does your code introduce?

  • Bugfix (a non-breaking change that fixes an issue)
  • New feature (a non-breaking change that adds functionality)
  • Breaking change (a change that causes existing functionality not to work as expected)
  • Optimization
  • Refactoring
  • Documentation update
  • Build-related changes
  • Other: Description

Testing

Requires testing

  • Yes
  • No

If yes, did you write tests?

  • Yes
  • No

Notes on testing

The core library has 100% test coverage. Source generated code might not be fully covered.

Documentation

Requires documentation update

  • Yes
  • No

Requires explanation in Release Notes

  • Yes
  • No

Remarks

When we started working on refactoring our TxDecoder, one thing that came up was how unergonomic it is to work with our current RLP API. We even have comments in the code itself mentioning these difficulties, for example:

/// <summary>
/// We pay a high code quality tax for the performance optimization on RLP.
/// Adding more RLP decoders is costly (time wise) but the path taken saves a lot of allocations and GC.
/// Shall we consider code generation for this? We could potentially generate IL from attributes for each
/// RLP serializable item and keep it as a compiled call available at runtime.
/// It would be slightly slower but still much faster than what we would get from using dynamic serializers.
/// </summary>

/// <summary>
/// We pay a big copy-paste tax to maintain ValueDecoders but we believe that the amount of allocations saved
/// make it worth it. To be reviewed periodically.
/// Question to Lukasz here -> would it be fine to always use ValueDecoderContext only?
/// I believe it cannot be done for the network items decoding and is only relevant for the DB loads.
/// </summary>

This PR introduces a new RLP API based on #7334 (comment) with several improvements:

  • Describe the structure of a record and get encoding and decoding for free. No code duplication required.
  • Records can be described in terms of other records. Supports conditionals, exceptions, function calls, etc.
  • Decoding and encoding are extensible through classes that can be defined anywhere, plus some extension methods.
  • Minimal core library with 100% code coverage.
  • Supports backtracking.
  • All function calls are known ahead of time (no virtual or override). Interfaces are only used to enforce implementations.
  • Despite the extensive usage of lambdas, no closures are required (all lambdas are static). You can still use them if you want to, but overloads are provided to avoid them.
  • Automatically generate the required code through Source Generators.
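Since the generated encoders ultimately emit standard RLP, a minimal self-contained sketch of the short-payload encoding rules may help readers unfamiliar with the format (this follows the Ethereum RLP spec and is background only, not the API introduced in this PR; the long-form prefixes for payloads over 55 bytes are omitted):

```csharp
using System;
using System.Linq;

// Minimal sketch of RLP encoding for short payloads (< 56 bytes).
static class MiniRlp
{
    // A single byte < 0x80 encodes as itself; otherwise short strings
    // get a (0x80 + length) prefix.
    public static byte[] EncodeString(byte[] payload)
    {
        if (payload.Length == 1 && payload[0] < 0x80) return payload;
        if (payload.Length > 55) throw new NotSupportedException("long form omitted in this sketch");
        return new[] { (byte)(0x80 + payload.Length) }.Concat(payload).ToArray();
    }

    // A list whose concatenated payload is < 56 bytes gets a (0xc0 + length) prefix.
    public static byte[] EncodeList(params byte[][] items)
    {
        byte[] payload = items.SelectMany(i => EncodeString(i)).ToArray();
        if (payload.Length > 55) throw new NotSupportedException("long form omitted in this sketch");
        return new[] { (byte)(0xc0 + payload.Length) }.Concat(payload).ToArray();
    }
}
```

For example, `EncodeString("dog")` yields `0x83 'd' 'o' 'g'`, a well-known RLP test vector.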

@emlautarom1
Contributor Author

Updated generators to use string interpolation. There are some places where we still use StringBuilder though.

@LukaszRozmej
Member

This is what the generated code looks like for the Rose Tree (formatted):

Ok but can we do something real-life for us? Like BlockHeader for example?

Member

@LukaszRozmej LukaszRozmej left a comment


What about RlpBehaviors?


public int Length { get; private set; }

private byte[] _buffer;
Member


We often write into Netty arena-based buffers to avoid allocations, would be good to support that.

Contributor

@Scooletz Scooletz left a comment


What an interesting take on this topic 😍 A few remarks provided as comments. General:

  1. Probably [SkipLocalsInit] would be beneficial.
  2. Benchmarking, with potential ASM output.
  3. Converting some of the existing decoders and performing a mano-a-mano comparison.

public static int Read(ReadOnlySpan<byte> source)
{
Span<byte> buffer = stackalloc byte[sizeof(Int32)];
source.CopyTo(buffer[^source.Length..]);
Contributor


Unsafe.ReadUnaligned? Why copy?

Contributor Author


This method was added because BinaryPrimitives.ReadInt32BigEndian requires exactly 4 bytes to read an Int32, so we pad source with enough zeros so it can properly decode.

I've never used Unsafe.ReadUnaligned, or unsafe code in general. How would that look?

Member


Why not use Bytes class we already have?

Contributor Author


Didn't think about it, but it seems the implementations are the same:

Span<byte> fourBytes = stackalloc byte[4];
bytes.CopyTo(fourBytes[(4 - bytes.Length)..]);
return BinaryPrimitives.ReadInt32BigEndian(fourBytes);

Seems a bit overkill to add a project reference for 3 LOC.
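For reference, the padded-copy approach under discussion can be written as one self-contained helper; since BinaryPrimitives.ReadInt32BigEndian requires exactly 4 bytes, shorter inputs are right-aligned in a zeroed stack buffer (a sketch mirroring the snippets above, not necessarily the PR's final code):

```csharp
using System;
using System.Buffers.Binary;

static class RlpInt
{
    // Reads a big-endian Int32 from an RLP payload of 0..4 bytes by
    // right-aligning it in a 4-byte stack buffer (zero-initialized by default).
    public static int Read(ReadOnlySpan<byte> source)
    {
        if (source.Length > sizeof(int)) throw new ArgumentException("payload too long", nameof(source));
        Span<byte> buffer = stackalloc byte[sizeof(int)];
        source.CopyTo(buffer[^source.Length..]); // left-pad with zeros
        return BinaryPrimitives.ReadInt32BigEndian(buffer);
    }
}
```

Note that an empty payload (RLP's encoding of zero) falls out naturally: the copy is a no-op and the zeroed buffer decodes to 0.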

action(ref lengthWriter, ctx);
var serialized = new byte[lengthWriter.Length];
var contentWriter = RlpWriter.ContentWriter(serialized);
action(ref contentWriter, ctx);
Contributor


When is it used? Do we write the data twice?

Contributor Author


When is what used? We "write" the data twice: first to compute the length (LengthWriter), then to actually write the bytes into a buffer (ContentWriter).
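The two-pass pattern described here can be sketched end to end: the same write action runs twice, first against a writer that only counts bytes, then against a writer backed by an exactly-sized buffer. The names mirror the PR's snippets, but the struct layout below is an illustrative assumption, not the actual implementation:

```csharp
using System;

public ref struct RlpWriter
{
    private readonly byte[] _buffer; // null => length-measuring pass
    public int Length { get; private set; }

    private RlpWriter(byte[] buffer) { _buffer = buffer; Length = 0; }

    public static RlpWriter LengthWriter() => new(null);
    public static RlpWriter ContentWriter(byte[] buffer) => new(buffer);

    public void Write(byte b)
    {
        if (_buffer is not null) _buffer[Length] = b;
        Length++;
    }
}

public static class Rlp
{
    public delegate void WriteAction<TCtx>(ref RlpWriter writer, TCtx ctx);

    public static byte[] Serialize<TCtx>(WriteAction<TCtx> action, TCtx ctx)
    {
        var lengthWriter = RlpWriter.LengthWriter();
        action(ref lengthWriter, ctx);                 // pass 1: compute length only
        var serialized = new byte[lengthWriter.Length];
        var contentWriter = RlpWriter.ContentWriter(serialized);
        action(ref contentWriter, ctx);                // pass 2: write the bytes
        return serialized;
    }
}
```

Passing the context explicitly (rather than capturing it) is what allows the write actions to stay `static` and closure-free, as the PR description notes.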

};
}

public static RlpWriter ContentWriter(byte[] buffer)
Contributor


Have you considered adding an option aligned with Utf8JsonWriter, which has a ctor that accepts IBufferWriter<byte>? This would unfortunately mean that discontiguous chunks would need to be supported, but it could allow writing over anything. Maybe this could help address the Netty comment from @LukaszRozmej

https://learn.microsoft.com/en-us/dotnet/api/system.text.json.utf8jsonwriter.-ctor?view=net-9.0#system-text-json-utf8jsonwriter-ctor(system-buffers-ibufferwriter((system-byte))-system-text-json-jsonwriteroptions)

Contributor Author

@emlautarom1 emlautarom1 Jan 2, 2025


Interesting. I picked byte[] as a safe default, but there is no reason other types could not be used.

  • IBufferWriter<T> is quite small, it's part of the standard library, and it supports Span-based APIs.

  • NettyRlpStream is based on IByteBuffer, which is defined in DotNetty.Buffers. It supports the operations we need, but it's quite large.

Now, the issue is that IByteBuffer and IBufferWriter are unrelated, so we would need to pick one (unless we start doing conversions).

Member


IBufferWriter is probably more useful; we can try making an adapter to IByteBuffer if we want.
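A rough shape for such an adapter (DotNetty's IByteBuffer is stubbed down here to the one member the sketch needs, so treat the interface as an illustrative stand-in, not the real DotNetty surface): callers write into a pooled scratch array via GetSpan/GetMemory, and Advance flushes the written bytes into the underlying buffer.

```csharp
using System;
using System.Buffers;

// Stand-in for the relevant slice of DotNetty's IByteBuffer.
public interface IByteBufferLike
{
    void WriteBytes(byte[] src, int srcIndex, int length);
}

// Exposes an IByteBuffer-like sink through IBufferWriter<byte>.
public sealed class ByteBufferWriterAdapter : IBufferWriter<byte>
{
    private readonly IByteBufferLike _sink;
    private byte[] _scratch = Array.Empty<byte>();

    public ByteBufferWriterAdapter(IByteBufferLike sink) => _sink = sink;

    public Memory<byte> GetMemory(int sizeHint = 0) { EnsureScratch(sizeHint); return _scratch; }
    public Span<byte> GetSpan(int sizeHint = 0) { EnsureScratch(sizeHint); return _scratch; }

    // Per the IBufferWriter contract, Advance(count) commits the first
    // `count` bytes written since the last GetSpan/GetMemory call.
    public void Advance(int count) => _sink.WriteBytes(_scratch, 0, count);

    private void EnsureScratch(int sizeHint)
    {
        int size = Math.Max(sizeHint, 256);
        if (_scratch.Length < size)
        {
            if (_scratch.Length > 0) ArrayPool<byte>.Shared.Return(_scratch);
            _scratch = ArrayPool<byte>.Shared.Rent(size);
        }
    }
}

// Minimal in-memory sink used for demonstration.
public sealed class ListSink : IByteBufferLike
{
    public readonly System.Collections.Generic.List<byte> Bytes = new();
    public void WriteBytes(byte[] src, int srcIndex, int length)
    {
        for (int i = 0; i < length; i++) Bytes.Add(src[srcIndex + i]);
    }
}
```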


public void Initialize(IncrementalGeneratorInitializationContext context)
{
var provider = context.SyntaxProvider.CreateSyntaxProvider(
Contributor


Consider attribute-based creation using ForAttributeWithMetadataName. It should be much cheaper than scanning all records and only then selecting on an attribute basis.
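The suggested pipeline could take roughly this shape. ForAttributeWithMetadataName is a real Roslyn API (4.3+), but the attribute name used below is a hypothetical placeholder for illustration, and the fragment needs the Microsoft.CodeAnalysis packages to compile:

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public void Initialize(IncrementalGeneratorInitializationContext context)
{
    var provider = context.SyntaxProvider.ForAttributeWithMetadataName(
        "Nethermind.Serialization.Rlp.RlpSerializableAttribute", // hypothetical name
        predicate: static (node, _) => node is RecordDeclarationSyntax,
        transform: static (ctx, _) => (RecordDeclarationSyntax)ctx.TargetNode);

    context.RegisterSourceOutput(provider, static (spc, record) =>
    {
        // emit the encoder/decoder for `record` here
    });
}
```

Roslyn indexes attribute names up front, so only syntax nodes carrying the attribute ever reach the predicate and transform, which is where the cost saving over CreateSyntaxProvider comes from.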

@emlautarom1
Contributor Author

@LukaszRozmej RlpBehaviors are not explicitly supported by the API, but you can get the same behavior by manually passing any "context" when reading/writing.

Note that we recently changed the Rlp interface so trailing bytes throw by default. If you want more control over what happens before or after reading/writing you can use the RlpReader and RlpWriter APIs directly.

@emlautarom1
Contributor Author

I've added a benchmark that encodes and decodes an AccessList as defined in:

public class AccessList : IEnumerable<(Address Address, AccessList.StorageKeysEnumerable StorageKeys)>

Results on my machine are the following:

| Method  | Mean     | Error   | StdDev  | Ratio |
|-------- |---------:|--------:|--------:|------:|
| Current | 343.9 us | 1.43 us | 1.34 us |  1.00 |
| Fluent  | 834.9 us | 2.34 us | 2.19 us |  2.43 |

There is room for a possible optimization: some records like Address have a known, fixed byte size, which we can leverage to avoid traversing the value twice: once to compute the length and once to actually write the bytes.

@emlautarom1
Contributor Author

Replacing Marshal.SizeOf<T>() with sizeof(T) and some unsafe annotations gives quite a boost at no cost:

| Method  | Mean     | Error   | StdDev  | Ratio | RatioSD |
|-------- |---------:|--------:|--------:|------:|--------:|
| Current | 359.8 us | 5.03 us | 4.70 us |  1.00 |    0.02 |
| Fluent  | 626.2 us | 2.90 us | 2.42 us |  1.74 |    0.02 |

var size = sizeof(T);
Span<byte> bigEndian = stackalloc byte[size];
value.WriteBigEndian(bigEndian);
bigEndian = bigEndian.TrimStart((byte)0);
Contributor Author


TrimStart does not seem to be heavily optimized. There might be something better we can use, especially considering that we're removing leading zeros.
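One candidate worth benchmarking (an assumption, not a measured result): MemoryExtensions.IndexOfAnyExcept, available since .NET 7 and vectorized, directly expresses "skip leading zeros":

```csharp
using System;

static class Leading
{
    // Skips leading zero bytes of a big-endian encoding; returns an
    // empty span when every byte is zero (i.e. the value is zero).
    public static ReadOnlySpan<byte> TrimLeadingZeros(ReadOnlySpan<byte> bigEndian)
    {
        int firstNonZero = bigEndian.IndexOfAnyExcept((byte)0);
        return firstNonZero < 0 ? ReadOnlySpan<byte>.Empty : bigEndian[firstNonZero..];
    }
}
```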

@emlautarom1 emlautarom1 requested review from Scooletz and LukaszRozmej and removed request for Scooletz January 2, 2025 19:17
3 participants