Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: disallow leading zeros in RLP decoding #335

Merged
merged 10 commits into from
Oct 31, 2023

Conversation

Rjected
Copy link
Contributor

@Rjected Rjected commented Oct 19, 2023

Motivation

Currently Uint RLP decoding ignores leading zeros.

Solution

Disallow leading zeros.

PR Checklist

  • Added Tests
  • Added Documentation
  • Updated the changelog

@prestwich
Copy link
Collaborator

This would be a breaking behavior change. Is there a motivating reason?

@codecov
Copy link

codecov bot commented Oct 19, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (8089eab) 80.48% compared to head (867476a) 80.46%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #335      +/-   ##
==========================================
- Coverage   80.48%   80.46%   -0.03%     
==========================================
  Files          54       54              
  Lines        6140     6179      +39     
==========================================
+ Hits         4942     4972      +30     
- Misses       1198     1207       +9     
Files Coverage Δ
src/support/alloy_rlp.rs 100.00% <100.00%> (ø)
src/support/fastrlp.rs 98.07% <92.85%> (-0.82%) ⬇️

... and 4 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Rjected
Copy link
Contributor Author

Rjected commented Oct 19, 2023

This would be a breaking behavior change. Is there a motivating reason?

Sorry, should have been more specific in the PR description. The main motivation is that leading zeros on RLP uints is defined to be invalid, and encoding would also never output uints encoded this way. This also makes writing certain kinds of fuzz tests possible, specifically fuzz tests that:

  • Take fuzz data as input
  • Decode TX or other struct with an int
  • if successful, encode struct
  • compare encoding output to bytes input

If we accept invalid RLP input, and are more permissive in decoding than encoding as a result, these fuzz tests quickly fail and do not act as meaningful fuzz tests.

@Rjected
Copy link
Contributor Author

Rjected commented Oct 27, 2023

@prestwich just bumping, any thoughts on this given the context?

@prestwich
Copy link
Collaborator

The ethereum.org spec is clear as mud, but the yellowpaper appears to be unambiguous about this (see (197)) . Accepting the non-minimal uint is off-spec, so we should fix it. However, because this silently changes behavior in a way that breaks downstream users, we should release it as a major version bump

Copy link
Collaborator

@prestwich prestwich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the changelog :)

Copy link
Owner

@recmo recmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Bug fix. Not a breaking change.

src/support/alloy_rlp.rs Show resolved Hide resolved
src/support/alloy_rlp.rs Outdated Show resolved Hide resolved
// leading zeros
if !bytes.is_empty() && bytes[0] == 0 {
return Err(Error::LeadingZero);
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RLP spec explicitly says

integers must be represented in big-endian binary form with no leading zeroes (thus making the integer value zero equivalent to the empty byte array).

meaning that the old accepting behaviour was not according to spec and a bug. Fixing bugs is not a breaking change.

There's 'robustness principle' argument that could be made to make this error optionally a warning (i.e. introduce a non-strict mode). But that seems more effort than it's worth. Unless someone volunteers the strict mode should be default.

@recmo
Copy link
Owner

recmo commented Oct 28, 2023

If we accept invalid RLP input, and are more permissive in decoding than encoding as a result, these fuzz tests quickly fail and do not act as meaningful fuzz tests.

This proposed test assumes that encoding is canonical. This assumption is only true when leading zeros are rejected.

[ethereum.org] and [yellow paper]

Both yellow paper and ethereum.org spec clearly state leading zeros are an error. While neither mention it explicitly, the implicit goal is to make it such that there is only one canonical RLP encoding. This is important since it RLP encoding is often used to hash data structures. Having multiple valid hashes for the same data structure leads to all forms of madness.

@recmo
Copy link
Owner

recmo commented Oct 28, 2023

However, because this silently changes behavior in a way that breaks downstream users, we should release it as a major version bump

The accepting behavior was a bug, any downstream users depending on this bug are in error (as is clear by the RLP spec). IMO this a bug fix and not a breaking change.

@prestwich
Copy link
Collaborator

I've also submitted a PR to the RLP spec doc on the Ethereum website to disambiguate this

the worst part of this is how it breaks the layer boundary between RLP and "higher-order" protocols. You cannot accurately decode RLP without knowing whether the higher-order protocol contains positive integers

ethereum/ethereum-org-website#11532

@Rjected
Copy link
Contributor Author

Rjected commented Oct 30, 2023

Updated the changelog and made other requested changes, @prestwich mind reviewing again?

Copy link
Contributor

@DaniPopes DaniPopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also update fastrlp? It should be an easy copy-paste

@Rjected
Copy link
Contributor Author

Rjected commented Oct 30, 2023

Can you also update fastrlp? It should be an easy copy-paste

good catch, will do

Copy link
Collaborator

@prestwich prestwich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more changelog nit

CHANGELOG.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@prestwich prestwich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙌

@prestwich prestwich merged commit 8229a30 into recmo:main Oct 31, 2023
19 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants