Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: [ADR-072] Sign Mode Unified #22714

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

raynaudoe
Copy link
Contributor

@raynaudoe raynaudoe commented Dec 2, 2024

Description

ADR for proposing a new sign mode, SIGN_MODE_UNIFIED, in replacement to SIGN_MODE_TEXTUAL.


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title, you can find examples of the prefixes below:
  • confirmed ! in the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • reviewed "Files changed" and left comments if necessary
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • updated the relevant documentation or specification, including comments for documenting Go code
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

Please see Pull Request Reviewer section in the contributing guide for more information on how to review a pull request.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

Summary by CodeRabbit

  • New Features
    • Introduced SIGN_MODE_UNIFIED, a new signing mode to simplify the signing process and enhance user experience.
  • Documentation
    • Added comprehensive documentation detailing the features, benefits, and migration strategy for SIGN_MODE_UNIFIED.

@raynaudoe raynaudoe requested a review from a team as a code owner December 2, 2024 12:21
Copy link
Contributor

coderabbitai bot commented Dec 2, 2024

📝 Walkthrough

Walkthrough

The pull request introduces a new signing mode called SIGN_MODE_UNIFIED for the Cosmos SDK, aimed at simplifying the signing process and enhancing user and developer experiences. It details the use of runtime metadata in JSON format to define message specifications, addressing challenges with existing signing modes. The document outlines the components of SIGN_MODE_UNIFIED, the signing process, and a migration strategy for transitioning from legacy modes.

Changes

File Path Change Summary
docs/architecture/adr-076-sign-mode-unified.md Added a new ADR detailing the SIGN_MODE_UNIFIED, its components, signing process, and migration strategy.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Signer
    participant SDK
    participant MerkleTree

    User->>SDK: Construct Transaction
    SDK->>Signer: Request Signing
    Signer->>SDK: Provide Metadata
    SDK->>MerkleTree: Create Merkle Structure
    MerkleTree-->>SDK: Return Merkle Tree
    SDK->>User: Present Transaction for Approval
    User->>SDK: Approve Transaction
    SDK->>Signer: Sign Transaction
    Signer-->>SDK: Return Signed Transaction
    SDK->>User: Broadcast Transaction
Loading

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (4)
docs/architecture/adr-076-sign-mode-unified.md (4)

47-47: Improve sentence clarity

The sentence can be made more concise by removing the wordy phrase "in order to".

-The metadata file is a JSON file that contains all the types and messages definitions used in the SDK, is meant to be generated at build time and stored in the node, using the protobuf definition files as source. This file (in fact, chunks of its encoded version) are sent to the signer in order to be used to decode and show the transaction's fields without the need to store locally all definitions of types and messages.
+The metadata file is a JSON file that contains all the types and messages definitions used in the SDK, is meant to be generated at build time and stored in the node, using the protobuf definition files as source. This file (in fact, chunks of its encoded version) are sent to the signer to decode and show the transaction's fields without the need to store locally all definitions of types and messages.
🧰 Tools
🪛 LanguageTool

[style] ~47-~47: Consider a shorter alternative to avoid wordiness.
Context: ...encoded version) are sent to the signer in order to be used to decode and show the transact...

(IN_ORDER_TO_PREMIUM)


288-290: Add blank line before list item

For better readability, add a blank line before the list item.

### Backwards Compatibility

+
**Preservation of Existing Sign Modes:**
* Existing sign modes remain supported to ensure that legacy wallets and clients continue to function without interruption.
🧰 Tools
🪛 Markdownlint (0.35.0)

289-289: null
Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


327-327: Add newline at end of file

Add a newline character at the end of the file to follow Markdown best practices.

https://polkadot-fellows.github.io/RFCs/approved/0078-merkleized-metadata.html
+
🧰 Tools
🪛 Markdownlint (0.35.0)

327-327: null
Files should end with a single newline character

(MD047, single-trailing-newline)


168-170: Consider using a different type for coin amount

In the metadata example, the amount field of the Coin type is defined as a string (type 1). Consider using a more appropriate numeric type for coin amounts to ensure proper validation and prevent potential issues with numeric operations.

                {
                  "name": "amount",
-                  "type": 1
+                  "type": {
+                    "def": {
+                      "Primitive": "Uint256"
+                    }
+                  }
                }
📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between f350775 and fbdbeb3.

📒 Files selected for processing (1)
  • docs/architecture/adr-076-sign-mode-unified.md (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
docs/architecture/adr-076-sign-mode-unified.md (1)

Pattern **/*.md: "Assess the documentation for misspellings, grammatical errors, missing documentation and correctness"

🪛 LanguageTool
docs/architecture/adr-076-sign-mode-unified.md

[style] ~47-~47: Consider a shorter alternative to avoid wordiness.
Context: ...encoded version) are sent to the signer in order to be used to decode and show the transact...

(IN_ORDER_TO_PREMIUM)

🪛 Markdownlint (0.35.0)
docs/architecture/adr-076-sign-mode-unified.md

289-289: null
Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


327-327: null
Files should end with a single newline character

(MD047, single-trailing-newline)

@@ -0,0 +1,327 @@
# ADR 075: Sign Mode Unified
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix inconsistency in ADR number

The ADR number in the title (075) doesn't match the filename (076). Please ensure consistency between the title and filename.

-# ADR 075: Sign Mode Unified
+# ADR 076: Sign Mode Unified
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# ADR 075: Sign Mode Unified
# ADR 076: Sign Mode Unified

`SIGN_MODE_UNIFIED` proposes using CBOR (Concise Binary Object Representation) for transaction encoding, offering several critical advantages:

* **Deterministic Encoding**: CBOR's canonical form ensures consistent merkleization of transaction data, which is essential for verification and proof generation.
* **Human-readable Diagnostic Notation**: While CBOR is binary, it provides a diagnostic notation format that maintains transparency and reduces blind signing risks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Human readability was a design goal (and powerful feature) of AminoJSON (and Textual). The user is presented with a literal representation of the tx bytes which are being signed over. From https://www.rfc-editor.org/rfc/rfc8949.html

JSON: {_ "a": 1, "b": [_ 2, 3]}
CBOR: 0xbf61610161629f0203ffff

I don't think we can say that CBOR is human readable, its use in a wallet UX would be more akin to blind signing. I'm not a security expert but isn't this a bit of step backwards for verifiability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for pointing that out, true!
We are proposing CBOR because it enables metadata merkleization, a central feature of this new sign mode. While CBOR is not human-readable in its raw form, any UI can easily decode it to present the data to users. Blind signing would only occur if a wallet implements a "lazy" UI that doesn't decode the CBOR string, which would likely make such a wallet less appealing to users.

This approach doesn't represent a step backwards in terms of verifiability, because:

  • CBOR transactions can always be decoded for human readability. We rely on widely-tested codec libraries for this, just as we do with other formats.
  • Since the transaction contains the metadata's root hash, the signer can verify that the decoded transaction chunks comply with the metadata sent by the node. This provides a "first-order" verification.
  • Finally, since the sign doc contains the root hash (meaning the user is signing both tx_blob + metadata_root_hash), validators can verify that the transaction was displayed (and signed) using the consensual metadata tree, providing a "second-order" verification.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are proposing CBOR because it enables metadata merkleization, a central feature of this new sign mode.

What specifically about CBOR enables metadata merkleization as opposed to say JSON? It's simply an encoding format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original question here was about human readability. Given the amount of detail that the previous spec for textual put into it human readability (or even amino JSON), I find "CBOR transactions can always be decoded for human readability" to be a pretty underwhelming answer.

Validator->>Validator: Compare with local values
```

## Decision
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were any alternatives considered? The simplest which comes to mind is a constrained JSON spec. Given that proto is the defacto IDL (it remains so in this design since the JSON metadata is derived from it) proto reflect could drive marshaling pretty easily. Downside being that clients still have a strong dependency on a proto tool chain, I think.

* *Metadata Digest*: A compact representation of the complete metadata set used for transaction construction. When hashed (digest hash), this struct serves as the root of the merkle tree, enabling efficient verification of metadata integrity.
* *Merkleized Structure*: The metadata is organized into a merkle tree, allowing for efficient proofs and verification.

### Metadata file
Copy link
Member

@kocubinski kocubinski Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a powerful abstraction and to me seems like a viable pathway to completely decouple a wallet or execution environment from protobuf. Did I get that right? If so we should push that as a big pro! I think proto has been troublesome on front ends, hardware wallets and VMs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right, the wallet / signers should be completely decoupled from the protobuf definition files

Copy link
Member

@aaronc aaronc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, I assume there are valid reasons for abandoning the SIGN_MODE_TEXTUAL effort, but what are they? It would be good to document all of these.

I do want to note that in quickly reviewing the original SIGN_MODE_TEXTUAL ADR 050 and its Annexes for comparison, a lot of attention was given towards details of the client user experience. Things like line length and display of integers and non-ASCII characters are carefully considered. This ADR, on the other hand, has only this to say regarding human readability:

* **Human-readable Diagnostic Notation**: While CBOR is binary, it provides a diagnostic notation format that maintains transparency and reduces blind signing risks.

Most of the ADR seems to focus on how schema metadata can be embedded in a node in a provable way and even goes as far as constructing a new merkle tree format. I would note that all nodes currently do make protobuf descriptors available for querying, but not in a provable way. We did actually discuss storing the protobuf descriptors in state and decided against it 1) because this introduces a separate build step (also proposed in this ADR) and 2) introduces non-determinism if a user so much as adds a comment to a .proto file in a separate compilation of a node's binary. Even if we did embed schema metadata in a provable way, why do we need a separate merkle tree structure besides the one already provided for proving state? Putting that aside, I'm really not sure why this is an issue at all. The way that we prove that the user signed what they intended to sign is by generating an equivalent sign doc in a textual format (currently amino JSON) and verifying the signature. We do not need to verify that the same metadata was used because a human readable format is inherently sign describing. So why is this provable and signed metadata needed?

A second requirement here is an incompletely specified mapping between protobuf and this new schema format and CBOR encoding. This is a pretty big deal in and of itself and I am not sure what benefits it brings. If you read through the SIGN_MODE_TEXTUAL documents, there are a lot of details on how to map the SDK's data to a human readable format. Even Amino JSON does pretty good at this. A naive mapping from protobuf to CBOR will not produce a good end user experience all by itself. This is especially true since some users are starting to encode addresses as bytes rather than strings in transactions. Both Amino JSON and SIGN_MODE_TEXTUAL provide detailed ways to map protobuf data to a human readable representation and this new mapping would need to repeat a lot of that work or risk being a downgrade.

Overall, it seems like there would be a lot of work to make this work in terms of specification, build tooling and client tooling. The end result would introduce a new required build step and there are no details as to whether the human readable format would be an upgrade or downgrade from Amino JSON. I appreciate the effort the authors put into this document and I also appreciate the efforts that people put into SIGN_MODE_TEXTUAL. From what I'm reading here I don't see any compelling reasons for why it would be worth moving to this new sign mode unified approach over the current status quo of Amino JSON.

@raynaudoe
Copy link
Contributor Author

First of all, I assume there are valid reasons for abandoning the SIGN_MODE_TEXTUAL effort, but what are they? It would be good to document all of these.

I do want to note that in quickly reviewing the original SIGN_MODE_TEXTUAL ADR 050 and its Annexes for comparison, a lot of attention was given towards details of the client user experience. Things like line length and display of integers and non-ASCII characters are carefully considered. This ADR, on the other hand, has only this to say regarding human readability:

* **Human-readable Diagnostic Notation**: While CBOR is binary, it provides a diagnostic notation format that maintains transparency and reduces blind signing risks.

Most of the ADR seems to focus on how schema metadata can be embedded in a node in a provable way and even goes as far as constructing a new merkle tree format. I would note that all nodes currently do make protobuf descriptors available for querying, but not in a provable way. We did actually discuss storing the protobuf descriptors in state and decided against it 1) because this introduces a separate build step (also proposed in this ADR) and 2) introduces non-determinism if a user so much as adds a comment to a .proto file in a separate compilation of a node's binary. Even if we did embed schema metadata in a provable way, why do we need a separate merkle tree structure besides the one already provided for proving state? Putting that aside, I'm really not sure why this is an issue at all. The way that we prove that the user signed what they intended to sign is by generating an equivalent sign doc in a textual format (currently amino JSON) and verifying the signature. We do not need to verify that the same metadata was used because a human readable format is inherently sign describing. So why is this provable and signed metadata needed?

A second requirement here is an incompletely specified mapping between protobuf and this new schema format and CBOR encoding. This is a pretty big deal in and of itself and I am not sure what benefits it brings. If you read through the SIGN_MODE_TEXTUAL documents, there are a lot of details on how to map the SDK's data to a human readable format. Even Amino JSON does pretty good at this. A naive mapping from protobuf to CBOR will not produce a good end user experience all by itself. This is especially true since some users are starting to encode addresses as bytes rather than strings in transactions. Both Amino JSON and SIGN_MODE_TEXTUAL provide detailed ways to map protobuf data to a human readable representation and this new mapping would need to repeat a lot of that work or risk being a downgrade.

Overall, it seems like there would be a lot of work to make this work in terms of specification, build tooling and client tooling. The end result would introduce a new required build step and there are no details as to whether the human readable format would be an upgrade or downgrade from Amino JSON. I appreciate the effort the authors put into this document and I also appreciate the efforts that people put into SIGN_MODE_TEXTUAL. From what I'm reading here I don't see any compelling reasons for why it would be worth moving to this new sign mode unified approach over the current status quo of Amino JSON.

Hey @aaronc, thanks for your input.
This ADR is dedicated to the technical implementation of SIGN_MODE_UNIFIED, addressing the "how" rather than the "why." The decision to transition away from SIGN_MODE_TEXTUAL (and amino json) is an already established direction. Here, we are proposing a design pattern that has been successfully implemented and validated in the Polkadot ecosystem.

@aaronc
Copy link
Member

aaronc commented Dec 3, 2024

If moving away from textual is an already established direction where is that documented? In a quick issue search I didn't find anything.

A technical proposal should justify its design decisions, especially when there are existing solutions. Specifically, I asked in my comment above:

  • why is a merklized schema needed when it is not needed now?
  • how is proto mapped to cbor in a way that matches or improves upon the current UX of amino JSON?

This proposal should address these points IMHO.

Copy link
Member

@webmaster128 webmaster128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First incomplete review. Looks pretty amazing, especially knowing that this approach has been tested before 👏 Thank


## Context

## Challenges with Current sign modes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
## Challenges with Current sign modes
## Challenges with current sign modes

## Context

## Challenges with Current sign modes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we could add a short sentence naming what the "current" or "both" sign modes are

3. **Blind signing:**
When wallets cannot decode a transaction or smart contract call, it leads to blind signing. Blind signing exposes users to potential security risks, as they may unknowingly sign malicious transactions.

4. **Cross-language Types Inconsistency:**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
4. **Cross-language Types Inconsistency:**
4. **Cross-language type inconsistency:**


Key components of `SIGN_MODE_UNIFIED` include:

* *Metadata struct*: A JSON file containing the definitions of all data types for each message of each module, down to primitive types.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that this is referring to a JSON document, independent of its encoding (I.e. the in-memory data structure)? If so, I suggest avoiding the term "JSON file" as it indicates to something on disk, using some series of bytes. The difference matters a lot, especially as Go by default randomizes map entry order and we have a lot of genesis files where mismatch and confusion is caused by different encodings of the same document.

@@ -0,0 +1,327 @@
# ADR 075: Sign Mode Unified

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this sign mode unify? I think both Amino JSON and Textual are designed to be self-descriptive documents. The big selling point of this is using on-chain state to mitigate the limitations of the other approach. I would use a great new name for great new tech.

Key components of `SIGN_MODE_UNIFIED` include:

* *Metadata struct*: A JSON file containing the definitions of all data types for each message of each module, down to primitive types.
* *Metadata Digest*: A compact representation of the complete metadata set used for transaction construction. When hashed (digest hash), this struct serves as the root of the merkle tree, enabling efficient verification of metadata integrity.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A "digest" is usually some sort of summary or even the hash which I think is misleading here. What we have here is a serialization of the document above. I think "Metadata Serialization" or "Metadata file" (or similar) would be better.

* Requires significant development effort to implement the new sign mode, including updates to multiple components such as transaction builders, signers, and validators and HW wallet apps.

* **Tooling and Documentation Updates:**
* Requires updates to existing tools, documentation, and developer guides to accommodate the new sign mode, which may involve additional resources and time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a show stopper but I think it's worth mentioning that the sign doc is not human readable anymore

@raynaudoe
Copy link
Contributor Author

Thanks everyone for your inputs and comments. Will address them and come back with some modifications.

Main key points to analyze are:

  • Motivations for deprecating textual and amino sign modes
  • SignDoc human readability: analyze pros and cons. Plain json + spec? Other codecs alternatives?
  • Multisig support

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants