Add mapping key identification to format
haltman-at committed Aug 4, 2023
1 parent 6fe5a11 commit 7048eb7
Showing 1 changed file with 14 additions and 0 deletions.
14 changes: 14 additions & 0 deletions docs/source/format.md
@@ -54,6 +54,11 @@ For example, the Solidity compiler will in some cases perform a "tail-call" opti
the compiler will push the entry point of `f` as the return address for the call to `g`. The format should
help explicitly identify the targets of internal function calls and what arguments are being passed on the stack.

### Mapping key identification

EVM languages commonly include non-enumerable mappings. As such, it is useful to be able to dynamically identify any mapping keys that may appear
while analyzing a transaction trace or debugging.

## The Format

The format will be JSON so that it may be included in the standard input/output APIs that the Vyper and Solidity compilers support.
@@ -229,6 +234,7 @@ is itself a dictionary that (optionally) includes some of the following:
* The AST ID(s) that "correspond" to the opcode
* The layout of the stack, including type information and local variable names (if available)
* Jump target information (if available/applicable)
* Identification of mapping key information

In the above "correspond" roughly means "what source code caused the generation of this opcode".

@@ -238,6 +244,7 @@ that contributed to the generation of this opcode.
* `ast`: A list of AST ids for the "closest" AST node that contributed to the generation of this opcode.
* `stack`: A layout of the stack as understood by the compiler, represented as a list.
* `jumps`: If present, provides hints about the location being jumped to by a jumping command (JUMP or JUMPI)
* `mappings`: If present, contains information about how the opcode relates to mapping keys.
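
As a rough illustration of how a debugger might consume the `mappings` field, the sketch below scans per-opcode entries and collects those flagged as computing a mapping hash. The container shape (a dictionary keyed by opcode offset as a string) and the helper name `find_mapping_hash_ops` are assumptions made for this example only; the `isMappingHash` flag is the one proposed under "Mapping key identification".

```python
from typing import Any


def find_mapping_hash_ops(debug_info: dict[str, dict[str, Any]]) -> list[str]:
    """Return the keys of entries whose `mappings` field marks a mapping hash.

    `debug_info` is a hypothetical container: opcode offset (as a string)
    mapped to the per-opcode dictionary described above.
    """
    hits = []
    for offset, entry in debug_info.items():
        # `mappings` is optional, so fall back to an empty dictionary.
        mappings = entry.get("mappings", {})
        if mappings.get("isMappingHash") is True:
            hits.append(offset)
    return hits


# Hypothetical sample data, not taken from any real compiler output.
sample = {
    "0": {"ast": [1]},
    "12": {"mappings": {"isMappingHash": True, "mappingHashFormat": "prefix"}},
    "20": {"mappings": {"isMappingPreHash": True}},
}
print(find_mapping_hash_ops(sample))
```

Entries without a `mappings` field, or with only `isMappingPreHash` set, are skipped, which matches the "if present" wording of the field description.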

#### Source Locations

Expand Down Expand Up @@ -321,3 +328,10 @@ If the value of `sort` is `"return"`, then the dictionary has the following fiel
* `returns`: A list of dictionaries with the same format as the `arguments` array of `call`, but without any `return_address` entries.

**Discussion**: The above proposal doesn't really handle the case of "tail-calls" identified at the beginning of this document, where multiple return addresses can be pushed onto the stack. Is that something the debug format must explicitly model?

#### Mapping key identification

The value of this field (when present) is a dictionary with (some of) the following fields:
* `isMappingHash`: A boolean that identifies whether the opcode is computing a hash for a mapping.
* `isMappingPreHash`: For mappings that use two hashes, this boolean can identify whether the opcode is computing the first of the two hashes. Possibly this field should be combined with the previous one into some sort of enum?
* `mappingHashFormat`: An enumeration; specifies the format of what gets hashed for the mapping. Formats could include "prefix" (for Solidity), "postfix" (for Vyper value types), and "postfix-prehashed" (for Vyper strings and bytestrings). Possibly "prefix" could be split further into "prefix-padded" (for Solidity value types) and "prefix-unpadded" (for Solidity strings and bytestrings). This could be expanded in the future if necessary.
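
To make the `mappingHashFormat` values more concrete, here is a minimal sketch of how a tool might assemble the bytes that get hashed for each proposed format. It assumes that "prefix" means the key precedes the 32-byte slot (as in Solidity's documented storage layout), that "postfix" means the key follows the slot, and that "postfix-prehashed" means the key is hashed first and the resulting digest follows the slot. The function name `mapping_preimage` and the injected `hash_fn` parameter are hypothetical; the key is expected to be already encoded (for example, padded to 32 bytes for value types), and a real tool would pass a Keccak-256 implementation as `hash_fn`.

```python
from typing import Callable, Optional


def mapping_preimage(
    fmt: str,
    key: bytes,
    slot: int,
    hash_fn: Optional[Callable[[bytes], bytes]] = None,
) -> bytes:
    """Build the byte string that would be hashed to locate a mapping entry.

    The layouts below are an interpretation of the format names proposed
    in this section, not a definition taken from the format itself.
    """
    slot_word = slot.to_bytes(32, "big")  # the mapping's slot as a 32-byte word
    if fmt == "prefix":
        # Key first, then slot.
        return key + slot_word
    if fmt == "postfix":
        # Slot first, then key.
        return slot_word + key
    if fmt == "postfix-prehashed":
        # Slot first, then the hash of the key (the "pre-hash" step).
        if hash_fn is None:
            raise ValueError("postfix-prehashed requires a hash function")
        return slot_word + hash_fn(key)
    raise ValueError(f"unknown mappingHashFormat: {fmt!r}")
```

Given a preimage observed at an opcode flagged with `isMappingHash`, a debugger could invert this layout to recover the key: strip the slot word from the appropriate end and decode what remains.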
