From df24103dadeafb9a22b01a9e0e037400d14dbd95 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Tue, 30 Jan 2024 12:30:58 +0100 Subject: [PATCH 01/21] Update EIP-3540 to current EOF Megaspec --- EIPS/eip-3540.md | 233 +++++++++++++++++++---------------------------- 1 file changed, 96 insertions(+), 137 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index a42272d41bdb0..f91f2ddfcff58 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -18,7 +18,7 @@ We introduce an extensible and versioned container format for the EVM with a onc To summarise, EOF bytecode has the following layout: ``` -magic, version, (section_kind, section_size)+, 0,
+magic, version, (section_kind,
)+, 0,
``` ## Motivation @@ -47,44 +47,40 @@ In order to guarantee that every EOF-formatted contract in the state is valid, w ### Remarks -The *initcode* is the code executed in the context of the *create* transaction, `CREATE`, or `CREATE2` instructions. The *initcode* returns *code* (via the `RETURN` instruction), which is inserted into the account. See section 7 ("Contract Creation") in the Yellow Paper for more information. +If code starts with the `MAGIC`, it is considered to be EOF formatted, otherwise it is considered to be *legacy* code. For clarity, the `MAGIC` together with a version number *n* is denoted as the *EOFn prefix*, e.g. *EOF1 prefix*. -The opcode `0xEF` is currently an undefined instruction, therefore: *It pops no stack items and pushes no stack items, and it causes an exceptional abort when executed.* This means *initcode* or already deployed *code* starting with this instruction will continue to abort execution. +EOF-formatted contracts are created using new instructions `CREATE3`, `CREATE4` and `RETURNCONTRACT`. They are introduced in a separate EIP-???? (TBD, refer to the [Megaspec doc](https://github.com/ipsilon/eof/blob/main/spec/eof.md) until then). + +The opcode `0xEF` is currently an undefined instruction, therefore: *It pops no stack items and pushes no stack items, and it causes an exceptional abort when executed.* This means legacy *initcode* or already deployed legacy *code* starting with this instruction will continue to abort execution. Unless otherwised specified, all integers are encoded in big-endian byte order. ### Code validation -We introduce *code validation* for new contract creation. To achieve this, we define a format called EVM Object Format (EOF), containing a version indicator, and a ruleset of validity tied to a given version. 
- -At `block.number == HF_BLOCK` new contract creation is modified: - -- if *initcode* or *code* starts with the `MAGIC`, it is considered to be EOF formatted and will undergo validation specified in the following sections, -- else if *code* starts with `0xEF`, creation continues to result in an exceptional abort (the rule introduced in EIP-3541), -- otherwise code is considered *legacy code* and the following rules do not apply to it. +We introduce *code validation* for new contract creation, which EOF formatted code will undergo. To achieve this, we define a format called EVM Object Format (EOF), containing a version indicator, and a ruleset of validity tied to a given version. -For a create transaction, if *initcode* or *code* is invalid, the contract creation results in an exceptional abort. Such a transaction is valid and may be included in a block. Therefore, the transaction sender's nonce is increased. +Legacy code is not affected by EOF code validation. -For the `CREATE` and `CREATE2` instructions, if *initcode* or *code* is invalid, instructions' execution ends with the result `0` pushed on stack. The *initcode* validation happens just before its execution and validation failure is observable as if execution results in an exceptional abort. I.e. in case *initcode* or returned *code* is invalid the caller's nonce remains increased and all creation gas is deducted. +Code validation is performed during `CREATE4` instruction, and is elaborated on in [EIP-3670](./eip-3670.md) and an upcomming contract creation EIP-???? (TBD, refer to the [Megaspec doc](https://github.com/ipsilon/eof/blob/main/spec/eof.md) until then) +The EOF format itself and its formal validation are described in the following sections. ### Container specification EOF container is a binary format with the capability of providing the EOF version number and a list of EOF sections. 
-The container starts with the EOF header: +The container starts with the EOF prefix: | description | length | value | | |-------------|----------|------------|--------------------| | magic | 2-bytes | 0xEF00 | | | version | 1-byte | 0x01–0xFF | EOF version number | -The EOF header is followed by at least one section header. Each section header contains two fields, `section_kind` and either `section_size` or `section_size_list`, depending on the kind. `section_size_list` is a list of size values when multiple sections of this kind are allowed. +The EOF prefix is followed by at least one section header. Each section header contains two fields, `section_kind` and bytes describing the section of structure defined specifically for each kind. -| description | length | value | | -|-------------------|---------|---------------|-------------------| -| section_kind | 1-byte | 0x01–0xFF | `uint8` | -| section_size | 2-bytes | 0x0000–0xFFFF | `uint16` | -| section_size_list | dynamic | n/a | `uint16, uint16+` | +| description | length | value | | +|---------------------|---------|---------------|-------------------| +| section_kind | 1-byte | 0x01–0xFF | `uint8` | +| section description | dynamic | n/a | `uint8+` | The list of section headers is terminated with the *section headers terminator byte* `0x00`. The body content follows immediately after. @@ -93,88 +89,94 @@ The list of section headers is terminated with the *section headers terminator b 1. `version` MUST NOT be `0`.[^1](#eof-version-range-start-with-1) 2. `section_kind` MUST NOT be `0`. The value `0` is reserved for *section headers terminator byte*. 3. There MUST be at least one section (and therefore section header). -4. Section content size MUST be equal to size declared in its header. 5. Stray bytes outside of sections MUST NOT be present. This includes trailing bytes after the last section. 
### EOF version 1 -EOF version 1 is made up of 5 EIPs, including this one: [EIP-3540](./eip-3540.md), [EIP-3670](./eip-3670.md), [EIP-4200](./eip-4200.md), [EIP-4750](./eip-4750.md), and [EIP-5450](./eip-5450.md). Some values in this specification are only discussed briefly. To understand the full scope of EOF, it is necessary to review each EIP in-depth. +EOF version 1 is made up of several EIPs, including this one, as enumerated in the EOF Meta EIP-???? (TBD). Some values in this specification are only discussed briefly. To understand the full scope of EOF, it is necessary to review each EIP in-depth. #### Container The EOF version 1 container consists of a `header` and `body`. ``` -container := header, body -header := magic, version, kind_type, type_size, kind_code, num_code_sections, code_size+, kind_data, data_size, terminator -body := type_section, code_section+, data_section -type_section := (inputs, outputs, max_stack_height)+ +container := header, body +header := + magic, version, + kind_types, types_size, + kind_code, num_code_sections, code_size+, + [kind_container, num_container_sections, container_size+,] + kind_data, data_size, + terminator +body := types_section, code_section+, container_section*, data_section +types_section := (inputs, outputs, max_stack_height)+ ``` -*note: `,` is a concatenation operator and `+` should be interpreted as "one or more" of the preceding item* - -##### Header - -| name | length | value | description | -|-------------------|----------|---------------|------------------------------------------------------------------------------------| -| magic | 2 bytes | 0xEF00 | EOF prefix | -| version | 1 byte | 0x01 | EOF version | -| kind_type | 1 byte | 0x01 | kind marker for EIP-4750 type section header | -| type_size | 2 bytes | 0x0004-0xFFFC | uint16 denoting the length of the type section content, 4 bytes per code segment | -| kind_code | 1 byte | 0x02 | kind marker for code size section | -| num_code_sections | 2 bytes | 
0x0001-0x0400 | uint16 denoting the number of the code sections | -| code_size | 2 bytes | 0x0001-0xFFFF | uint16 denoting the length of the code section content | -| kind_data | 1 byte | 0x03 | kind marker for data size section | -| data_size | 2 bytes | 0x0000-0xFFFF | uint16 integer denoting the length of the data section content | -| terminator | 1 byte | 0x00 | marks the end of the header | - -##### Body - -| name | length | value | description | -|------------------|----------|--------------|----------------------------------------------------| -| type_section | variable | n/a | stores EIP-4750 and EIP-5450 code section metadata | -| inputs | 1 byte | 0x00-0x7F | number of stack elements the code section consumes | -| outputs | 1 byte | 0x00-0x7F | number of stack elements the code section returns | -| max_stack_height | 2 bytes | 0x0000-0x3FF | max height of operand stack during execution | -| code_section | variable | n/a | arbitrary bytecode | -| data_section | variable | n/a | arbitrary sequence of bytes | +_note: `,` is a concatenation operator, `+` should be interpreted as "one or more" of the preceding item, and `*` should be interpreted as "zero or more" of the preceding item._ + +#### Header + +| name | length | value | description | +|------------------------|----------|---------------|-----------------------------------------------------------------------------------------| +| magic | 2 bytes | 0xEF00 | | +| version | 1 byte | 0x01 | EOF version | +| kind_types | 1 byte | 0x01 | kind marker for types size section | +| types_size | 2 bytes | 0x0004-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the type section content | +| kind_code | 1 byte | 0x02 | kind marker for code size section | +| num_code_sections | 2 bytes | 0x0001-0xFFFF | 16-bit unsigned big-endian integer denoting the number of the code sections | +| code_size | 2 bytes | 0x0001-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the code section 
content | +| kind_container | 1 byte | 0x03 | kind marker for container size section | +| num_container_sections | 2 bytes | 0x0001-0x00FF | 16-bit unsigned big-endian integer denoting the number of the container sections | +| container_size | 2 bytes | 0x0001-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the container section content | +| kind_data | 1 byte | 0x04 | kind marker for data size section | +| data_size | 2 bytes | 0x0000-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the data section content (*) | +| terminator | 1 byte | 0x00 | marks the end of the header | + +(*) For not yet deployed containers this can be more than the actual content, see Data Section Lifecycle section in the contract creation EIP-???? (TBD + link). + +#### Body + +| name | length | value | description | +|-------------------|----------|---------------|---------------------------------------------------------------------------------------| +| types_section | variable | n/a | stores code section metadata | +| inputs | 1 byte | 0x00-0x7F | number of stack elements the code section consumes | +| outputs | 1 byte | 0x00-0x80 | number of stack elements the code section returns or 0x80 for non-returning functions | +| max_stack_height | 2 bytes | 0x0000-0x03FF | maximum number of elements ever placed onto the stack by the code section | +| code_section | variable | n/a | arbitrary sequence of bytes | +| container_section | variable | n/a | arbitrary sequence of bytes | +| data_section | variable | n/a | arbitrary sequence of bytes | See [EIP-4750](./eip-4750.md) for more information on the type section content. +See [EIP-6206](./eip-6206.md) for more information on non-returning functions. #### EOF version 1 validation rules -1. In addition to general validation rules above, EOF version 1 bytecode conforms to the rules specified below: - - Exactly one type section header MUST be present immediately following the EOF version. 
Each code section MUST have a specified type signature in the type body. - - Exactly one code section header MUST be present immediately following the type section. A maximum of 1024 individual code sections are allowed. - - Exactly one data section header MUST be present immediately following the code section. -2. Any version other than `0x01` is invalid. +The following validity constraints are placed on the container format: -(*Remark:* Contract creation code SHOULD set the section size of the data section so that the constructor arguments fit it.) +- minimum valid header size is `15` bytes +- `version` must be `0x01` +- `types_size` is divisible by `4` +- the number of code sections must be equal to `types_size / 4` +- the number of code sections must not exceed 1024 +- `code_size` may not be 0 +- the number of container sections must not exceed 256 +- `container_size` may not be 0, but container sections are optional +- the total size of a deployed container without container sections must be `13 + 2*num_code_sections + types_size + code_size[0] + ... + code_size[num_code_sections-1] + data_size` +- the total size of a deployed container with at least one container section must be `16 + 2*num_code_sections + types_size + code_size[0] + ... + code_size[num_code_sections-1] + data_size + 2*num_container_sections + container_size[0] + ... + container_size[num_container_sections-1]` +- the total size of not yet deployed container might be up to `data_size` lower than the above values due to how the data section is rewritten and resized during creation ### Changes to execution semantics -For clarity, the *container* refers to the complete account code, while *code* refers to the contents of the code section only. - -1. Execution starts at the first byte of the first code section, and PC is set to 0. -2. Execution stops if `PC` goes outside the code section bounds. -3. `PC` returns the current position within the *code*. -4. 
`CODECOPY`/`CODESIZE`/`EXTCODECOPY`/`EXTCODESIZE`/`EXTCODEHASH` keeps operating on the entire *container*. -5. The input to `CREATE`/`CREATE2` is still the entire *container*. -6. The size limit for deployed code as specified in [EIP-170](./eip-170.md) and for initcode as specified in [EIP-3860](./eip-3860.md) is applied to the entire *container* size, not to the *code* size. This also means if initcode validation fails, it is still charged the EIP-3860 `initcode_cost`. -7. When an EOF1 contract performs a `DELEGATECALL` the target must be EOF1. If it is not EOF1, the `DELEGATECALL` execution finishes as a failed call by pushing `0` to the stack. Only initial gas cost of `DELEGATECALL` is consumed (similarly to the call depth check) and the target address still becomes warm. - -(*Remark:* Due to [EIP-4750](./eip-4750.md), `JUMP` and `JUMPI` are disabled and therefore are not discussed in relation to EOF.) +For an EOF contract: +- Execution starts at the first byte of code section 0, and `pc` is set to 0. +- `pc` is scoped to the executing code section +- `CODESIZE`, `CODECOPY`, `EXTCODESIZE`, `EXTCODECOPY`, `EXTCODEHASH`, `GAS` are rejected by validation in EOF contracts, with no replacements +- `DELEGATECALL2` (`DELEGATECALL` replacement) from an EOF contract to a legacy contract is disallowed, and it returns 0 to signal failure. Only initial gas cost of `DELEGATECALL2` is consumed (similarly to the call depth check) and the target address still becomes warm. We allow legacy to EOF path for existing proxy contracts to be able to use EOF upgrades. -### Changes to contract creation semantics - -For clarity, the *EOF prefix* together with a version number *n* is denoted as the *EOFn prefix*, e.g. *EOF1 prefix*. - -1. If *initcode's container* has EOF1 prefix it MUST be valid EOF1 code. -2. If *code's container* has EOF1 prefix it MUST be valid EOF1 code. -3. If *initcode's container* is valid EOF1 code the resulting *code's container* MUST be valid EOF1 code (i.e. 
it MUST NOT be empty and MUST NOT produce legacy code). -4. If `CREATE` or `CREATE2` instruction is executed in an EOF1 code the instruction's *initcode* MUST be valid EOF1 code (i.e. EOF1 contracts MUST NOT produce legacy code). - -See [Code validation](#code-validation) above for specification of behaviour in case one of these conditions is not satisfied. +For a legacy contract: +- If the target account of `EXTCODECOPY` is an EOF contract, then it will copy 0 bytes. +- If the target account of `EXTCODEHASH` is an EOF contract, then it will return `0x9dbf3648db8210552e9c4f75c6a1c3057c0ca432043bd648be15fe7be05646f5` (the hash of `EF00`, as if that would be the code). +- If the target account of `EXTCODESIZE` is an EOF contract, then it will return 2. ## Rationale @@ -197,21 +199,6 @@ The alternative is to have execution time validation for EOF. This is performed - Can be enabled via a single hard-fork. - Better backwards compatibility: data contracts starting with the `0xEF` byte or the *EOF prefix* can be deployed. This is a dubious benefit, however. -### Contract creation restrictions - -The [Changes to contact creation semantics](#changes-to-contract-creation-semantics) section defines -minimal set of restrictions related to the contract creation: if *initcode* or *code* has the EOF1 -container prefix it must be validated. This adds two validation steps in the contract creation, -any of it failing will result in contract creation failure. - -Moreover, it is not allowed to create legacy contracts from EOF1 ones. And the EOF version of *initcode* must match the EOF version of the produced *code*. -The rule can be generalized in the future: EOFn contract must only create EOFm contracts, where m ≥ n. - -This guarantees that a cluster of EOF contracts will never spawn new legacy contracts. -Furthermore, some exotic contract creation combinations are eliminated (e.g. EOF1 contract creating new EOF1 contract with legacy *initcode*). 
- -Finally, create transaction must be allowed to contain legacy *initcode* and deploy legacy *code* because otherwise there is no transition period allowing upgrading transaction signing tools. Deprecating such transactions may be considered in the future. - ### The MAGIC 1. The first byte `0xEF` was chosen because it is reserved for this purpose by [EIP-3541](./eip-3541.md). @@ -234,21 +221,16 @@ We have considered different questions for the sections: - Streaming headers (i.e. `section_header, section_data, section_header, section_data, ...`) are used in some other formats (such as WebAssembly). They are handy for formats which are subject to editing (adding/removing sections). That is not a useful feature for EVM. One minor benefit applicable to our case is that they do not require a specific "header terminator". On the other hand they seem to play worse with code chunking / merkleization, as it is better to have all section headers in a single chunk. - Whether to have a header terminator or to encode `number_of_sections` or `total_size_of_headers`. Both raise the question of how large of a value these fields should be able to hold. A terminator byte seems to avoid the problem of choosing a size which is too small without any perceptible downside, so it is the path taken. -- Whether to encode `section_size` as a fixed 16-bit value or some kind of variable length field (e.g. LEB128). We have opted for fixed size, because it simplifies client implementations, and 16-bit seems enough, because of the currently exposed code size limit of 24576 bytes (see [EIP-170](./eip-170.md) and [EIP-3860](./eip-3860.md)). Should this be limiting in the future, a new EOF version could change the format. Besides simplifying client implementations, not using LEB128 also greatly simplifies on-chain parsing. +- (EOF1) Whether to encode section sizes as fixed 16-bit values or some kind of variable length field (e.g. LEB128). 
We have opted for fixed size, because it simplifies client implementations, and 16-bit seems enough, because of the currently exposed code size limit of 24576 bytes (see [EIP-170](./eip-170.md) and [EIP-3860](./eip-3860.md)). Should this be limiting in the future, a new EOF version could change the format. Besides simplifying client implementations, not using LEB128 also greatly simplifies on-chain parsing. +- Whether or not to have more structure to the container header for all EOF versions to follow. In order to allow future formats optimized for chunking and merkleization (verkleization) it was decided to keep it generic and specify the structure only for a specific EOF version. ### Data-only contracts -The EOF prevents deploying contracts with arbitrary bytes (data-only contracts: their purpose is to store data not execution). **EOF1 requires** presence of a **code section** therefore the minimal overhead EOF data contract consist of a data section and one code section with single instruction. We recommend to use `INVALID` instruction in this case. In total there are 20 additional bytes required. +(TBD: Moved to an upcoming section in the [EIP-7480](./eip-7480.md)) -``` -EF0001 010004 020001 0001 03 00 00000000 FE -``` - -It is possible in the future that this data will be accessible with data-specific opcodes, such as `DATACOPY` or `EXTDATACOPY`. Until then, callers will need to determine the data offset manually. +### `pc` starts with 0 at the code section -### PC starts with 0 at the code section - -The value for `PC` is specified to start at 0 and to be within the active *code* section. We considered keeping `PC` to operate on the whole *container* and be consistent with `CODECOPY`/`EXTCODECOPY` but in the end decided otherwise. This also feels more natural and easier to implement in EVM: the new EOF EVM should only care about traversing *code* and accessing other parts of the *container* only on special occasions (e.g. in `CODECOPY` instruction). 
+The value for `pc` is specified to start at 0 and to be within the active *code* section. An alternative was keeping `pc` to operate on the whole *container*. However, the new EOF EVM should only care about traversing *code*. ### EOF1 contracts can only `DELEGATECALL` EOF1 contracts @@ -262,43 +244,20 @@ The choice of `MAGIC` guarantees that none of the contracts existing on the chai ## Test Cases -### Contract creation - -All cases should be checked for creation transaction, `CREATE` and `CREATE2`. +### Container validation rules -- Legacy init code - - Returns legacy code - - Returns valid EOF1 code - - Returns invalid EOF1 code, contract creation fails - - Returns 0xEF not followed by EOF1 code, contract creation fails -- Valid EOF1 init code - - Returns legacy code, contract creation fails - - Returns valid EOF1 code - - Returns invalid EOF1 code, contract creation fails - - Returns 0xEF not followed by EOF1 code, contract creation fails -- Invalid EOF1 init code +(TBD) ### Contract execution -- EOF code containing `PC` opcode - offset inside code section is returned -- EOF code containing `CODECOPY/CODESIZE` - works as in legacy code - - `CODESIZE` returns the size of entire container - - `CODECOPY` can copy from code section - - `CODECOPY` can copy from data section - - `CODECOPY` can copy from the EOF header - - `CODECOPY` can copy entire container -- `EXTCODECOPY/EXTCODESIZE/EXTCODEHASH` with the EOF *target* contract - works as with legacy target contract - - `EXTCODESIZE` returns the size of entire target container - - `EXTCODEHASH` returns the hash of entire target container - - `EXTCODECOPY` can copy from target's code section - - `EXTCODECOPY` can copy from target's data section - - `EXTCODECOPY` can copy from target's EOF header - - `EXTCODECOPY` can copy entire target container - - Results don't differ when executed inside legacy or EOF contract -- EOF1 `DELEGATECALL` - - `DELEGATECALL` to EOF1 code succeeds - - `DELEGATECALL` to EOF0 code fails - - 
`DELEGATECALL` to empty container fails +- EOF0 `EXTCODECOPY` to an EOF1 contract copies 0 bytes +- EOF0 `EXTCODEHASH` to an EOF1 contract returns hash of `0xEF00` +- EOF0 `EXTCODESIZE` to an EOF1 contract returns 2 +- EOF0 `DELEGATECALL2` to EOF1 code succeeds +- EOF1 `DELEGATECALL2` + - to EOF1 code succeeds + - to EOF0 code fails + - to empty container fails ## Security Considerations From 73d5d13b0c9aca7e700868e8e524c643aa1817d6 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Mon, 5 Feb 2024 21:31:48 +0100 Subject: [PATCH 02/21] Apply suggestions from code review Co-authored-by: Andrei Maiboroda --- EIPS/eip-3540.md | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index f91f2ddfcff58..4e696f9564e29 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -57,11 +57,11 @@ Unless otherwised specified, all integers are encoded in big-endian byte order. ### Code validation -We introduce *code validation* for new contract creation, which EOF formatted code will undergo. To achieve this, we define a format called EVM Object Format (EOF), containing a version indicator, and a ruleset of validity tied to a given version. +We introduce *code validation* for new contract creation. To achieve this, we define a format called EVM Object Format (EOF), containing a version indicator, and a ruleset of validity tied to a given version. Legacy code is not affected by EOF code validation. -Code validation is performed during `CREATE4` instruction, and is elaborated on in [EIP-3670](./eip-3670.md) and an upcomming contract creation EIP-???? (TBD, refer to the [Megaspec doc](https://github.com/ipsilon/eof/blob/main/spec/eof.md) until then) +Code validation is performed during `CREATE4` instruction, and is elaborated on in [EIP-3670](./eip-3670.md) and an upcoming contract creation EIP-???? 
(TBD, refer to the [Megaspec doc](https://github.com/ipsilon/eof/blob/main/spec/eof.md) until then) The EOF format itself and its formal validation are described in the following sections. ### Container specification @@ -112,7 +112,7 @@ body := types_section, code_section+, container_section*, data_section types_section := (inputs, outputs, max_stack_height)+ ``` -_note: `,` is a concatenation operator, `+` should be interpreted as "one or more" of the preceding item, and `*` should be interpreted as "zero or more" of the preceding item._ +*note: `,` is a concatenation operator, `+` should be interpreted as "one or more" of the preceding item, `*` should be interpreted as "zero or more" of the preceding item, and `[item]` should be interpeted as an optional item.* #### Header @@ -120,10 +120,10 @@ _note: `,` is a concatenation operator, `+` should be interpreted as "one or mor |------------------------|----------|---------------|-----------------------------------------------------------------------------------------| | magic | 2 bytes | 0xEF00 | | | version | 1 byte | 0x01 | EOF version | -| kind_types | 1 byte | 0x01 | kind marker for types size section | -| types_size | 2 bytes | 0x0004-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the type section content | +| kind_type | 1 byte | 0x01 | kind marker for type section | +| type_size | 2 bytes | 0x0004-0x1000 | 16-bit unsigned big-endian integer denoting the length of the type section content, 4 bytes per code section | | kind_code | 1 byte | 0x02 | kind marker for code size section | -| num_code_sections | 2 bytes | 0x0001-0xFFFF | 16-bit unsigned big-endian integer denoting the number of the code sections | +| num_code_sections | 2 bytes | 0x0001-0x0400 | 16-bit unsigned big-endian integer denoting the number of the code sections | | code_size | 2 bytes | 0x0001-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the code section content | | kind_container | 1 byte | 0x03 | 
kind marker for container size section | | num_container_sections | 2 bytes | 0x0001-0x00FF | 16-bit unsigned big-endian integer denoting the number of the container sections | @@ -132,7 +132,7 @@ _note: `,` is a concatenation operator, `+` should be interpreted as "one or mor | data_size | 2 bytes | 0x0000-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the data section content (*) | | terminator | 1 byte | 0x00 | marks the end of the header | -(*) For not yet deployed containers this can be more than the actual content, see Data Section Lifecycle section in the contract creation EIP-???? (TBD + link). +(*) For not yet deployed containers this can be greater than the actual content length, see Data Section Lifecycle section in the contract creation EIP-???? (TBD + link). #### Body @@ -141,9 +141,9 @@ _note: `,` is a concatenation operator, `+` should be interpreted as "one or mor | types_section | variable | n/a | stores code section metadata | | inputs | 1 byte | 0x00-0x7F | number of stack elements the code section consumes | | outputs | 1 byte | 0x00-0x80 | number of stack elements the code section returns or 0x80 for non-returning functions | -| max_stack_height | 2 bytes | 0x0000-0x03FF | maximum number of elements ever placed onto the stack by the code section | -| code_section | variable | n/a | arbitrary sequence of bytes | -| container_section | variable | n/a | arbitrary sequence of bytes | +| max_stack_height | 2 bytes | 0x0000-0x03FF | maximum number of elements ever placed onto the operand stack by the code section | +| code_section | variable | n/a | arbitrary bytecode | +| container_section | variable | n/a | arbitrary EOF-formatted container | | data_section | variable | n/a | arbitrary sequence of bytes | See [EIP-4750](./eip-4750.md) for more information on the type section content. @@ -153,7 +153,6 @@ See [EIP-6206](./eip-6206.md) for more information on non-returning functions. 
The following validity constraints are placed on the container format: -- minimum valid header size is `15` bytes - `version` must be `0x01` - `types_size` is divisible by `4` - the number of code sections must be equal to `types_size / 4` @@ -161,19 +160,20 @@ The following validity constraints are placed on the container format: - `code_size` may not be 0 - the number of container sections must not exceed 256 - `container_size` may not be 0, but container sections are optional -- the total size of a deployed container without container sections must be `13 + 2*num_code_sections + types_size + code_size[0] + ... + code_size[num_code_sections-1] + data_size` -- the total size of a deployed container with at least one container section must be `16 + 2*num_code_sections + types_size + code_size[0] + ... + code_size[num_code_sections-1] + data_size + 2*num_container_sections + container_size[0] + ... + container_size[num_container_sections-1]` -- the total size of not yet deployed container might be up to `data_size` lower than the above values due to how the data section is rewritten and resized during creation +- data section is mandatory, but `data_size` may be 0 +- data body length may be shorter than `data_size` for a not yet deployed container ### Changes to execution semantics For an EOF contract: + - Execution starts at the first byte of code section 0, and `pc` is set to 0. - `pc` is scoped to the executing code section - `CODESIZE`, `CODECOPY`, `EXTCODESIZE`, `EXTCODECOPY`, `EXTCODEHASH`, `GAS` are rejected by validation in EOF contracts, with no replacements - `DELEGATECALL2` (`DELEGATECALL` replacement) from an EOF contract to a legacy contract is disallowed, and it returns 0 to signal failure. Only initial gas cost of `DELEGATECALL2` is consumed (similarly to the call depth check) and the target address still becomes warm. We allow legacy to EOF path for existing proxy contracts to be able to use EOF upgrades. 
For a legacy contract: + - If the target account of `EXTCODECOPY` is an EOF contract, then it will copy 0 bytes. - If the target account of `EXTCODEHASH` is an EOF contract, then it will return `0x9dbf3648db8210552e9c4f75c6a1c3057c0ca432043bd648be15fe7be05646f5` (the hash of `EF00`, as if that would be the code). - If the target account of `EXTCODESIZE` is an EOF contract, then it will return 2. From 4524c5b79594b903734805c51a796d817b419fd9 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Tue, 6 Feb 2024 12:49:07 +0100 Subject: [PATCH 03/21] Update EXTCODECOPY to EOF contract to copy `EF00` --- EIPS/eip-3540.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index 4e696f9564e29..e8aa617002e77 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -174,7 +174,7 @@ For an EOF contract: For a legacy contract: -- If the target account of `EXTCODECOPY` is an EOF contract, then it will copy 0 bytes. +- If the target account of `EXTCODECOPY` is an EOF contract, then it will copy up to 2 bytes from `EF00`, as if that would be the code. - If the target account of `EXTCODEHASH` is an EOF contract, then it will return `0x9dbf3648db8210552e9c4f75c6a1c3057c0ca432043bd648be15fe7be05646f5` (the hash of `EF00`, as if that would be the code). - If the target account of `EXTCODESIZE` is an EOF contract, then it will return 2. From ba9c55deb13ab9804762361896c847dcbda4b07e Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Thu, 8 Feb 2024 18:50:40 +0100 Subject: [PATCH 04/21] Explicitly clear that "faux-EOF" must not be there --- EIPS/eip-3540.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index e8aa617002e77..f02930404901d 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -210,6 +210,8 @@ The alternative is to have execution time validation for EOF. This is performed 3. 
No contracts starting with `0xEF` bytes exist on public testnets: Goerli, Ropsten, Rinkeby, Kovan and Sepolia at their London fork block. +**NOTE**: EOF MUST NOT be enabled on chains which contain bytecodes starting with MAGIC and not being valid EOF. + ### EOF version range start with 1 The version number 0 will never be used in EOF, so we can call legacy code *EOF0*. From b13e222b6f2223d1c219e674648ce20f96982ce1 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Tue, 13 Feb 2024 11:30:49 +0100 Subject: [PATCH 05/21] Update with the contract creation EIP-7620 references --- EIPS/eip-3540.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index f02930404901d..b5d9327492f66 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -4,11 +4,11 @@ title: EOF - EVM Object Format v1 description: EOF is an extensible and versioned container format for EVM bytecode with a once-off validation at deploy time. author: Alex Beregszaszi (@axic), Paweł Bylica (@chfast), Andrei Maiboroda (@gumb0), Matt Garnett (@lightclient) discussions-to: https://ethereum-magicians.org/t/evm-object-format-eof/5727 -status: Stagnant +status: Review type: Standards Track category: Core created: 2021-03-16 -requires: 3541, 3860, 4750, 5450 +requires: 3541, 3860, 4750, 5450, 7620 --- ## Abstract @@ -49,7 +49,7 @@ In order to guarantee that every EOF-formatted contract in the state is valid, w If code starts with the `MAGIC`, it is considered to be EOF formatted, otherwise it is considered to be *legacy* code. For clarity, the `MAGIC` together with a version number *n* is denoted as the *EOFn prefix*, e.g. *EOF1 prefix*. -EOF-formatted contracts are created using new instructions `CREATE3`, `CREATE4` and `RETURNCONTRACT`. They are introduced in a separate EIP-???? (TBD, refer to the [Megaspec doc](https://github.com/ipsilon/eof/blob/main/spec/eof.md) until then). 
+EOF-formatted contracts are created using new instructions `CREATE3`, `CREATE4` and `RETURNCONTRACT`. They are introduced in a separate [EIP-7620](./eip-7620.md). The opcode `0xEF` is currently an undefined instruction, therefore: *It pops no stack items and pushes no stack items, and it causes an exceptional abort when executed.* This means legacy *initcode* or already deployed legacy *code* starting with this instruction will continue to abort execution. @@ -61,7 +61,7 @@ We introduce *code validation* for new contract creation. To achieve this, we de Legacy code is not affected by EOF code validation. -Code validation is performed during `CREATE4` instruction, and is elaborated on in [EIP-3670](./eip-3670.md) and an upcoming contract creation EIP-???? (TBD, refer to the [Megaspec doc](https://github.com/ipsilon/eof/blob/main/spec/eof.md) until then) +Code validation is performed during `CREATE4` instruction, and is elaborated on in [EIP-3670](./eip-3670.md) and the contract creation [EIP-7620](./eip-7620.md). The EOF format itself and its formal validation are described in the following sections. ### Container specification @@ -132,7 +132,7 @@ types_section := (inputs, outputs, max_stack_height)+ | data_size | 2 bytes | 0x0000-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the data section content (*) | | terminator | 1 byte | 0x00 | marks the end of the header | -(*) For not yet deployed containers this can be greater than the actual content length, see Data Section Lifecycle section in the contract creation EIP-???? (TBD + link). +(*) For not yet deployed containers this can be greater than the actual content length, see Data Section Lifecycle section in the contract creation [EIP-7620](./eip-7620.md#data-section-lifecycle). 
#### Body From a7da5b4ecc6516105b0645fb759c2f32d8e6161f Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Tue, 13 Feb 2024 17:20:32 +0100 Subject: [PATCH 06/21] Apply suggestions from code review Co-authored-by: Andrei Maiboroda --- EIPS/eip-3540.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index b5d9327492f66..470a9faa7732c 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -103,7 +103,7 @@ The EOF version 1 container consists of a `header` and `body`. container := header, body header := magic, version, - kind_types, types_size, + kind_type, type_size, kind_code, num_code_sections, code_size+, [kind_container, num_container_sections, container_size+,] kind_data, data_size, @@ -156,11 +156,11 @@ The following validity constraints are placed on the container format: - `version` must be `0x01` - `types_size` is divisible by `4` - the number of code sections must be equal to `types_size / 4` -- the number of code sections must not exceed 1024 -- `code_size` may not be 0 -- the number of container sections must not exceed 256 -- `container_size` may not be 0, but container sections are optional -- data section is mandatory, but `data_size` may be 0 +- the number of code sections must not exceed `1024` +- `code_size` may not be `0` +- the number of container sections must not exceed `256` +- `container_size` may not be `0`, but container sections are optional +- data section is mandatory, but `data_size` may be `0` - data body length may be shorter than `data_size` for a not yet deployed container ### Changes to execution semantics From 6e890562b2ec05da21aa4bec5637b38a7873016f Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Tue, 13 Feb 2024 17:22:11 +0100 Subject: [PATCH 07/21] Uppercase PC --- EIPS/eip-3540.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/EIPS/eip-3540.md 
b/EIPS/eip-3540.md index 470a9faa7732c..2f99119a7aa95 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -167,8 +167,8 @@ The following validity constraints are placed on the container format: For an EOF contract: -- Execution starts at the first byte of code section 0, and `pc` is set to 0. -- `pc` is scoped to the executing code section +- Execution starts at the first byte of code section 0, and PC is set to 0. +- `PC` is scoped to the executing code section - `CODESIZE`, `CODECOPY`, `EXTCODESIZE`, `EXTCODECOPY`, `EXTCODEHASH`, `GAS` are rejected by validation in EOF contracts, with no replacements - `DELEGATECALL2` (`DELEGATECALL` replacement) from an EOF contract to a legacy contract is disallowed, and it returns 0 to signal failure. Only initial gas cost of `DELEGATECALL2` is consumed (similarly to the call depth check) and the target address still becomes warm. We allow legacy to EOF path for existing proxy contracts to be able to use EOF upgrades. @@ -230,9 +230,9 @@ We have considered different questions for the sections: (TBD: Moved to an upcoming section in the [EIP-7480](./eip-7480.md)) -### `pc` starts with 0 at the code section +### `PC` starts with 0 at the code section -The value for `pc` is specified to start at 0 and to be within the active *code* section. An alternative was keeping `pc` to operate on the whole *container*. However, the new EOF EVM should only care about traversing *code*. +The value for `PC` is specified to start at 0 and to be within the active *code* section. An alternative was keeping `PC` to operate on the whole *container*. However, the new EOF EVM should only care about traversing *code*. 
### EOF1 contracts can only `DELEGATECALL` EOF1 contracts From cc9dd2aac927b337a5712091657d2fb561027d01 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Tue, 13 Feb 2024 17:22:45 +0100 Subject: [PATCH 08/21] Remove Test Cases section --- EIPS/eip-3540.md | 17 ----------------- 1 file changed, 17 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index 2f99119a7aa95..09d7a4d91083c 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -244,23 +244,6 @@ This is a breaking change given that any code starting with `0xEF` was not deplo The choice of `MAGIC` guarantees that none of the contracts existing on the chain are affected by the new rules. -## Test Cases - -### Container validation rules - -(TBD) - -### Contract execution - -- EOF0 `EXTCODECOPY` to an EOF1 contract copies 0 bytes -- EOF0 `EXTCODEHASH` to an EOF1 contract returns hash of `0xEF00` -- EOF0 `EXTCODESIZE` to an EOF1 contract returns 2 -- EOF0 `DELEGATECALL2` to EOF1 code succeeds -- EOF1 `DELEGATECALL2` - - to EOF1 code succeeds - - to EOF0 code fails - - to empty container fails - ## Security Considerations With the anticipated EOF extensions, the validation is expected to have linear computational and space complexity. From 1f2edae652b7926496c54ee6fe06e6b24e6b6915 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Tue, 13 Feb 2024 18:08:01 +0100 Subject: [PATCH 09/21] Specify the reference to EIP-7480 --- EIPS/eip-3540.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index 09d7a4d91083c..9c612daa61c8b 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -228,7 +228,7 @@ We have considered different questions for the sections: ### Data-only contracts -(TBD: Moved to an upcoming section in the [EIP-7480](./eip-7480.md)) +Moved to the section [Lack of `EXTDATACOPY` in EIP-7480](./eip-7480#lack-of-extdatacopy.md). 
### `PC` starts with 0 at the code section From 7e4c93bfc507dd84a59cb6dfe02f8a6711e84258 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Wed, 14 Feb 2024 17:50:12 +0100 Subject: [PATCH 10/21] Rollback to the stricter generic EOF header format --- EIPS/eip-3540.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index 9c612daa61c8b..8809c3b4cdcae 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -18,7 +18,7 @@ We introduce an extensible and versioned container format for the EVM with a onc To summarise, EOF bytecode has the following layout: ``` -magic, version, (section_kind,
)+, 0,
+magic, version, (section_kind, section_size_or_sizes)+, 0,
``` ## Motivation @@ -75,12 +75,13 @@ The container starts with the EOF prefix: | magic | 2-bytes | 0xEF00 | | | version | 1-byte | 0x01–0xFF | EOF version number | -The EOF prefix is followed by at least one section header. Each section header contains two fields, `section_kind` and bytes describing the section of structure defined specifically for each kind. +The EOF prefix is followed by at least one section header. Each section header contains two fields, `section_kind` and either `section_size` or `section_size_list`, depending on the kind. `section_size_list` is a list of size values when multiple sections of this kind are allowed, encoded as a count of items followed by the items. -| description | length | value | | -|---------------------|---------|---------------|-------------------| -| section_kind | 1-byte | 0x01–0xFF | `uint8` | -| section description | dynamic | n/a | `uint8+` | +| description | length | value | | +|-------------------|---------|---------------|-------------------| +| section_kind | 1-byte | 0x01–0xFF | `uint8` | +| section_size | 2-bytes | 0x0000–0xFFFF | `uint16` | +| section_size_list | dynamic | n/a | `uint16, uint16+` | The list of section headers is terminated with the *section headers terminator byte* `0x00`. The body content follows immediately after. From fda31f2bf6b249e8ab50b9550c7d8d285942deb5 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Wed, 14 Feb 2024 19:01:09 +0100 Subject: [PATCH 11/21] Do not mention the Meta EIP which isn't ready yet --- EIPS/eip-3540.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index 8809c3b4cdcae..1e7d3700d89e9 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -94,7 +94,7 @@ The list of section headers is terminated with the *section headers terminator b ### EOF version 1 -EOF version 1 is made up of several EIPs, including this one, as enumerated in the EOF Meta EIP-???? (TBD). 
Some values in this specification are only discussed briefly. To understand the full scope of EOF, it is necessary to review each EIP in-depth. +EOF version 1 is made up of several EIPs, including this one. Some values in this specification are only discussed briefly. To understand the full scope of EOF, it is necessary to review each EIP in-depth. #### Container From f8c65c2f51e7b05b75eabc9ec5f1642f02a4b41a Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Tue, 20 Feb 2024 08:17:57 +0100 Subject: [PATCH 12/21] Update EIPS/eip-3540.md - fix 7480 link Co-authored-by: Andrei Maiboroda --- EIPS/eip-3540.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index 115239cb2138b..16a132eb2e6c1 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -229,7 +229,7 @@ We have considered different questions for the sections: ### Data-only contracts -Moved to the section [Lack of `EXTDATACOPY` in EIP-7480](./eip-7480#lack-of-extdatacopy.md). +Moved to the section [Lack of `EXTDATACOPY` in EIP-7480](./eip-7480.md#lack-of-extdatacopy). ### `PC` starts with 0 at the code section From 27242708b7175e796d219e166a11e47a4261b606 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Mon, 4 Mar 2024 17:50:31 +0100 Subject: [PATCH 13/21] Mention rejecting of old *CALL instructions --- EIPS/eip-3540.md | 1 + 1 file changed, 1 insertion(+) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index 16a132eb2e6c1..a4889a07172a9 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -171,6 +171,7 @@ For an EOF contract: - Execution starts at the first byte of code section 0, and PC is set to 0. 
- `PC` is scoped to the executing code section - `CODESIZE`, `CODECOPY`, `EXTCODESIZE`, `EXTCODECOPY`, `EXTCODEHASH`, `GAS` are rejected by validation in EOF contracts, with no replacements +- `CALL`, `DELEGATECALL`, `STATICCALL` are rejected by validation in EOF contracts, replacement instructions to be introduced in a separate EIP. - `DELEGATECALL2` (`DELEGATECALL` replacement) from an EOF contract to a legacy contract is disallowed, and it returns 0 to signal failure. Only initial gas cost of `DELEGATECALL2` is consumed (similarly to the call depth check) and the target address still becomes warm. We allow legacy to EOF path for existing proxy contracts to be able to use EOF upgrades. For a legacy contract: From 55902a5839ed90bc6c5df944f59e08b218432794 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Tue, 5 Mar 2024 14:42:29 +0100 Subject: [PATCH 14/21] Update names of instructions c.f. ipsilon/eof#64 --- EIPS/eip-3540.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index a4889a07172a9..ca555ec07a9a0 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -49,7 +49,7 @@ In order to guarantee that every EOF-formatted contract in the state is valid, w If code starts with the `MAGIC`, it is considered to be EOF formatted, otherwise it is considered to be *legacy* code. For clarity, the `MAGIC` together with a version number *n* is denoted as the *EOFn prefix*, e.g. *EOF1 prefix*. -EOF-formatted contracts are created using new instructions `CREATE3`, `CREATE4` and `RETURNCONTRACT`. They are introduced in a separate [EIP-7620](./eip-7620.md). +EOF-formatted contracts are created using new instructions `EOFCREATE`, `TXCREATE` and `RETURNCONTRACT`. They are introduced in a separate [EIP-7620](./eip-7620.md). 
The opcode `0xEF` is currently an undefined instruction, therefore: *It pops no stack items and pushes no stack items, and it causes an exceptional abort when executed.* This means legacy *initcode* or already deployed legacy *code* starting with this instruction will continue to abort execution. @@ -61,7 +61,7 @@ We introduce *code validation* for new contract creation. To achieve this, we de Legacy code is not affected by EOF code validation. -Code validation is performed during `CREATE4` instruction, and is elaborated on in [EIP-3670](./eip-3670.md) and the contract creation [EIP-7620](./eip-7620.md). +Code validation is performed during `TXCREATE` instruction, and is elaborated on in [EIP-3670](./eip-3670.md) and the contract creation [EIP-7620](./eip-7620.md). The EOF format itself and its formal validation are described in the following sections. ### Container specification @@ -172,7 +172,7 @@ For an EOF contract: - `PC` is scoped to the executing code section - `CODESIZE`, `CODECOPY`, `EXTCODESIZE`, `EXTCODECOPY`, `EXTCODEHASH`, `GAS` are rejected by validation in EOF contracts, with no replacements - `CALL`, `DELEGATECALL`, `STATICCALL` are rejected by validation in EOF contracts, replacement instructions to be introduced in a separate EIP. -- `DELEGATECALL2` (`DELEGATECALL` replacement) from an EOF contract to a legacy contract is disallowed, and it returns 0 to signal failure. Only initial gas cost of `DELEGATECALL2` is consumed (similarly to the call depth check) and the target address still becomes warm. We allow legacy to EOF path for existing proxy contracts to be able to use EOF upgrades. +- `EXTDCALL` (`DELEGATECALL` replacement) from an EOF contract to a legacy contract is disallowed, and it returns 0 to signal failure. Only initial gas cost of `EXTDCALL` is consumed (similarly to the call depth check) and the target address still becomes warm. We allow legacy to EOF path for existing proxy contracts to be able to use EOF upgrades. 
For a legacy contract: From ae9dccfea1f104e916917fe21e117f1a8f44c69a Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Thu, 7 Mar 2024 19:04:25 +0100 Subject: [PATCH 15/21] Update to revised EXT*CALL rename --- EIPS/eip-3540.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index ca555ec07a9a0..e4c4a1c5babc3 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -172,7 +172,7 @@ For an EOF contract: - `PC` is scoped to the executing code section - `CODESIZE`, `CODECOPY`, `EXTCODESIZE`, `EXTCODECOPY`, `EXTCODEHASH`, `GAS` are rejected by validation in EOF contracts, with no replacements - `CALL`, `DELEGATECALL`, `STATICCALL` are rejected by validation in EOF contracts, replacement instructions to be introduced in a separate EIP. -- `EXTDCALL` (`DELEGATECALL` replacement) from an EOF contract to a legacy contract is disallowed, and it returns 0 to signal failure. Only initial gas cost of `EXTDCALL` is consumed (similarly to the call depth check) and the target address still becomes warm. We allow legacy to EOF path for existing proxy contracts to be able to use EOF upgrades. +- `EXTDELEGATECALL` (`DELEGATECALL` replacement) from an EOF contract to a legacy contract is disallowed, and it returns 0 to signal failure. Only initial gas cost of `EXTDELEGATECALL` is consumed (similarly to the call depth check) and the target address still becomes warm. We allow legacy to EOF path for existing proxy contracts to be able to use EOF upgrades. 
For a legacy contract: From b45d6076c2e9bd59b211eb9e4d1e0d85eff089ed Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Thu, 7 Mar 2024 19:18:17 +0100 Subject: [PATCH 16/21] Remove circular EIP dependencies --- EIPS/eip-3540.md | 58 ++++++++++++++++++++++++------------------------ 1 file changed, 29 insertions(+), 29 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index e4c4a1c5babc3..f0401fd22d3fa 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -8,7 +8,7 @@ status: Review type: Standards Track category: Core created: 2021-03-16 -requires: 3541, 3860, 4750, 5450, 7620 +requires: 3541, 3860 --- ## Abstract @@ -49,7 +49,7 @@ In order to guarantee that every EOF-formatted contract in the state is valid, w If code starts with the `MAGIC`, it is considered to be EOF formatted, otherwise it is considered to be *legacy* code. For clarity, the `MAGIC` together with a version number *n* is denoted as the *EOFn prefix*, e.g. *EOF1 prefix*. -EOF-formatted contracts are created using new instructions `EOFCREATE`, `TXCREATE` and `RETURNCONTRACT`. They are introduced in a separate [EIP-7620](./eip-7620.md). +EOF-formatted contracts are created using new instructions which are introduced in a separate EIP. The opcode `0xEF` is currently an undefined instruction, therefore: *It pops no stack items and pushes no stack items, and it causes an exceptional abort when executed.* This means legacy *initcode* or already deployed legacy *code* starting with this instruction will continue to abort execution. @@ -61,7 +61,7 @@ We introduce *code validation* for new contract creation. To achieve this, we de Legacy code is not affected by EOF code validation. -Code validation is performed during `TXCREATE` instruction, and is elaborated on in [EIP-3670](./eip-3670.md) and the contract creation [EIP-7620](./eip-7620.md). +Code validation is performed during contract creation, and is elaborated on in separate EIPs. 
The EOF format itself and its formal validation are described in the following sections. ### Container specification @@ -117,35 +117,35 @@ types_section := (inputs, outputs, max_stack_height)+ #### Header -| name | length | value | description | -|------------------------|----------|---------------|-----------------------------------------------------------------------------------------| -| magic | 2 bytes | 0xEF00 | | -| version | 1 byte | 0x01 | EOF version | -| kind_type | 1 byte | 0x01 | kind marker for type section | -| type_size | 2 bytes | 0x0004-0x1000 | 16-bit unsigned big-endian integer denoting the length of the type section content, 4 bytes per code section | -| kind_code | 1 byte | 0x02 | kind marker for code size section | -| num_code_sections | 2 bytes | 0x0001-0x0400 | 16-bit unsigned big-endian integer denoting the number of the code sections | -| code_size | 2 bytes | 0x0001-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the code section content | -| kind_container | 1 byte | 0x03 | kind marker for container size section | -| num_container_sections | 2 bytes | 0x0001-0x00FF | 16-bit unsigned big-endian integer denoting the number of the container sections | -| container_size | 2 bytes | 0x0001-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the container section content | -| kind_data | 1 byte | 0x04 | kind marker for data size section | -| data_size | 2 bytes | 0x0000-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the data section content (*) | -| terminator | 1 byte | 0x00 | marks the end of the header | - -(*) For not yet deployed containers this can be greater than the actual content length, see Data Section Lifecycle section in the contract creation [EIP-7620](./eip-7620.md#data-section-lifecycle). 
+| name | length | value | description | +|------------------------|----------|---------------|--------------------------------------------------------------------------------------------------------------| +| magic | 2 bytes | 0xEF00 | | +| version | 1 byte | 0x01 | EOF version | +| kind_type | 1 byte | 0x01 | kind marker for type section | +| type_size | 2 bytes | 0x0004-0x1000 | 16-bit unsigned big-endian integer denoting the length of the type section content, 4 bytes per code section | +| kind_code | 1 byte | 0x02 | kind marker for code size section | +| num_code_sections | 2 bytes | 0x0001-0x0400 | 16-bit unsigned big-endian integer denoting the number of the code sections | +| code_size | 2 bytes | 0x0001-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the code section content | +| kind_container | 1 byte | 0x03 | kind marker for container size section | +| num_container_sections | 2 bytes | 0x0001-0x00FF | 16-bit unsigned big-endian integer denoting the number of the container sections | +| container_size | 2 bytes | 0x0001-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the container section content | +| kind_data | 1 byte | 0x04 | kind marker for data size section | +| data_size | 2 bytes | 0x0000-0xFFFF | 16-bit unsigned big-endian integer denoting the length of the data section content (*) | +| terminator | 1 byte | 0x00 | marks the end of the header | + +(*) For not yet deployed containers this can be greater than the actual content length. 
#### Body -| name | length | value | description | -|-------------------|----------|---------------|---------------------------------------------------------------------------------------| -| types_section | variable | n/a | stores code section metadata | -| inputs | 1 byte | 0x00-0x7F | number of stack elements the code section consumes | -| outputs | 1 byte | 0x00-0x80 | number of stack elements the code section returns or 0x80 for non-returning functions | -| max_stack_height | 2 bytes | 0x0000-0x03FF | maximum number of elements ever placed onto the operand stack by the code section | -| code_section | variable | n/a | arbitrary bytecode | -| container_section | variable | n/a | arbitrary EOF-formatted container | -| data_section | variable | n/a | arbitrary sequence of bytes | +| name | length | value | description | +|-------------------|----------|---------------|--------------------------------------------------------------------------------------------| +| types_section | variable | n/a | stores code section metadata | +| inputs | 1 byte | 0x00-0x7F | number of stack elements the code section consumes | +| outputs | 1 byte | 0x00-0x80 | number of stack elements the code section returns or 0x80 for non-returning functions | +| max_stack_height | 2 bytes | 0x0000-0x03FF | maximum number of elements ever placed onto the operand stack by the code section | +| code_section | variable | n/a | arbitrary bytecode | +| container_section | variable | n/a | arbitrary EOF-formatted container | +| data_section | variable | n/a | arbitrary sequence of bytes | See [EIP-4750](./eip-4750.md) for more information on the type section content. See [EIP-6206](./eip-6206.md) for more information on non-returning functions. 
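Reviewer note: the Header table in this patch can be read as a straightforward sequential layout. The following is a minimal Python sketch of a reader for that layout, assuming the field order given in the `header :=` grammar (type, code, optional container, data, terminator). The names `Eof1Header` and `parse_eof1_header` are hypothetical and not part of the EIP; the sketch only decodes header fields and does not apply the validation rules.

```python
from dataclasses import dataclass
from typing import List

MAGIC = b"\xEF\x00"
KIND_TYPE, KIND_CODE, KIND_CONTAINER, KIND_DATA = 0x01, 0x02, 0x03, 0x04


@dataclass
class Eof1Header:
    """Decoded header fields (hypothetical helper type, not defined by the EIP)."""
    type_size: int
    code_sizes: List[int]
    container_sizes: List[int]
    data_size: int
    header_len: int  # offset of the first body byte


def parse_eof1_header(code: bytes) -> Eof1Header:
    pos = 0

    def take(n: int) -> bytes:
        nonlocal pos
        if pos + n > len(code):
            raise ValueError("truncated header")
        chunk = code[pos:pos + n]
        pos += n
        return chunk

    def u16() -> int:
        # All integers are big-endian per the spec.
        return int.from_bytes(take(2), "big")

    def expect(value: int, name: str) -> None:
        got = take(1)[0]
        if got != value:
            raise ValueError(f"expected {name}=0x{value:02x}, got 0x{got:02x}")

    if take(2) != MAGIC:
        raise ValueError("missing EOF magic")
    expect(0x01, "version")
    expect(KIND_TYPE, "kind_type")
    type_size = u16()
    expect(KIND_CODE, "kind_code")
    num_code_sections = u16()
    code_sizes = [u16() for _ in range(num_code_sections)]

    # The container size section is optional, so peek at the next kind byte.
    container_sizes: List[int] = []
    if pos < len(code) and code[pos] == KIND_CONTAINER:
        take(1)
        num_container_sections = u16()
        container_sizes = [u16() for _ in range(num_container_sections)]

    expect(KIND_DATA, "kind_data")
    data_size = u16()
    expect(0x00, "terminator")
    return Eof1Header(type_size, code_sizes, container_sizes, data_size, pos)
```

For a container with one code section, no container sections and an empty data section, the header read this way is 15 bytes long, which agrees with the minimum header size stated in the earlier revision of the validation rules.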
From 94f9d2750e557270520dc74c39802e2ede09d55f Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Mon, 11 Mar 2024 13:46:15 +0100 Subject: [PATCH 17/21] Remove conflicting spec on EXTDELEGATECALL status code returned --- EIPS/eip-3540.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index f0401fd22d3fa..135dc9ba0183d 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -172,7 +172,7 @@ For an EOF contract: - `PC` is scoped to the executing code section - `CODESIZE`, `CODECOPY`, `EXTCODESIZE`, `EXTCODECOPY`, `EXTCODEHASH`, `GAS` are rejected by validation in EOF contracts, with no replacements - `CALL`, `DELEGATECALL`, `STATICCALL` are rejected by validation in EOF contracts, replacement instructions to be introduced in a separate EIP. -- `EXTDELEGATECALL` (`DELEGATECALL` replacement) from an EOF contract to a legacy contract is disallowed, and it returns 0 to signal failure. Only initial gas cost of `EXTDELEGATECALL` is consumed (similarly to the call depth check) and the target address still becomes warm. We allow legacy to EOF path for existing proxy contracts to be able to use EOF upgrades. +- `EXTDELEGATECALL` (`DELEGATECALL` replacement) from an EOF contract to a legacy contract is disallowed, and it should fail in the same mode as if the call depth check failed. We allow legacy to EOF path for existing proxy contracts to be able to use EOF upgrades. 
For a legacy contract: From edf3dd236bebaf822a80e9658ebbd80f9e18bcff Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Wed, 13 Mar 2024 15:05:26 +0100 Subject: [PATCH 18/21] Postpone mentions of non-returning functions to EIP-6206 --- EIPS/eip-3540.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index 135dc9ba0183d..4c61b5df66680 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -141,14 +141,13 @@ types_section := (inputs, outputs, max_stack_height)+ |-------------------|----------|---------------|--------------------------------------------------------------------------------------------| | types_section | variable | n/a | stores code section metadata | | inputs | 1 byte | 0x00-0x7F | number of stack elements the code section consumes | -| outputs | 1 byte | 0x00-0x80 | number of stack elements the code section returns or 0x80 for non-returning functions | +| outputs | 1 byte | 0x00-0x7F | number of stack elements the code section returns | | max_stack_height | 2 bytes | 0x0000-0x03FF | maximum number of elements ever placed onto the operand stack by the code section | | code_section | variable | n/a | arbitrary bytecode | | container_section | variable | n/a | arbitrary EOF-formatted container | | data_section | variable | n/a | arbitrary sequence of bytes | See [EIP-4750](./eip-4750.md) for more information on the type section content. -See [EIP-6206](./eip-6206.md) for more information on non-returning functions. #### EOF version 1 validation rules From 35198749ca3fa0107c91366bf5d60520821a5093 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Mon, 15 Apr 2024 11:59:42 +0200 Subject: [PATCH 19/21] Respond to feedback, incl. 
mention non-returning --- EIPS/eip-3540.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index 4c61b5df66680..49ddccaba4880 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -149,6 +149,8 @@ types_section := (inputs, outputs, max_stack_height)+ See [EIP-4750](./eip-4750.md) for more information on the type section content. +**NOTE**: A special value of `outputs` being `0x80` is designated to denote non-returning functions as defined in a separate EIP. + #### EOF version 1 validation rules The following validity constraints are placed on the container format: @@ -156,10 +158,10 @@ The following validity constraints are placed on the container format: - `version` must be `0x01` - `types_size` is divisible by `4` - the number of code sections must be equal to `types_size / 4` -- the number of code sections must not exceed `1024` +- the number of code sections must be greater than `0` and not exceed `1024` - `code_size` may not be `0` -- the number of container sections must not exceed `256` -- `container_size` may not be `0`, but container sections are optional +- the number of container sections must not exceed `256`. 
The number of container sections may not be `0`, if declared in the header +- `container_size` may not be `0` - data section is mandatory, but `data_size` may be `0` - data body length may be shorter than `data_size` for a not yet deployed container From 111806807cb3b78617534ac32af62652d0535eb5 Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Tue, 16 Apr 2024 12:23:45 +0200 Subject: [PATCH 20/21] Respond to feedback - fix various references --- EIPS/eip-3540.md | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index 49ddccaba4880..6f0cd92493afc 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -87,7 +87,7 @@ The list of section headers is terminated with the *section headers terminator b #### Container validation rules -1. `version` MUST NOT be `0`.[^1](#eof-version-range-start-with-1) +1. `version` MUST NOT be `0`. 2. `section_kind` MUST NOT be `0`. The value `0` is reserved for *section headers terminator byte*. 3. There MUST be at least one section (and therefore section header). 5. Stray bytes outside of sections MUST NOT be present. This includes trailing bytes after the last section. @@ -231,11 +231,7 @@ We have considered different questions for the sections: ### Data-only contracts -Moved to the section [Lack of `EXTDATACOPY` in EIP-7480](./eip-7480.md#lack-of-extdatacopy). - -### `PC` starts with 0 at the code section - -The value for `PC` is specified to start at 0 and to be within the active *code* section. An alternative was keeping `PC` to operate on the whole *container*. However, the new EOF EVM should only care about traversing *code*. +See section [Lack of `EXTDATACOPY` in EIP-7480](./eip-7480.md#lack-of-extdatacopy). 
### EOF1 contracts can only `DELEGATECALL` EOF1 contracts From 088f5e25cf537858f1556f2215ac2ed1e0eccf0f Mon Sep 17 00:00:00 2001 From: pdobacz <5735525+pdobacz@users.noreply.github.com> Date: Tue, 16 Apr 2024 14:30:17 +0200 Subject: [PATCH 21/21] Remove all considerations about PC, defer it to 4750 --- EIPS/eip-3540.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/EIPS/eip-3540.md b/EIPS/eip-3540.md index 6f0cd92493afc..2f013de4065ba 100644 --- a/EIPS/eip-3540.md +++ b/EIPS/eip-3540.md @@ -169,8 +169,7 @@ The following validity constraints are placed on the container format: For an EOF contract: -- Execution starts at the first byte of code section 0, and PC is set to 0. -- `PC` is scoped to the executing code section +- Execution starts at the first byte of code section 0 - `CODESIZE`, `CODECOPY`, `EXTCODESIZE`, `EXTCODECOPY`, `EXTCODEHASH`, `GAS` are rejected by validation in EOF contracts, with no replacements - `CALL`, `DELEGATECALL`, `STATICCALL` are rejected by validation in EOF contracts, replacement instructions to be introduced in a separate EIP. - `EXTDELEGATECALL` (`DELEGATECALL` replacement) from an EOF contract to a legacy contract is disallowed, and it should fail in the same mode as if the call depth check failed. We allow legacy to EOF path for existing proxy contracts to be able to use EOF upgrades.
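Reviewer note: the validity constraints on the container format, as they stand after the last revision in this series, can be checked mechanically once the header values are known. Below is a minimal Python sketch under stated assumptions: the function name `check_eof1_constraints` and its parameters are hypothetical, `container_sizes` is passed as `None` when the header declares no container size section, and the upper bound of `256` container sections follows the rule list (the Header table gives `0x00FF` as the maximum, so the two may need reconciling). The data-body length rule for not yet deployed containers is not covered.

```python
from typing import List, Optional


def check_eof1_constraints(
    version: int,
    types_size: int,
    code_sizes: List[int],
    container_sizes: Optional[List[int]],
    data_size: int,
) -> List[str]:
    """Return the list of violated constraints; an empty list means none were found."""
    errors = []
    if version != 0x01:
        errors.append("version must be 0x01")
    if types_size % 4 != 0:
        errors.append("types_size must be divisible by 4")
    # Compare as a product so a non-multiple of 4 can never match by rounding.
    if len(code_sizes) * 4 != types_size:
        errors.append("number of code sections must equal types_size / 4")
    if not 0 < len(code_sizes) <= 1024:
        errors.append("number of code sections must be in 1..1024")
    if any(size == 0 for size in code_sizes):
        errors.append("code_size may not be 0")
    if container_sizes is not None:
        # Declared in the header: the count may not be 0 and may not exceed 256.
        if not 0 < len(container_sizes) <= 256:
            errors.append("number of container sections must be in 1..256 if declared")
        if any(size == 0 for size in container_sizes):
            errors.append("container_size may not be 0")
    # The data section is mandatory, but data_size itself may be 0.
    if not 0 <= data_size <= 0xFFFF:
        errors.append("data_size must fit in 16 bits")
    return errors
```

Returning a list of messages rather than raising on the first failure is only a convenience for review; a client would more likely reject the container at the first violated rule.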