From fcb58d33257d80ff6b6f7489c2fc6432efb89263 Mon Sep 17 00:00:00 2001 From: Oleksandr Zarudnyi Date: Thu, 19 Dec 2024 22:45:21 +0800 Subject: [PATCH] fix: replace the EraVM extension page with zksolc docs --- .../50.compiler/10.toolchain/10.index.md | 54 ++------ .../10.toolchain/{40.llvm.md => 20.llvm.md} | 0 .../50.compiler/10.toolchain/20.solidity.md | 121 ------------------ .../50.compiler/10.toolchain/30.vyper.md | 49 ------- .../20.specification/50.evmla-translator.md | 15 ++- .../60.instructions/10.index.md | 22 +--- .../60.instructions/21.extensions/00.index.md | 9 -- .../60.instructions/21.extensions/10.call.md | 86 ------------- .../21.extensions/20.verbatim.md | 87 ------------- .../60.instructions/21.extensions/_dir.yml | 1 - .../60.instructions/40.yul.md | 4 +- 11 files changed, 25 insertions(+), 423 deletions(-) rename content/20.zksync-protocol/50.compiler/10.toolchain/{40.llvm.md => 20.llvm.md} (100%) delete mode 100644 content/20.zksync-protocol/50.compiler/10.toolchain/20.solidity.md delete mode 100644 content/20.zksync-protocol/50.compiler/10.toolchain/30.vyper.md delete mode 100644 content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/00.index.md delete mode 100644 content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/10.call.md delete mode 100644 content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/20.verbatim.md delete mode 100644 content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/_dir.yml diff --git a/content/20.zksync-protocol/50.compiler/10.toolchain/10.index.md b/content/20.zksync-protocol/50.compiler/10.toolchain/10.index.md index 77423a59..e4de1eae 100644 --- a/content/20.zksync-protocol/50.compiler/10.toolchain/10.index.md +++ b/content/20.zksync-protocol/50.compiler/10.toolchain/10.index.md @@ -3,21 +3,15 @@ title: Compiler toolchain overview description: --- -This section introduces the zkEVM LLVM-based compiler toolchain for smart contract languages with Ethereum Virtual Machine (EVM) support. +This section introduces an LLVM-based compiler toolchain for Solidity and Vyper. The toolchain works on top of existing compilers and requires their output, which typically includes intermediate representations (IRs), abstract syntax trees (ASTs), and auxiliary contract metadata and documentation. -::callout{icon="i-heroicons-information-circle" color="blue"} -At the time of writing, we support Solidity and Vyper. -:: - The toolchain consists of the following: 1. [High-level source code compilers](#high-level-source-code-compilers): `solc` and `vyper`. 2. [IR compilers, front ends to LLVM](#ir-compilers): `zksolc` and `zkvyper`. -3. [The LLVM framework](/zksync-protocol/compiler/toolchain/llvm) with a zkEVM back end which emits zkEVM text assembly. -4. [The assembler](#assembler) which produces the zkEVM bytecode from text assembly. -5. [Hardhat plugins](#hardhat-plugins) which set up the environment. +3. [The LLVM framework](/zksync-protocol/compiler/toolchain/llvm) with EraVM and EVM back ends. ![Compiler Toolchain Visualization](/images/zk-stack/compiler-toolchain.png "Compiler Toolchain") @@ -26,10 +20,10 @@ The toolchain consists of the following: High-level source code is processed by third-party compilers. These compilers do the following: 1. Process and validate the high-level source code. -2. Translate the source code into IR and metadata. -3. Pass the IR and metadata to our IR compilers via the standard I/O streams. +2. Translate the source code to artifacts, namely IR and metadata. +3. Pass the IR and metadata to our IR compilers via stdout using standard JSON I/O. -We are using two high-level source code compilers at the time of writing: +We are using two high-level source code compilers: - [solc](https://github.com/ethereum/solc-bin): the official Solidity compiler. For more info, see the latest [Solidity documentation](https://docs.soliditylang.org/en/latest/). - [vyper](https://github.com/vyperlang/vyper/releases): the official Vyper compiler. For more info, see the latest [Vyper documentation](https://docs.vyperlang.org/en/latest/index.html). @@ -43,43 +37,21 @@ to build smart contracts on ZKsync Era. ## IR Compilers -Our toolchain includes LLVM front ends, written in Rust, that process the output of high-level source code compilers: +Our toolchain includes LLVM front ends written in Rust. The front ends process the output of high-level source code compilers: -- [zksolc](%%zk_git_repo_zksolc-bin%%) which calls `solc` as a child process. For more info, see the latest [zksolc documentation](/zksync-protocol/compiler/toolchain/solidity). -- [zkvyper](%%zk_git_repo_zkvyper-bin%%): which calls `vyper` as a child process. For more info, see the latest [zkvyper documentation](/zksync-protocol/compiler/toolchain/vyper). +- [zksolc](%%zk_git_repo_zksolc-bin%%) which calls `solc` as a child process. For more info, see the latest [zksolc documentation](https://matter-labs.github.io/era-compiler-solidity/latest/). +- [zkvyper](%%zk_git_repo_zkvyper-bin%%): which calls `vyper` as a child process. For more info, see the latest [zkvyper documentation](https://matter-labs.github.io/era-compiler-vyper/latest/). These IR compilers perform the following steps: -1. Receive the input, which is usually standard or combined JSON passed by the Hardhat plugin via standard input. -2. Save the relevant data, modify the input with zkEVM settings, and pass it to the underlying high-level source code compiler -which is called as a child process. +1. Receive the input, which is usually standard or combined JSON passed by a tool such as Foundry or Hardhat via standard input. +2. Save the relevant data, modify the input, and pass it to the underlying high-level source code compiler +called as a child process. 3. Receive the IR and metadata from the underlying compiler. 4. Translate the IR into LLVM IR, resolving dependencies with the help of metadata. -5. Optimize the LLVM IR with the powerful LLVM framework optimizer and emit zkEVM text assembly. +5. Optimize the LLVM IR with the powerful LLVM framework optimizer and emit bytecode. 6. Print the output matching the format of the input method the IR compiler is called with. Our IR compilers leverage I/O mechanisms which already exist in the high-level source code -compilers. They may modify the input and output to some extent, add data for features unique to zkEVM, +compilers. They may modify the input and output to some extent, add data for features unique to ZKsync EraVM, and remove unsupported feature artifacts. - -## Assembler - -The [assembler](%%zk_git_repo_era-zkEVM-assembly%%), which is written in Rust, compiles zkEVM assembly -to zkEVM bytecode. This tool is not a part of our LLVM back end as it uses several cryptographic libraries which are -easier to maintain outside of the framework. - -## Hardhat Plugins - -We recommend using our IR compilers via [their corresponding Hardhat plugins](/zksync-era/tooling/hardhat). -Add these plugins to the Hardhat's config file to compile new projects or migrate -existing ones to ZKsync Era. For a lower-level approach, download our compiler binaries via the -links above and use their CLI interfaces. - -### Installing and configuring plugins - -Add the plugins below to the Hardhat's config file to compile new projects or migrate -existing ones to ZKsync Era. For a lower-level approach, download our compiler binaries -[links above](#ir-compilers) and use their CLI interfaces. - -- [hardhat-zksync-solc documentation](/zksync-era/tooling/hardhat/plugins/hardhat-zksync-solc) -- [hardhat-zksync-vyper documentation](/zksync-era/tooling/hardhat/plugins/hardhat-zksync-vyper) diff --git a/content/20.zksync-protocol/50.compiler/10.toolchain/40.llvm.md b/content/20.zksync-protocol/50.compiler/10.toolchain/20.llvm.md similarity index 100% rename from content/20.zksync-protocol/50.compiler/10.toolchain/40.llvm.md rename to content/20.zksync-protocol/50.compiler/10.toolchain/20.llvm.md diff --git a/content/20.zksync-protocol/50.compiler/10.toolchain/20.solidity.md b/content/20.zksync-protocol/50.compiler/10.toolchain/20.solidity.md deleted file mode 100644 index c2c50c6c..00000000 --- a/content/20.zksync-protocol/50.compiler/10.toolchain/20.solidity.md +++ /dev/null @@ -1,121 +0,0 @@ ---- -title: Solidity compiler -description: ---- - -The compiler we provide as a part of our toolchain is called [zksolc](%%zk_git_repo_zksolc-bin%%). It -operates on IR and metadata received from the underlying [solc](https://docs.soliditylang.org/en/latest/) compiler, -which must be available in `$PATH`, or its path must be explicitly passed via the CLI (command-line interface). - -::callout{icon="i-heroicons-exclamation-triangle" color="amber"} -To safeguard the security and efficiency of your application, always use the latest compiler version. -:: - -## Usage - -Make sure your machine satisfies the [system requirements](%%zk_git_repo_era-compiler-solidity%%/tree/main#system-requirements). - -Using our compiler via the Hardhat plugin usually suffices. However, knowledge of its interface and I/O (input/output) -methods are crucial for integration, debugging, or contribution purposes. - -The CLI supports several I/O modes: - -1. Standard JSON. -2. Combined JSON. -3. Free-form output. - -All three modes use the standard JSON `solc` interface internally. This reduces the complexity of the `zksolc` -interface and facilitates testing. - -### Standard JSON - -The `zksolc` standard JSON I/O workflow closely follows that of the official `solc` compiler. However, `zksolc` does not -support some configuration settings which are only relevant to the EVM architecture. - -Additional zkEVM data is supported by `zksolc` but is omitted when passed to `solc`: - -- `settings/optimizer/mode`: sets the optimization mode. Available values: `0`, `1`, `2`, `3`, `s`, `z`. The default - setting is `3`. See [LLVM optimizer](llvm#optimizer). -- `settings/optimizer/fallback_to_optimizing_for_size`: tries to compile again in `z` mode if the bytecode is too large for zkEVM. -- `settings/optimizer/disable_system_request_memoization`: disables the memoization of data received in requests to System Contracts. - -Unsupported sections of the input JSON, ignored by `zksolc`: - -- `sources//urls` -- `sources/destructible` -- `settings/stopAfter` -- `settings/evmVersion` -- `settings/debug` -- `settings/metadata`: for zkEVM you can only append `keccak256` metadata hash to the bytecode. -- `settings/modelChecker` - -Additional zkEVM data inserted by `zksolc`: - -- `long_version`: the full `solc` version output. -- `zk_version`: the `zksolc` version. -- `contract/hash`: the hash of the zkEVM bytecode. -- `contract/factory_dependencies`: bytecode hashes of contracts created in the current contract with `CREATE`. - -[More details here](/zksync-protocol/differences/contract-deployment#note-on-factory-deps). - -Unsupported sections of the output JSON, ignored by `zksolc`: - -- `contracts///evm/bytecode`: replaced with a JSON object with zkEVM build data. -- `contracts///ewasm` - -See the complete standard JSON data structures in [the zksolc repository](%%zk_git_repo_era-compiler-solidity%%/tree/main/src/solc/standard_json). - -### Combined JSON - -The `zksolc` standard JSON I/O workflow closely follows that of the official `solc` compiler. However, `zksolc` does not -support some configuration settings which are only relevant to the EVM architecture. - -Combined JSON is only an output format; there is no combined JSON input format. Instead, CLI arguments are -used for configuration. - -Additional zkEVM data, inserted by `zksolc`: - -- `zk_version`: the version of `zksolc`. -- `contract/factory_deps`: bytecode hashes of contracts created by the current contract with `CREATE`. - -[More details here](/zksync-protocol/differences/contract-deployment#note-on-factory-deps). - -Unsupported combined JSON flags, rejected by `zksolc`: - -- `function-debug` -- `function-debug-runtime` -- `generated-sources` -- `generated-sources-runtime` -- `opcodes` -- `srcmap` -- `srcmap-runtime` - -For more information, see the complete combined JSON data structures in [the zksolc repository](%%zk_git_repo_era-compiler-solidity%%/tree/main/src/solc/combined_json). - -### Free-form output - -This output format is utilized in Yul and LLVM IR compilation modes. These modes currently only support compiling a single -file. Only `--asm` and `--bin` output flags are supported, so this mode can be useful for debugging and prototyping. - -## Limitations - -Currently, Solidity versions as old as `0.4.12` are supported, although **we strongly recommend using** the latest -supported revision of `0.8`, as older versions contain known bugs and have limitations dictated by the absence of IR with -sufficient level of abstraction over EVM. - -Projects written in Solidity `>=0.8` are compiled by default through the Yul pipeline, whereas those written in `<=0.7` are compiled -via EVM legacy assembly which is a less friendly IR due to its obfuscation of control-flow and call graphs. -Due to this obfuscation, there are several limitations in ZKsync for contracts written in Solidity `<=0.7`: - -1. Recursion on the stack is not supported. -2. Internal function pointers are not supported. -3. Contract size and performance may be affected. - -## Using libraries - -The usage of libraries in Solidity is supported in ZKsync Era with the following considerations: - -- If a Solidity library can be inlined (i.e. it only contains `private` or `internal` methods), it can be used without - any additional configuration. -- However, if a library contains at least one `public` or `external` method, it cannot be inlined and its address needs - to be passed explicitly to the compiler; see [compiling non-inlinable libraries](/zksync-era/tooling/hardhat/guides/compiling-libraries#compiling-non-inlinable-libraries). diff --git a/content/20.zksync-protocol/50.compiler/10.toolchain/30.vyper.md b/content/20.zksync-protocol/50.compiler/10.toolchain/30.vyper.md deleted file mode 100644 index 4cc73d95..00000000 --- a/content/20.zksync-protocol/50.compiler/10.toolchain/30.vyper.md +++ /dev/null @@ -1,49 +0,0 @@ ---- -title: Vyper compiler -description: ---- - -The Vyper compiler we provide as part of our toolchain is called [zkvyper](%%zk_git_repo_zkvyper-bin%%). It -operates on Vyper’s LLL IR, and metadata received from the underlying [vyper](https://docs.vyperlang.org/en/latest/index.html) compiler, -which must be available in `$PATH`, or its path must be explicitly passed via the CLI (command-line interface). - -::callout{icon="i-heroicons-exclamation-triangle" color="amber"} -To safeguard the security and efficiency of your application, always use the latest compiler version. -:: - -## Usage - -Make sure your machine satisfies the [system requirements](%%zk_git_repo_era-compiler-vyper%%/tree/main#system-requirements). - -Using our compiler via the Hardhat plugin usually suffices. However, knowledge of its interface and I/O (input/output) -methods are crucial for integration, debugging, or contribution purposes. - -#### Combined JSON - -The `zkvyper` standard JSON I/O workflow closely follows that of the official `vyper` compiler. However, `zkvyper` does not -support some configuration settings which are only relevant to the EVM architecture. - -Combined JSON is only an output format; there is no combined JSON input format. Instead, CLI arguments are -used for configuration. - -Additional zkEVM data is inserted into the output combined JSON by `zksolc`: - -- `zk_version`: the `zksolc` version. -- `contract/factory_deps`: bytecode hashes of contracts created in the current contract with `CREATE`. - Since Vyper does not support `CREATE` directly, only the forwarder can be present in this mapping. - - [More details here](/zksync-protocol/differences/contract-deployment#note-on-factory-deps). - -Regardless of the requested output, only the `combined_json`, `abi`, `method_identifiers`, `bytecode`, `bytecode_runtime` -flags are supported, while the rest are ignored. - -Other output formats are available via the `-f` option. Check out `vyper --help` for more details. - -## Limitations - -Versions from 0.3.4 to 0.3.8 are not supported. The only supported versions are 0.3.3, 0.3.9, 0.3.10. - -Also, since there is no separation of deploy and runtime code on EraVM, the following Vyper built-ins are not supported: - -- `create_copy_of` -- `create_from_blueprint` diff --git a/content/20.zksync-protocol/50.compiler/20.specification/50.evmla-translator.md b/content/20.zksync-protocol/50.compiler/20.specification/50.evmla-translator.md index a94f534f..9e1dec97 100644 --- a/content/20.zksync-protocol/50.compiler/20.specification/50.evmla-translator.md +++ b/content/20.zksync-protocol/50.compiler/20.specification/50.evmla-translator.md @@ -3,20 +3,21 @@ title: EVM Legacy Assembly translator description: --- -There are two Solidity IRs used in our pipeline: Yul and EVM legacy assembly. The former is used for older versions of +There are two Solidity IRs used in our pipeline: Yul and EVM legacy assembly. The latter is used for older versions of Solidity, more precisely <=0.7. EVM legacy assembly is very challenging to translate to LLVM IR, since it obfuscates the control flow of the program and -uses a lot of dynamic jumps. Most of the jumps can be translated to static ones by using a static analysis of EVM assembly, -but some of jumps are impossible to resolve statically. For example, internal function pointers can be written -to memory or storage, and then loaded and called. Recursion is another case we have skipped for now, as there is another -stack frame allocated on every iteration, preventing the static analyzer from resolving the jumps. +uses a lot of dynamic jumps. Most of the jumps can be translated to static ones by using a form of static analysis, +but some of them are impossible to resolve this way. There are several issues with the existing codegen of the original *solc* compiler: +1. Internal function pointers are written to memory or storage, and then loaded and called dynamically. +2. With local recursion, there is another stack frame allocated on every iteration. +3. Some try-catch patterns leave values on the stack, hindering stack analysis. -Both issues are being worked on in our fork of the Solidity compiler, where we are changing the codegen to remove the +All the issues have been resolved in our fork of the Solidity compiler, where we have changed the codegen to remove the dynamic jumps and add the necessary metadata. Below you can see a minimal example of a Solidity contract and its EVM legacy assembly translated to LLVM IR which is -eventually compiled to EraVM assembly. +eventually compiled to EraVM bytecode. ## Source Code diff --git a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/10.index.md b/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/10.index.md index 07fd802d..01895324 100644 --- a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/10.index.md +++ b/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/10.index.md @@ -15,7 +15,7 @@ stated explicitly in the description of each instruction. ## Addressing modes EraVM is a register-based virtual machine with different addressing modes. -It overrides all stack mechanics described in [the original EVM opcodes documentation](https://www.evm.codes/) including +It overrides all stack mechanics described in [the official documentation of EVM opcodes](https://www.evm.codes/) including errors they produce on EVM. ## Solidity Intermediate Representations (IRs) @@ -27,22 +27,4 @@ Every instruction is translated via two IRs available in the Solidity compiler u ## Yul Extensions -At the moment there is no way of adding ZKsync-specific instructions to Yul as long as we use the official Solidity -compiler, which would produce an error on an unknown instruction. - -There are two ways of supporting such instructions: one for Solidity and one for Yul. - -### The Solidity Mode - -In Solidity we have introduced **call simulations**. They are not actual calls, as they are substituted by our Yul -translator with the needed instruction, depending on the constant address. This way the Solidity compiler is not -optimizing them out and is not emitting compilation errors. - -The reference of such extensions is coming soon. - -### The Yul Mode - -The non-call ZKsync-specific instructions are only available in the Yul mode of **zksolc**. -To have better compatibility, they are implemented as `verbatim` instructions with some predefined keys. - -The reference of such extensions is coming soon. +ZKsync EraVM introduced a set of EraVM-specific instructions. The set is documented at [the official *zksolc* documentation](https://matter-labs.github.io/era-compiler-solidity/latest/). \ No newline at end of file diff --git a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/00.index.md b/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/00.index.md deleted file mode 100644 index 45caeb1f..00000000 --- a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/00.index.md +++ /dev/null @@ -1,9 +0,0 @@ ---- -title: Overview -description: ---- - -Since we have no control over the Solidity compiler, we are using temporary hacks to support ZKsync-specific instructions: - -- [Call substitutions](/zksync-protocol/compiler/specification/instructions/extensions/call) -- [Verbatim substitutions](/zksync-protocol/compiler/specification/instructions/extensions/verbatim) diff --git a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/10.call.md b/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/10.call.md deleted file mode 100644 index 84f595d8..00000000 --- a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/10.call.md +++ /dev/null @@ -1,86 +0,0 @@ ---- -title: ZKsync Era Extension Simulation (call) -description: ---- - -::callout{icon="i-heroicons-light-bulb" color="blue"} -NOTES: - -- changed META - it can be used for MSIZE simulation -- setting ergs per pubdata is done by separate opcode now (not part of `near_call`) -- incrementing TX counter is done by separate opcode now (not part of `far_call`) -:: - -Our VM has some opcodes that are not expressible in Solidity, -but we can simulate them on compiler level by abusing “CALL” instruction. -We use 2nd parameter of “CALL” (address) as a marker, -and remaining 6 parameters as input parameters -(we use “address”-like field since it’s kind of shorter type, if assembly block cares about types in Solidity). -Unfortunately “CALL” returns only 1 stack parameter, but it looks sufficient for our purposes. - -Please note, that some of the methods don’t modify state, -so STATICCALL instead of CALL should be used for them. -The type of the needed method is indicated in the rightmost column. - -Call types are not validated and do not affect the simulation behavior, -unless specified otherwise, like in `raw_far_call` and `system_call` simulations, where the call type is passed through. - -For some simulations below we assume that there exist a hidden global pseudo-variable called `ACTIVE_PTR` for manipulations, -since one can not easily load pointer value into Solidity’s variable. - -| Simulated opcode | CALL param 0 (gas) | CALL param 1 (address) | CALL param 2 (value) | CALL param 3 (input offset) | CALL param 4 (input length) | CALL param 5 (output offset) | CALL param 6 (output length) | Return value | call type | LLVM implementation | Motivation | -| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | -| to_l1(is_first, in1, in2) | if_first (bool) | 0xFFFF | in1 (u256) | in2 (u256) | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | _ | call | @llvm.syncvm.tol1(i256 %in1, i256 %in2, i256 %is_first) | Send messages to L1 | -| code_source | 0 | 0xFFFE | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | address | staticcall | @llvm.syncvm.context(i256 %param) ; param == 2 (see SyncVM.h) | Largely to be able to catch “delegatecalls” in system contracts (by comparing this == code_source) | -| precompile(in1, ergs_to_burn, out0) | in1 (u256) | 0xFFFD | - | ergs_to_burn (u32) | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | out0 | staticcall | @llvm.syncvm.precompile(i256 %in1, i256 %ergs) | way to trigger call to precompile in VM | -| decommit(versioned_hash, ergs_to_burn, out0) | versioned_hash (u256) | 0xFFDD | - | ergs_to_burn (u32) | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | out0 | staticcall | saves the result pointer to @ptr_decommit | way to trigger decommit in VM | -| meta | 0 | 0xFFFC | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | [u256 tight packing](https://github.com/matter-labs/EraVM_opcode_defs/blob/b000abebc27f88919e0087b7604b8c71ba5b3daf/src/definitions/abi/meta.rs#L6) | staticcall | @llvm.syncvm.context(i256 %param) ; param == 3 (see SyncVM.h) | way to trigger call to meta information about some small pieces of the state in VM | -| mimic_call(to, abi_data, implicit r5 = who to mimic) | who_to_call | 0xFFFB | 0 | abi_data | who_to_mimic | 0 | 0 | WILL mess up the registers and WILL use r1-r4 for our standard ABI convention and r5 for the extra who_to_mimic argument | any in the code; mimic call in the bytecode | Runtime {i256, i1} __mimiccall(i256, i256, i256, {i256, i1}) | | -| system_mimic_call(to, abi_data, implicit r3, r4, r5 = who to mimic) | who_to_call | 0xFFFA | 0 | abi_data | who_to_mimic | value_to_put_into_r3 | value_to_put_into_r4 | WILL mess up the registers and WILL use r1-r4 for our standard ABI convention and r5 for the extra who_to_mimic argument | any in the code; mimic call in the bytecode | Runtime *{i256, i1} __mimiccall(i256, i256, i256, {i256, i1}) | | -| mimic_call_byref(to, ACTIVE_PTR, implicit r5 = who to mimic) | who_to_call | 0xFFF9 | 0 | 0 | who_to_mimic | 0 | 0 | WILL mess up the registers and WILL use r1-r4 for our standard ABI convention and r5 for the extra who_to_mimic argument | any in the code; mimic call in the bytecode | Runtime {i256, i1} __mimiccall(i8 addrspace(3), i256, i256, {i256, i1}) | Same as one above, but takes ABI data from ACTIVE_PTR | -| system_mimic_call_byref(to, ACTIVE_PTR, implicit r3, r4, r5 = who to mimic) | who_to_call | 0xFFF8 | 0 | 0 | who_to_mimic | value_to_put_into_r3 | value_to_put_into_r4 | WILL mess up the registers and WILL use r1-r4 for our standard ABI convention and r5 for the extra who_to_mimic argument | any in the code; mimic call in the bytecode | Runtime {i256, i1} __mimiccall(i8 addrspace(3), i256, i256, {i256, i1}) | Same as one above, but takes ABI data from ACTIVE_PTR | -| raw_far_call | who_to_call | 0xFFF7 | 0 | 0 | abi_data (CAN be with “to system = true”) | output_offset | output_length | Same as for EVM call | call | static | delegate (the call type is preserved)
It’s very similar to “system_call” described below, but for the cases when we only need to have to_system = true set in ABI (responsibility of the user, NOT the compiler), but we do not actually need to pass anything through r3 and r4 (so we can save on setting them or zeroing them, whatever) | -| raw_far_call_byref | who_to_call | 0xFFF6 | 0 | 0 | 0xFFFF to prevent optimizing out by Yul | output_offset | output_length | Same as for EVM call | call | static | delegate (the call type is preserved)
Same as one above, but takes ABI data from ACTIVE_PTR | -| system_call | who_to_call | 0xFFF5 | value_to_put_into_r3 (only for call with 7 arguments) | value_to_put_into_r4 | abi_data (MUST have “to system” set) | value_to_put_into_r5 | value_to_put_into_r6 | Same as for EVM call | call | static | delegate (the call type is preserved) to call system contracts, like MSG_VALUE_SIMULATOR. We may need 4 different formal definitions for cases when we would want to have integer/ptr in r3 and r4 | -| system_call_byref | who_to_call | 0xFFF4 | value_to_put_into_r3 (only for call with 7 arguments) | value_to_put_into_r4 | 0xFFFF to prevent optimizing out by Yul | value_to_put_into_r5 | value_to_put_into_r6 | Same as for EVM call | call | static | delegate (the call type is preserved) to call system contracts, like MSG_VALUE_SIMULATOR. We may need 4 different formal definitions for cases when we would want to have integer/ptr in r3 and r4
Same as one above, but takes ABI data from ACTIVE_PTR | -| set_context_u128 | 0 | 0xFFF3 | value | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | - | call | | | -| set_pubdata_price | in1 | 0xFFF2 | 0 | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | - | call | context.set_ergs_per_pubdata in1 in assembly | | -| increment_tx_counter | 0 | 0xFFF1 | 0 | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | - | call | context.inc_tx_num in assembly | | -| ptr_calldata | 0 | 0xFFF0 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | one passed in r1 on far_call to the callee (save in very first instructions on entry) | Loads as INTEGER! | -| call_flags | 0 | 0xFFEF | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | one passed in r2 on far_call to the callee (save in very first instructions on entry) | | -| ptr_return_data | 0 | 0xFFEE | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | one passed in r1 on return from far_call back to the caller (save in very first instruction in the corresponding branch!) | Loads as INTEGER! | -| event_initialize | in1 | 0xFFED | - | in2 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | call | | | -| event_write | in1 | 0xFFEC | - | in2 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | call | | | -| load_calldata_into_active_ptr | 0 | 0xFFEB | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | loads value of @calldataptr (from r1 at the entry point of the contract into virtual ACTIVE_PTR)ACTIVE_PTR | | -| load_returndata_into_active_ptr | 0 | 0xFFEA | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | loads value of the latest @returndataptr (from the r1 at the point of return from the child into virtual ACTIVE_PTR) | | -| load_decommit_into_active_ptr | 0 | 0xFFDC | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | loads value of the @ptr_decommit into virtual ACTIVE_PTR | | -| ptr_add_into_active | in1 | 0xFFE9 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | performs ptr.add ACTIVE_PTR, in1, ACTIVE_PTR | | -| ptr_shrink_into_active | in1 | 0xFFE8 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | performs ptr.shrink ACTIVE_PTR, in1, ACTIVE_PTR | | -| ptr_pack_into_active | in1 | 0xFFE7 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | performs ptr.pack ACTIVE_PTR, in1, ACTIVE_PTR | | -| multiplication_high | in1 | 0xFFE6 | - | in2 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | Returns the higher register (the overflown part) | staticcall | | | -| extra_abi_data | in1 | 0xFFE5 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | ones passed in r3-r12 on far_call to the callee (saved in the very first instructions in the entry) | | -| ptr_data_load | offset | 0xFFE4 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | | | -| ptr_data_copy | destination | 0xFFE3 | - | source | size | 0 | 0 | | staticcall | | | -| ptr_data_size | 0 | 0xFFE2 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | | | -| active_ptr_swap | index_1 | 0xFFD9 | - | index_2 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | swaps active pointers | | -| const_array_declare | index(constant) | 0xFFE1 | - | size(constant) | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | | | -| const_array_set | index(constant) | 0xFFE0 | - | offset(constant) | 0xFFFF to prevent optimizing out by Yul | value(constant) | 0 | | staticcall | | | -| const_array_finalize | index(constant) | 0xFFDF | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | | | -| const_array_get | index(constant) | 0xFFDE | - | offset | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | | | -| return_forward | 0 | 0xFFDB | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | generates a return forwarding the active pointer | | -| revert_forward | 0 | 0xFFDA | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | generates a revert forwarding the active pointer | | - -### Requirements for calling system contracts - -By default, all system contracts up to the address `0xFFFF` require that the call was done via system call (i.e. `call_flags&2 != 0` . - -**Exceptions:** - -- BOOTLOADER_FORMAL address as the users need to be able to send money there. - -**Meaning of ABI params:** - -- MSG_VALUE_SIMULATOR: `extra_abi_data_1 = value || whether_the_call_is_system`, where || denotes the concatenation, - value should occupy first 128 bits, while `whether_the_call_is_system` is a 1-bit flag that denotes whether the call should be a system call. - `extra_abi_data_2` is the address of the callee. -- No meaning for the rest diff --git a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/20.verbatim.md b/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/20.verbatim.md deleted file mode 100644 index fdd0c1e6..00000000 --- a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/20.verbatim.md +++ /dev/null @@ -1,87 +0,0 @@ ---- -title: ZKsync Era Extension Simulation (verbatim) -description: ---- - -::callout{icon="i-heroicons-light-bulb" color="blue"} -NOTES: - -- changed META - it can be used for MSIZE simulation -- setting ergs per pubdata is done by separate opcode now (not part of `near_call`) -- incrementing TX counter is done by separate opcode now (not part of `far_call`) -:: - -Our VM has some opcodes that are not expressible in Solidity, -but we can simulate them on compiler level by abusing “CALL” instruction. -We use 2nd parameter of “CALL” (address) as a marker, and remaining 6 parameters as input parameters -(we use “address”-like field since it’s kind of shorter type, if assembly block cares about types in Solidity). -Unfortunately “CALL” returns only 1 stack parameter, but it looks sufficient for our purposes. - -Please note, that some of the methods don’t modify state, -so STATICCALL instead of CALL should be used for them. -The type of the needed method is indicated in the rightmost column. - -Call types are not validated and do not affect the simulation behavior, -unless specified otherwise, like in `raw_far_call` and `system_call` simulations, -where the call type is passed through. - -For some simulations below we assume that there exist a hidden global pseudo-variable called `ACTIVE_PTR` for manipulations, -since one can not easily load pointer value into Solidity’s variable. - -| Simulated opcode | CALL param 0 (gas) | CALL param 1 (address) | CALL param 2 (value) | CALL param 3 (input offset) | CALL param 4 (input length) | CALL param 5 (output offset) | CALL param 6 (output length) | Return value | call type | LLVM implementation | Motivation | -| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | -| to_l1(is_first, in1, in2) | if_first (bool) | 0xFFFF | in1 (u256) | in2 (u256) | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | _ | call | @llvm.syncvm.tol1(i256 %in1, i256 %in2, i256 %is_first) | Send messages to L1 | -| code_source | 0 | 0xFFFE | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | address | staticcall | @llvm.syncvm.context(i256 %param) ; param == 2 (see SyncVM.h) | Largely to be able to catch “delegatecalls” in system contracts (by comparing this == code_source) | -| precompile(in1, ergs_to_burn, out0) | in1 (u256) | 0xFFFD | - | ergs_to_burn (u32) | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | out0 | staticcall | @llvm.syncvm.precompile(i256 %in1, i256 %ergs) | way to trigger call to precompile in VM | -| decommit(versioned_hash, ergs_to_burn, out0) | versioned_hash (u256) | 0xFFDD | - | ergs_to_burn (u32) | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | out0 | staticcall | saves the result pointer to @ptr_decommit | way to trigger decommit in VM | -| meta | 0 | 0xFFFC | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | [u256 tight packing](https://github.com/matter-labs/EraVM_opcode_defs/blob/b000abebc27f88919e0087b7604b8c71ba5b3daf/src/definitions/abi/meta.rs#L6) | staticcall | @llvm.syncvm.context(i256 %param) ; param == 3 (see SyncVM.h) | way to trigger call to meta information about some small pieces of the state in VM | -| mimic_call(to, abi_data, implicit r5 = who to mimic) | who_to_call | 0xFFFB | 0 | abi_data | who_to_mimic | 0 | 0 | WILL mess up the registers and WILL use r1-r4 for our standard ABI convention and r5 for the extra who_to_mimic argument | any in the code; mimic call in the bytecode | Runtime {i256, i1} __mimiccall(i256, i256, i256, {i256, i1}) | | -| system_mimic_call(to, abi_data, implicit r3, r4, r5 = who to mimic) | who_to_call | 0xFFFA | 0 | abi_data | who_to_mimic | value_to_put_into_r3 | value_to_put_into_r4 | WILL mess up the registers and WILL use r1-r4 for our standard ABI convention and r5 for the extra who_to_mimic argument | any in the code; mimic call in the bytecode | Runtime {i256, i1} __mimiccall(i256, i256, i256, {i256, i1}) | | -| mimic_call_byref(to, ACTIVE_PTR, implicit r5 = who to mimic) | who_to_call | 0xFFF9 | 0 | 0 | who_to_mimic | 0 | 0 | WILL mess up the registers and WILL use r1-r4 for our standard ABI convention and r5 for the extra who_to_mimic argument | any in the code; mimic call in the bytecode | Runtime {i256, i1} __mimiccall(i8 addrspace(3), i256, i256, {i256, i1}) | Same as one above, but takes ABI data from ACTIVE_PTR | -| system_mimic_call_byref(to, ACTIVE_PTR, implicit r3, r4, r5 = who to mimic) | who_to_call | 0xFFF8 | 0 | 0 | who_to_mimic | value_to_put_into_r3 | value_to_put_into_r4 | WILL mess up the registers and WILL use r1-r4 for our standard ABI convention and r5 for the extra who_to_mimic argument | any in the code; mimic call in the bytecode | Runtime {i256, i1} __mimiccall(i8 addrspace(3), i256, i256, {i256, i1}) | Same as one above, but takes ABI data from ACTIVE_PTR | -| raw_far_call | who_to_call | 0xFFF7 | 0 | 0 | abi_data (CAN be with “to system = true”) | output_offset | output_length | Same as for EVM call | call | static | delegate (the call type is preserved)
It’s very similar to “system_call” described below, but for the cases when we only need to have to_system = true set in ABI (responsibility of the user, NOT the compiler), but we do not actually need to pass anything through r3 and r4 (so we can save on setting them or zeroing them, whatever) | -| raw_far_call_byref | who_to_call | 0xFFF6 | 0 | 0 | 0xFFFF to prevent optimizing out by Yul | output_offset | output_length | Same as for EVM call | call | static | delegate (the call type is preserved)
Same as one above, but takes ABI data from ACTIVE_PTR | -| system_call | who_to_call | 0xFFF5 | value_to_put_into_r3 (only for call with 7 arguments) | value_to_put_into_r4 | abi_data (MUST have “to system” set) | value_to_put_into_r5 | value_to_put_into_r6 | Same as for EVM call | call | static | delegate (the call type is preserved)
to call system contracts, like MSG_VALUE_SIMULATOR. We may need 4 different formal definitions for cases when we would want to have integer/ptr in r3 and r4 | -| system_call_byref | who_to_call | 0xFFF4 | value_to_put_into_r3 (only for call with 7 arguments) | value_to_put_into_r4 | 0xFFFF to prevent optimizing out by Yul | value_to_put_into_r5 | value_to_put_into_r6 | Same as for EVM call | call | static | delegate (the call type is preserved)
to call system contracts, like MSG_VALUE_SIMULATOR. We may need 4 different formal definitions for cases when we would want to have integer/ptr in r3 and r4
Same as one above, but takes ABI data from ACTIVE_PTR | -| set_context_u128 | 0 | 0xFFF3 | value | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | - | call | | | -| set_pubdata_price | in1 | 0xFFF2 | 0 | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | - | call | context.set_ergs_per_pubdata in1 in assembly | | -| increment_tx_counter | 0 | 0xFFF1 | 0 | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | - | call | context.inc_tx_num in assembly | | -| ptr_calldata | 0 | 0xFFF0 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | one passed in r1 on far_call to the callee (save in very first instructions on entry) | Loads as INTEGER! | -| call_flags | 0 | 0xFFEF | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | one passed in r2 on far_call to the callee (save in very first instructions on entry) | | -| ptr_return_data | 0 | 0xFFEE | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | one passed in r1 on return from far_call back to the caller (save in very first instruction in the corresponding branch!) | Loads as INTEGER! | -| event_initialize | in1 | 0xFFED | - | in2 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | call | | | -| event_write | in1 | 0xFFEC | - | in2 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | call | | | -| load_calldata_into_active_ptr | 0 | 0xFFEB | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | loads value of @calldataptr (from r1 at the entry point of the contract into virtual ACTIVE_PTR)ACTIVE_PTR | | -| load_returndata_into_active_ptr | 0 | 0xFFEA | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | loads value of the latest @returndataptr (from the r1 at the point of return from the child into virtual ACTIVE_PTR) | | -| load_decommit_into_active_ptr | 0 | 0xFFDC | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | loads value of the @ptr_decommit into virtual ACTIVE_PTR | | -| ptr_add_into_active | in1 | 0xFFE9 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | performs ptr.add ACTIVE_PTR, in1, ACTIVE_PTR | | -| ptr_shrink_into_active | in1 | 0xFFE8 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | performs ptr.shrink ACTIVE_PTR, in1, ACTIVE_PTR | | -| ptr_pack_into_active | in1 | 0xFFE7 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | performs ptr.pack ACTIVE_PTR, in1, ACTIVE_PTR | | -| multiplication_high | in1 | 0xFFE6 | - | in2 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | Returns the higher register (the overflown part) | staticcall | | | -| extra_abi_data | in1 | 0xFFE5 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | ones passed in r3-r12 on far_call to the callee (saved in the very first instructions in the entry) | | -| ptr_data_load | offset | 0xFFE4 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | | | -| ptr_data_copy | destination | 0xFFE3 | - | source | size | 0 | 0 | | staticcall | | | -| ptr_data_size | 0 | 0xFFE2 | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | | | -| active_ptr_swap | index_1 | 0xFFD9 | - | index_2 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | swaps active pointers | | -| const_array_declare | index(constant) | 0xFFE1 | - | size(constant) | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | | | -| const_array_set | index(constant) | 0xFFE0 | - | offset(constant) | 0xFFFF to prevent optimizing out by Yul | value(constant) | 0 | | staticcall | | | -| const_array_finalize | index(constant) | 0xFFDF | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | | | -| const_array_get | index(constant) | 0xFFDE | - | offset | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | | | -| return_forward | 0 | 0xFFDB | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | generates a return forwarding the active pointer | | -| revert_forward | 0 | 0xFFDA | - | 0 | 0xFFFF to prevent optimizing out by Yul | 0 | 0 | | staticcall | generates a revert forwarding the active pointer | | - -### Requirements for calling system contracts - -By default, all system contracts up to the address `0xFFFF` require that the call was done via system call (i.e. `call_flags&2 != 0` . - -**Exceptions:** - -- BOOTLOADER_FORMAL address as the users need to be able to send money there. - -**Meaning of ABI params:** - -- MSG_VALUE_SIMULATOR: `extra_abi_data_1 = value || whether_the_call_is_system`, - where || denotes the concatenation, value should occupy first 128 bits, while `whether_the_call_is_system` is a 1-bit flag - that denotes whether the call should be a system call. - `extra_abi_data_2` is the address of the callee. -- No meaning for the rest. diff --git a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/_dir.yml b/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/_dir.yml deleted file mode 100644 index c8733bca..00000000 --- a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/21.extensions/_dir.yml +++ /dev/null @@ -1 +0,0 @@ -title: Extensions diff --git a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/40.yul.md b/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/40.yul.md index 3d9f2940..5491f530 100644 --- a/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/40.yul.md +++ b/content/20.zksync-protocol/50.compiler/20.specification/60.instructions/40.yul.md @@ -98,7 +98,7 @@ Is a Yul optimizer hint which is not used by our compiler. Instead, its only arg Original [Yul](https://docs.soliditylang.org/en/latest/yul.html#verbatim) auxiliary instruction. Unlike on EVM, on ZKsync VM target this instruction has nothing to do with inserting of EVM bytecode. Instead, it is used to implement -[ZKsync VM Yul Extensions](/zksync-protocol/compiler/specification/instructions#yul-extensions) available in the system mode. -In order to compile a Yul contract with extensions, both Yul and system mode must be enabled (`zksolc --yul --system-mode ...`). +[ZKsync EraVM Yul Extensions](https://matter-labs.github.io/era-compiler-solidity/latest/06-eravm-extensions.html). +In order to compile a Yul contract with extensions, both Yul mode and EraVM extensions must be enabled (`zksolc --yul --enable-eravm-extensions ...`). [The LLVM IR generator code](%%zk_git_repo_era-compiler-solidity%%/blob/main/src/yul/parser/statement/expression/function_call/verbatim.rs).