Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ERC: Cache invalidation in ERC-5219 mode Web3 URL #652

Open
wants to merge 16 commits into
base: master
Choose a base branch
from
178 changes: 178 additions & 0 deletions ERCS/erc-7774.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,178 @@
---
eip: 7774
title: Cache invalidation in ERC-5219 mode Web3 URL
description: In ERC-5219 resolve mode, add mechanisms to alleviate limitations that prevent the use of RFC 9111 HTTP caching
nand2 marked this conversation as resolved.
Show resolved Hide resolved
author: Nicolas Deschildre (@nand2)
discussions-to: https://ethereum-magicians.org/t/erc-7774-cache-invalidation-in-erc-5219-mode-web3-url/21255
status: Draft
type: Standards Track
category: ERC
created: 2024-09-20
requires: 5219, 6944
---

## Abstract

In the context of the [ERC-6860](./eip-6860.md) `web3://` standard, this ERC extends the [ERC-6944](./eip-6944.md) resolve mode. It introduces mechanisms to address limitations that prevent the use of standard [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111) HTTP caching.

## Motivation

Calls to Ethereum RPC providers are costly—both CPU-wise for local nodes and monetarily for paid external RPC providers. Furthermore, external RPC providers are rate-limited, which can quickly cause disruptions when loading `web3://` URLs.

Therefore, it makes sense to implement caching mechanisms to reduce RPC calls when possible. Since `web3://` aims to be as close to HTTP as possible, leveraging standard [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111) HTTP caching is the natural choice. In the [ERC-6944](./eip-6944.md) resolve mode, smart contracts can already return standard HTTP caching headers like `Cache-Control` and `ETag`.

However, due to the [ERC-6944](./eip-6944.md) resolve mode not forwarding request HTTP headers to the smart contract, smart contracts cannot handle `If-None-Match` and `If-Modified-Since` cache validation headers. Consequently, they are limited to using the `Cache-control: max-age=XX` mechanism, which causes each cache validation request to trigger an RPC call, regenerating the full response.

This ERC proposes a solution to bypass this limitation by allowing websites to broadcast cache invalidations via smart contract events.

Additionally, even if smart contracts could read request HTTP headers, using smart contract events is more efficient, as it moves cache invalidation logic outside the contract.

## Specification

This standard introduces the `evm-events` cache directive for the `Cache-Control` header of request responses, as an extension directive as defined in section 5.2.3 of [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111).

When a [ERC-6944](./eip-6944.md) resolve mode website wants to use event-based caching for a request, it MUST:
nand2 marked this conversation as resolved.
Show resolved Hide resolved

- Include the `evm-events` directive in the `Cache-Control` header of the response.
- Include the `ETag` and/or `Cache-Control` headers in the response, as per traditional [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111) HTTP caching.
- Emit a cache invalidation event (as defined below) for the path in the smart contract when the output of the response changes and it deems cache clearing necessary.

A value to the `evm-events` cache directive is optional, and can be used to specify to listen for additional events on other smart contracts, and/or for other paths. The cache directive value syntax in ABNF notation is :

```
cache-directive-value = [ address-path-absolute *( " " address-path-absolute ) ]
address-path-absolute = [ address ] path-absolute [ "?" query ]
address = "0x" 20( HEXDIG HEXDIG )
path-absolute = <path-absolute, see RFC 3986, Section 3.3>
query = <query, see RFC 3986, Section 3.4>
```

**Examples**:

- `Cache-control: evm-events` : The cache of the page returning this directive will be cleared when the contract having responded to the request emits a cache clearing event for the path of the page having been served.
- `Cache-control: evm-events="/path/path2"` : Same behavior than the first example, but additionally the cache of the page returning this directive will be cleared when the contract having responded to the request emits a cache clearing event for path `/path/path2`.
- `Cache-control: evm-events="0xe4ba0e245436b737468c206ab5c8f4950597ab7f/path/path2"` : Same behavior than the first example, but additionally the cache of the page returning this directive will be cleared when the contract `0xe4ba0e245436b737468c206ab5c8f4950597ab7f` emits a cache clearing event for path `/path/path2`.
- `Cache-control: evm-events="0xe4ba0e245436b737468c206ab5c8f4950597ab7f"` : Same behavior than the first example, but additionally the cache of the page returning this directive will be cleared when the contract `0xe4ba0e245436b737468c206ab5c8f4950597ab7f` emits a cache clearing event for the path of the page having been served.
- `Cache-control: evm-events="/path/path2 /path/path3"` : Same behavior than the first example, but additionally the cache of the page returning this directive will be cleared when the contract having responded to the request emits a cache clearing event for path `/path/path2` or `/path/path3`.

### Cache invalidation event

The event is defined as:

```
event ClearPathCache(string[] paths);
```

This event clears the cache for an array of `paths`. Each `path` refers to the `pathQuery` part of the ABNF definition in [ERC-6860](./eip-6860.md).

- A `path` MUST NOT end with a `/`, except when the whole path is the root path, which is `/`.
- Two `paths` are considered identical if they have the same [ERC-5219](./eip-5219.md) resource entries and their parameter values match, regardless of the order.

**Example**:
- `/test?a=1&b=2` and `/test?b=2&a=1` are considered identical.

#### Wildcard usage

`paths` may contain `*` wildcards, with the following rules:

1. **Wildcards in Resource Entries**:
- A wildcard (`*`) can be used on its own in an [ERC-5219](./eip-5219.md) resource entry.
- A wildcard CANNOT be combined with other characters in the same entry; if it is, the `path` will be ignored.
- A wildcard requires at least one character to match.

**Examples**:
- `/*` will match `/test` but not `/test/abc` and not `/`.
- `/test/*` will match `/test/abc` but will not match `/test/` or `/test/abc/def`.
- `/*/abc` will match `/test/abc`, but not `//abc`.
- `/t*t` is invalid, so the `path` will be ignored.

2. **Wildcards in Parameter Values**:
- A wildcard can be used alone as a parameter value.
- A wildcard CANNOT be combined with other characters in the parameter value, or the `path` will be ignored.
- A wildcard in parameter values requires at least one character to match.

**Examples**:
- `/abc?a=*` will match `/abc?a=zz` but not `/abc?a=` or `/abc?a=zz&b=cc`.
- `/abc?a=*&b=*` will match `/abc?a=1&b=2` and `/abc?b=2&a=1`.
- `/abc?a=z*` is invalid, so the `path` will be ignored.

3. **Special Case: Global Wildcard**:
- A `path` containing only a `*` will match every path within the smart contract.

Wildcards are intentionally limited to these simple cases to facilitate efficient path lookup implementations.

### Caching behavior

#### Cache Invalidation States for `web3://` Clients

A `web3://` client can be in one of two cache invalidation states for each chain and smart contract:

1. **Listening for Events**:
The `web3://` client MUST listen for the cache invalidation events defined earlier and should aim to stay as close to real-time as possible.

2. **Not Listening for Events**:
This is the default state when this ERC is not implemented. In this state, the `web3://` client ignores all HTTP caching validation requests (e.g., `If-None-Match`, `If-Modified-Since` request headers).

The `web3://` client can switch between these states at any time and MAY implement heuristics to optimize the use of RPC providers by switching states as appropriate.

#### Cache Key-Value Mapping

The `web3://` client maintains a key-value mapping for caching, which MUST be cleared whenever it transitions from "Listening for Events" to "Not Listening for Events." The mapping is structured as follows:

```
mapping(
(<chain id>, <contract address>, <ERC-6860 pathQuery>)
=>
(<last modified date>, <ETag>)
)
```

Additional elements can be included in the mapping key when necessary. For example, [ERC-7618](./eip-7618.md) requires the inclusion of the `Accept-Encoding` request header in the mapping key.

#### Handling Requests in "Listening for Events" State

When a request is received in the "Listening for Events" state:

1. **If no mapping entry exists**:
- The `web3://` client queries the smart contract.
- If the response includes the `evm-events` directive in the `Cache-Control` header and an `ETag`, a mapping entry is created using the `ETag`.
- If the response contains the `evm-events` directive and a `max-age=XX` directive in the `Cache-Control` header, the mapping entry is created with the `last modified date`, determined in the following order of priority:
- The `Last-Modified` header, if present.
- The `Date` header, if present.
- Otherwise, the block date when the smart contract was queried.
- If the response includes both an `ETag` and a `Cache-Control: evm-events max-age=XX` directive, a single mapping entry is created containing both the `ETag` and the `last modified date`.

2. **If a mapping entry exists**:
- If the request contains a valid `If-None-Match` header:
- If the `ETag` in the mapping matches the `If-None-Match` value, the `web3://` client returns a `304 Not Modified` response immediately.
- If the `ETag` does not match, the client queries the smart contract, deletes the mapping entry, and processes the request as if no mapping entry existed.

- If the request contains a valid `If-Modified-Since` header:
- If the `last modified date` in the mapping is earlier than the `If-Modified-Since` date, the client returns a `304 Not Modified` response immediately.
- Otherwise, the client queries the smart contract, deletes the mapping entry, and processes the request as if no mapping entry existed.

- If the request contains neither `If-None-Match` nor `If-Modified-Since` headers (or they are invalid):
- The client queries the smart contract, deletes the mapping entry, and processes the request as if no mapping entry existed.



#### Cache Invalidation via Blockchain Events

In the "Listening for Events" state, the `web3://` client listens to the blockchain for the cache invalidation events defined in the previous section. For each path match, it deletes the corresponding mapping entry.


## Rationale

We add this feature to the [ERC-6944](./eip-6944.md) resolve mode because it can be added without changes the interface.
nand2 marked this conversation as resolved.
Show resolved Hide resolved

To stay as close as possible to standard HTTP, we reuse the HTTP caching mechanism headers.

The use of the `evm-events` directive is necessary to avoid a situation where a website uses traditional [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111) HTTP caching headers, but does not implement this ERC by failing to emit the events. In such cases, `web3://` clients implementing this ERC would serve stale content for that website indefinitely.
nand2 marked this conversation as resolved.
Show resolved Hide resolved

## Security Considerations

No security considerations were found.
Copy link
Collaborator

@SamWilsn SamWilsn Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are definitely some security considerations, no? At the very least contracts need to correctly emit cache invalidation events, or stale content could be served.

I bet there are some unpleasant interactions between this and chain reorgs. as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed!
I have added 3 :

  • Stale content will be served during the delay between a user transaction emitting a cache clearing event, and the web3:// client picking and processing the event.
  • For each cached page, websites must properly implement cache invalidation events; otherwise, stale content will be served indefinitely.
  • In the event of a chain reorganization, the web3:// client must roll back its caching state, or reverted content will be served until the next cache clearing event.

The first one is something that conceptually annoys me (even if the web3:// client is optimizing its logs reading by using streams, it can theorically be missed).
So I am thinking about ways for the website to signal to the web3:// client when fetching a page: "hey please ensure you are up to date with block XXX (because I just made a change on it via a TX)".
A HTTP header in the request is an idea, but may be not practical when doing a page change (e.g. in blog post edit page, save tx, then go to blog post page view). Thinking...


## Copyright

Copyright and related rights waived via [CC0](../LICENSE.md).
Loading