Iterators and prefix deletions #159

ilblackdragon · 2021-02-19T06:53:35Z

ilblackdragon
Feb 19, 2021
Maintainer

We had iterators in the runtime before, but they were deprecated and removed.

The main reasoning at the time was that there is no way to create a challenge state data structure.
I'm honestly still unclear on why that is so, and would like to get this reasoning recorded here for future reference.

Let's say the items in the collection are ["xxx1", "xxx2", "xyy1", "xz"] and there is a prefix iteration over "x" items that gets stopped after 2 elements. My understanding is that the concern is that one can pretend that the collection has only the ["xxx1", "xxx2"] elements which the program did iterate over.

But generating a stateless proof for that will not work, as the merkle path would not be the same.

Separately, I think prefix deletion is a pretty useful primitive, from implementing migrations (removing an old collection) to managing more advanced collections (like what the EVM contract is doing). Prefix deletion doesn't require anything specific on the stateless challenge side, as it just requires the path from the root to the common prefix and replacement of the item in the branch.

ilblackdragon · 2021-02-19T13:46:35Z

ilblackdragon
Feb 19, 2021
Maintainer Author

Just as a suggestion, here is how to implement a collection where a subset can be deleted without adding too much overhead:

struct XYZ {
  data: LookupMap<X, Y>,        // your normal data
  links: LookupMap<X, (X, X)>,  // key -> (previous key, next key); null marks either end
  last_link: X,                 // most recently inserted key, null when empty
}

fn insert(x, y) {
  // assumes x is not already present
  links.insert(x, (last_link, null))
  if (last_link != null) links.insert(last_link, (links.get(last_link).0, x))
  data.insert(x, y)
  last_link = x
}

fn delete(x) {
  let link = links.remove(x);
  if (link.0 != null) links.insert(link.0, (links.get(link.0).0, link.1));
  if (link.1 != null) links.insert(link.1, (link.0, links.get(link.1).1));
  if (last_link == x) last_link = link.0;  // x was the tail
  data.remove(x)
}

fn delete_all() {
  let mut link = last_link;
  while link != null {
    data.remove(link);
    link = links.remove(link).0;
  }
  last_link = null;
}
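
A runnable take on the pseudocode above, as a minimal sketch only. It assumes std's `HashMap` in place of `LookupMap` and `Option` in place of `null`; the name `LinkedMap` and its methods are made up for illustration and are not an existing near-sdk-rs API.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Map whose keys are threaded on a doubly linked list, so every entry
/// can be reached and removed without a storage iterator.
/// HashMap stands in for LookupMap (assumption for illustration).
pub struct LinkedMap<K, V> {
    data: HashMap<K, V>,
    /// key -> (previous key, next key)
    links: HashMap<K, (Option<K>, Option<K>)>,
    /// Most recently inserted key, None when the map is empty.
    last: Option<K>,
}

impl<K: Clone + Eq + Hash, V> LinkedMap<K, V> {
    pub fn new() -> Self {
        Self { data: HashMap::new(), links: HashMap::new(), last: None }
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }

    pub fn get(&self, k: &K) -> Option<&V> {
        self.data.get(k)
    }

    pub fn insert(&mut self, k: K, v: V) {
        // Overwriting an existing key keeps its position in the list.
        if self.data.insert(k.clone(), v).is_some() {
            return;
        }
        if let Some(prev) = &self.last {
            // Point the old tail at the new entry.
            if let Some(link) = self.links.get_mut(prev) {
                link.1 = Some(k.clone());
            }
        }
        self.links.insert(k.clone(), (self.last.clone(), None));
        self.last = Some(k);
    }

    pub fn remove(&mut self, k: &K) -> Option<V> {
        let (prev, next) = self.links.remove(k)?;
        if let Some(p) = &prev {
            if let Some(link) = self.links.get_mut(p) {
                link.1 = next.clone();
            }
        }
        match &next {
            Some(n) => {
                if let Some(link) = self.links.get_mut(n) {
                    link.0 = prev.clone();
                }
            }
            // The removed entry was the tail.
            None => self.last = prev.clone(),
        }
        self.data.remove(k)
    }

    /// Walks the list from the tail, removing one entry per step.
    pub fn clear(&mut self) {
        let mut cur = self.last.take();
        while let Some(k) = cur {
            self.data.remove(&k);
            cur = self.links.remove(&k).and_then(|(prev, _)| prev);
        }
    }
}
```

In this sketch a fresh insert touches three entries (the new link, the old tail's link, and the data), which is the overhead discussed further down the thread.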

1 reply

bowenwang1996 Feb 19, 2021
Maintainer

It feels to me that you are reinventing a linked hashmap :) Also, can we consolidate this with #160?

MaksymZavershynskyi · 2021-02-19T19:58:34Z

MaksymZavershynskyi
Feb 19, 2021

Let me address the challenges/state witnesses down below. First, I would like to emphasize that keeping a simple state interface is potentially an even more important argument than the argument about challenges/state witness.

Minimal State Interface

I would classify all key-value interfaces into two kinds: minimal and extended. Minimal key-value interfaces only allow random key-value lookup/read/write/deletion. Extended key-value interfaces additionally have features like iterators, range queries, and prefix-based operations. With blockchain, once we allow contracts that use features of extended key-value interfaces, we cannot ever remove these features, because contracts are the only permanent non-upgradable thing on the blockchain (with a few very specific exceptions, like staking pools). We would like to reserve the opportunity to change our state design in the future, and potentially move from MPT to AVL trees, Urkel, or the variations of binary trees that @abacabadabacaba has in mind. This will allow us to greatly increase TPS on a single shard, improving our core value proposition.

Unfortunately, an extended key-value interface can prevent us from iterating on the state design:

The cost of extended key-value operations can differ drastically depending on the state implementation. Since extended key-value operations are used less often than minimal operations, it would make sense to eventually pick a state design that optimizes for minimal operations at the expense of extended ones. Currently, we try to keep our fee estimation as close to the real cost as possible, which means that if we later try switching to a state design that makes extended key-value operations more expensive, we won't be able to do it. (We have very little room to maneuver when it comes to increasing fees without breaking existing contracts. Specifically, right now we allow 500 Tgas for receipt execution in a block and limit contract execution to 200 Tgas. This means we can increase our fees at most 2.5 times in total, ever, since 500 / 200 = 2.5. We would do it by shrinking the block until it allows only 200 Tgas for receipt execution. We cannot shrink it further, or else the contracts that currently attach 200 Tgas to cross-contract calls will start failing.)
Some state designs might not support iterators, range queries, or prefix-based operations, because they might arrange key-value entries differently. For example, in MPT the same-prefix keys belong to the same subtree, and therefore it is easy to do prefix-based deletion at low CPU cost. That is not the case for an AVL tree, which needs rebalancing. We might even find a state implementation that completely prohibits iterators for the sake of extremely efficient random reads/writes.
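
The 2.5x bound in the first point is just the ratio of the two limits quoted there. A tiny sketch, with the function name made up for illustration and the two numbers taken from the comment above:

```rust
/// Largest factor by which fees can grow before a call that attaches
/// `max_attached_tgas` no longer fits into the per-block receipt budget.
/// Hypothetical helper, not part of any NEAR crate.
pub fn max_fee_multiplier(block_receipt_tgas: u64, max_attached_tgas: u64) -> f64 {
    block_receipt_tgas as f64 / max_attached_tgas as f64
}
```

With the quoted limits, `max_fee_multiplier(500, 200)` evaluates to 2.5.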

Besides performance, there are other reasons to switch from MPT, e.g. we don't want shard users to be able to manipulate the MPT depth and cause extreme fee costs for some contracts (e.g. a competitor exchange). An AVL tree, for example, would prevent such abuse, as suggested by @abacabadabacaba. While having extended key-value interfaces is undoubtedly convenient for contracts, I don't think this convenience justifies missing a potential opportunity to drastically increase our TPS in the future, or to have a non-abusable state structure.

Challenges/state witnesses

Suppose validator X receives a challenge from validator Y claiming that a chunk on shard A was produced incorrectly. Validator X wants to check the validity of the challenge without downloading the state of shard A, so it relies on the "state witness" (a subset of state) attached to the challenge to validate the claim of incorrectness. As validator X verifies the state, there are only two possibilities:

The challenge is valid and the chunk is invalid;
The challenge is invalid and the chunk is either valid or invalid.

Suppose the chunk contains a contract execution that iterates over the keys prefixed with x. The state witness in the challenge contains only the keys ["xxx1", "xxx2"]. When validator X executes the challenge using the provided state witness, it turns out that the contract does not stop iterating after two keys and attempts to iterate further. However, since there are no keys after xxx2 in the state witness, validator X assumes that this is where the contract should stop. After validator X executes the entire challenge, it notices that the state root it has produced does not match the state root in the challenge signed by the validator of shard A. This can mean either of two things:

The validator of shard A produced a valid chunk. There are actually more keys with the x prefix: ["xxx1", "xxx2", "xyy1", "xz"]. The valid contract execution should have continued iterating and not stopped at xxx2. Validator Y, who produced the challenge, did not include "xyy1", "xz" and included only "xxx1", "xxx2", resulting in an invalid challenge. The state roots do not match because the challenge is not valid. This challenge needs to be ignored or used to slash validator Y;
The validator of shard A produced an invalid chunk. The state root that they signed is not the actual state root that results from chunk execution. Validator Y correctly noticed it and constructed a challenge. In this challenge, the state witness only includes the keys "xxx1", "xxx2", because these are the only keys with prefix x that exist on the shard. Validator X executes the challenge and indeed verifies that the state root does not match. This challenge needs to be accepted and the chunk producer needs to be slashed.

Unfortunately, from the perspective of validator X the above two cases are indistinguishable, so it does not know whether to slash the chunk producer or to reject the challenge.

This, however, is not a fundamentally unsolvable problem. In the case of prefix iterators, we can require challenges to always include one more node, the one that comes right after the point where the iterator stops, which should be sufficient for the challenge to be self-verifiable. However, this adds complexity to the challenges, because it will require careful special-casing in the PartialStorage implementation, and any incorrectness will lead to the wrong user being slashed. Other extended operations, like prefix deletion, might require their own special-casing, which will increase the complexity further.
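
One way to read the "one more node" requirement is sketched below. This is an illustration under loud assumptions, not the PartialStorage logic: the witness is modelled as a plain list of keys in state order whose adjacency is taken as already proven by Merkle paths, the end-of-keyspace case is not modelled, and `Replay` and `replay_prefix_iter` are hypothetical names.

```rust
/// Outcome of replaying a prefix iteration against a state witness.
#[derive(Debug, PartialEq)]
pub enum Replay {
    /// The witness covered everything the contract read.
    Complete(Vec<String>),
    /// The contract asked for a key past the end of the witness, so the
    /// challenge itself is malformed.
    WitnessTooShort,
}

/// `witness` holds keys in state order, assumed to be proven adjacent by
/// their Merkle paths (not modelled here). `wanted` is how many keys the
/// contract tries to read before it stops on its own.
pub fn replay_prefix_iter(witness: &[&str], prefix: &str, wanted: usize) -> Replay {
    let mut seen = Vec::new();
    for key in witness {
        if seen.len() == wanted {
            // The contract stopped by itself.
            return Replay::Complete(seen);
        }
        if !key.starts_with(prefix) {
            // The boundary key proves that the prefix range has ended.
            return Replay::Complete(seen);
        }
        seen.push(key.to_string());
    }
    if seen.len() == wanted {
        Replay::Complete(seen)
    } else {
        // Ran out of witness without a boundary key: "no more keys" and
        // "keys withheld" cannot be told apart.
        Replay::WitnessTooShort
    }
}
```

With the witness ["xxx1", "xxx2"] and a contract that wants four keys, the replay reports `WitnessTooShort` instead of silently stopping, so validator X can reject the challenge rather than guess between the two cases.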

Proposed approach

It seems like the main motivation for prefix deletion is reducing the cost of Ethereum contract deletion in EVM so that it is as low as other NEAR EVM operations. However, I think that at the beginning, the primary metrics by which people will judge NEAR EVM performance will be contract deployments and standard contract calls, not contract deletions.

We can shelve this problem until we settle on a new state design 1+ year from now. For now we can emulate prefix deletion either the way near-sdk-rs emulates iterators or by using the linked hashmap that @ilblackdragon proposed, but inside Wasm. This will increase the cost of deleting Ethereum contracts in EVM, but it won't matter for most contracts. Once we settle on a new permanent state design, we can upgrade the Wasm part of NEAR EVM to use native prefix deletion instead of emulation (assuming the new state design allows a prefix-deletion primitive).

2 replies

ilblackdragon Feb 21, 2021
Maintainer Author

I would say main usage for prefix iteration / deletion is actually collection migrations when contracts are upgraded. We haven't met this because our upgrades so far have been minimal.

That said, in every case it can be worked around with some additional black magic.

For EVM case, I really like @abacabadabacaba's elegant idea below, which also can allow us to reduce gas usage at the moment and offload deletion to third party.

evgenykuzyakov Feb 22, 2021
Collaborator

I would say main usage for prefix iteration / deletion is actually collection migrations when contracts are upgraded. We haven't met this because our upgrades so far have been minimal.

Bulk migrations are likely impossible due to the potentially large size of the collection.

But, I actually did a migration on the berryclub contract. Moving from Vector to LookupMap: evgenykuzyakov/berryclub@d78491b#diff-66c8dc994f6c6f72d785e779f641970249c99b5d690ed6d3cba21f35637d8c9eR169

The way it works is that you first attempt to read from the new collection and then fall back to reading from the previous collection. If the read from the previous collection succeeds, then at write time you delete the entry there and write into the new collection.
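
The read-fallback pattern described above can be sketched as follows. This is not the berryclub code: std's `HashMap` stands in for both the old `Vector` and the new `LookupMap`, and the `Migrating` type and its methods are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Lazy migration from an old collection to a new one.
/// HashMap stands in for the on-chain collections (assumption).
pub struct Migrating {
    old: HashMap<u64, String>,
    new: HashMap<u64, String>,
}

impl Migrating {
    pub fn new(old: HashMap<u64, String>) -> Self {
        Self { old, new: HashMap::new() }
    }

    /// Read from the new collection first, then fall back to the old one.
    pub fn get(&self, key: u64) -> Option<&String> {
        self.new.get(&key).or_else(|| self.old.get(&key))
    }

    /// Writes always land in the new collection; any stale entry in the
    /// old collection is deleted at the same time.
    pub fn set(&mut self, key: u64, value: String) {
        self.old.remove(&key);
        self.new.insert(key, value);
    }

    /// Number of entries that have not been migrated yet.
    pub fn remaining_old(&self) -> usize {
        self.old.len()
    }
}
```

Each write migrates exactly one entry, so the cost is spread over normal usage and no single call has to walk the whole collection.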

abacabadabacaba · 2021-02-19T20:19:38Z

abacabadabacaba
Feb 19, 2021

I will post here a proposal that I made during the EVM work group, regarding the way we can implement contract deletion in EVM without relying on iterators. I am assuming that each EVM storage entry is mapped to a NEAR storage entry. In that entry's value, I propose to store a generation number in addition to the Ethereum value. This number is initially zero, but is incremented each time a contract is redeployed at the same address (which is possible by using the CREATE2 opcode). Only the entries with the current generation number are considered when accessing the storage. We may optionally add a function that anyone may call to purge the stale entries and reclaim the storage stake.
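
A minimal sketch of the generation-number idea, under stated assumptions: the current generation is kept in a per-address counter (the proposal does not say where it lives), std's `HashMap` stands in for NEAR storage, and `EvmStorage` with its methods is a hypothetical name, not the NEAR EVM implementation.

```rust
use std::collections::HashMap;

type Address = [u8; 20];
type Slot = [u8; 32];

/// EVM storage where every entry is tagged with the generation of the
/// contract that wrote it. All names here are illustrative.
#[derive(Default)]
pub struct EvmStorage {
    /// Current generation per address (assumed location of the counter).
    generation: HashMap<Address, u64>,
    /// (address, slot) -> (generation at write time, value)
    entries: HashMap<(Address, Slot), (u64, Vec<u8>)>,
}

impl EvmStorage {
    fn current(&self, addr: &Address) -> u64 {
        self.generation.get(addr).copied().unwrap_or(0)
    }

    /// A storage write is a single entry write.
    pub fn write(&mut self, addr: Address, slot: Slot, value: Vec<u8>) {
        let g = self.current(&addr);
        self.entries.insert((addr, slot), (g, value));
    }

    /// Entries from an older generation read as absent.
    pub fn read(&self, addr: Address, slot: Slot) -> Option<&[u8]> {
        let g = self.current(&addr);
        match self.entries.get(&(addr, slot)) {
            Some((eg, v)) if *eg == g => Some(v.as_slice()),
            _ => None,
        }
    }

    /// Redeploying at the same address invalidates all old entries at once.
    pub fn redeploy(&mut self, addr: Address) {
        let g = self.current(&addr);
        self.generation.insert(addr, g + 1);
    }

    /// Anyone may purge a stale entry; returns true if one was removed.
    pub fn purge(&mut self, addr: Address, slot: Slot) -> bool {
        let g = self.current(&addr);
        let stale = matches!(self.entries.get(&(addr, slot)), Some((eg, _)) if *eg != g);
        if stale {
            self.entries.remove(&(addr, slot));
        }
        stale
    }
}
```

The design choice is that invalidation costs one counter update regardless of how many slots the contract used, while the actual deletions are deferred to whoever calls `purge`.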

2 replies

MaksymZavershynskyi Feb 21, 2021

This adds an extra game-theoretic level to the EVM, with actors claiming storage. What is the reason for not modeling prefix deletions like we do in near-sdk-rs, or using a linked hashmap implemented in Wasm?

abacabadabacaba Feb 23, 2021

@nearmax Linked map requires three writes to add/delete an entry, while my solution requires only one. Also, the EVM contract may not be able to clear the entire storage when deleting a contract (it may require too much gas), so some form of lazy deletion is needed anyway.

ilblackdragon · 2021-02-22T06:06:07Z

ilblackdragon
Feb 22, 2021
Maintainer Author

Performance. Currently some things already take 100 Tgas (for reasons still unclear to me), so if we also hit a big deletion process it may not fit into a single action limit.


1 reply

evgenykuzyakov Feb 22, 2021
Collaborator

It's because of TreeMap usage. There is no need to use TreeMaps for anything in EVM.
UnorderedMap in Rust right now has 3 storage_writes per write and 2 storage_reads per read. This can be improved to 2 storage_writes and 2 storage_reads if we join the vectors of keys and values into one Vector<(key, value)>. That is still twice as expensive as relying on delete by prefix.
