cnidarium: prefix queries over substores are hazardous #4653

Merged
merged 7 commits into from
Jun 24, 2024

erwanor
Member

@erwanor erwanor commented Jun 24, 2024

Describe your changes

This PR contains a minimal reproduction for a bug in cnidarium's prefix query handling. It also contains a sketch for a fix that we can workshop. The bug was introduced in the original substore implementation PR (#3131).

Minimal reproduction of the prefix range cache bug.

Context
cnidarium, our storage layer, supports prefix storage.
This allows users to configure independent storage units, each with
their own merkle tree, nonverifiable sidecar, and separate namespace.
Routing is done transparently without the user having to worry about
the details.

Overview
Prefix queries return (key, value) tuples, but instead of
returning the full key, they return the substore key. This is a layering
violation, and it causes a bug in the cache interleaving logic.

Terminology

  • a full_key: a key that contains a substore prefix, a delimiter, and a substore key.
  • a substore_key: a key with a stripped prefix.
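
To make the two terms concrete, here is a minimal sketch of how a full key could be routed to a substore and stripped of its prefix. The `/` delimiter and the `route` and `to_full_key` helpers are assumptions made for illustration; they are not cnidarium's actual API.

```rust
/// Hypothetical delimiter between a substore prefix and a substore key.
const DELIMITER: char = '/';

/// Splits a full key into `(substore_prefix, substore_key)`.
/// Keys that match no configured prefix go to a default store,
/// represented here by an empty prefix.
fn route<'a>(full_key: &'a str, prefixes: &[&'a str]) -> (&'a str, &'a str) {
    for &prefix in prefixes {
        if let Some(rest) = full_key.strip_prefix(prefix) {
            // Only match when the prefix is followed by the delimiter,
            // so that "ibcfoo/x" is not routed to the "ibc" substore.
            if let Some(substore_key) = rest.strip_prefix(DELIMITER) {
                return (prefix, substore_key);
            }
        }
    }
    ("", full_key)
}

/// Rebuilds the full key from a substore prefix and a substore key.
fn to_full_key(prefix: &str, substore_key: &str) -> String {
    if prefix.is_empty() {
        substore_key.to_string()
    } else {
        format!("{}{}{}", prefix, DELIMITER, substore_key)
    }
}
```

Under these assumptions, `route("ibc/commitments/1", &["ibc"])` yields the prefix `ibc` and the substore key `commitments/1`, and `to_full_key` reverses the split.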

Walkthrough
StateDelta indexes changes using full keys since, by design, it is not aware of
the particular substore configuration that it is working against.
As part of the cache interleaving logic, the StateDelta will try to look for
new writes or covering deletions. However, since the base prefix implementation
returns substore keys, the cache will build an incoherent range and panic (or miss data).
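
The failure mode can be sketched with a plain `BTreeMap` standing in for the cache. This is a simplified model, not the real `StateDelta`: the `Cache` type and the `changes_between` helper are hypothetical. It relies on the documented behavior that `BTreeMap::range` panics when the range start is greater than the range end, so the helper checks the bounds up front instead of panicking.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Cached changes indexed by full key: `Some(value)` is a write, `None` a deletion.
type Cache = BTreeMap<String, Option<Vec<u8>>>;

/// Returns the cached keys strictly between `lower` and `upper`, or `None`
/// when the bounds do not form a valid range. `BTreeMap::range` would panic
/// on such bounds, so they are rejected before the lookup.
fn changes_between(cache: &Cache, lower: &str, upper: &str) -> Option<Vec<String>> {
    if lower >= upper {
        return None;
    }
    let bounds = (Bound::Excluded(lower), Bound::Excluded(upper));
    Some(cache.range::<str, _>(bounds).map(|(k, _)| k.clone()).collect())
}
```

With a cache holding `ibc/b`, searching between the full keys `ibc/a` and `ibc/c` finds it. Mixing namespaces, for instance the full key `ibc/a` as lower bound and the substore key `c` as upper bound, gives a start that sorts after the end, which is where an unchecked lookup would panic. When the mixed bounds happen to be ordered, the range is valid but covers the wrong part of the keyspace.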

Checklist before requesting a review

  • If this code contains consensus-breaking changes, I have added the "consensus-breaking" label. Otherwise, I declare my belief that there are not consensus-breaking changes, for the following reason:

    Consensus breaking in the sense that the chain won't halt if we hit this.

The bug is caused by a layering violation: keys returned by prefix queries have their *substore prefix* truncated. However, `StateDelta`s are unaware of this implementation detail and maintain a global namespace for all changes.

This creates an issue in the cache interleaving logic: the search range that we construct to look for new writes/covering deletions between keys can be nonsensical, for example using a full key (incl. substore prefix) as a lower bound and a substore key (with a truncated prefix) as an upper bound.

The effect of this bug ranges from a panic to skipping valid entries that should be returned by the prefix query.
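
One possible direction for a fix, sketched under assumptions rather than taken from this PR's diff, is to re-attach the substore prefix to every key before it leaves the storage layer, so that the cache only ever compares full keys. The `with_full_keys` helper and the `/` delimiter below are hypothetical.

```rust
/// Re-attaches `prefix` to every key yielded by a substore-level prefix
/// query, so that callers only ever see full keys. An empty prefix stands
/// for a default store whose keys are already full keys.
fn with_full_keys<'a, I>(
    prefix: &'a str,
    entries: I,
) -> impl Iterator<Item = (String, Vec<u8>)> + 'a
where
    I: Iterator<Item = (String, Vec<u8>)> + 'a,
{
    entries.map(move |(substore_key, value)| {
        let full_key = if prefix.is_empty() {
            substore_key
        } else {
            // Hypothetical `/` delimiter between prefix and substore key.
            format!("{}/{}", prefix, substore_key)
        };
        (full_key, value)
    })
}
```

Wrapping the base iterator this way keeps the substore layout an implementation detail of the storage layer, which is the layering the description above argues for.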
@erwanor erwanor added A-node Area: System design and implementation for node software C-bug Category: a bug labels Jun 24, 2024
@erwanor erwanor changed the title from "cnidarium: prefix queries over substores are broken" to "cnidarium: prefix queries over substores are hazardous" Jun 24, 2024
@avahowell
Contributor

This should also change the migration in 78 for deleting empty ibc commitments right?

@erwanor
Member Author

erwanor commented Jun 24, 2024

That's right, we should update it once this gets merged

@erwanor erwanor marked this pull request as ready for review June 24, 2024 19:05
@aubrika aubrika requested a review from cratelyn June 24, 2024 20:10
@aubrika
Contributor

aubrika commented Jun 24, 2024

> This should also change the migration in 78 for deleting empty ibc commitments right?

@cratelyn @erwanor let's include the migration fix in this PR

@cratelyn cratelyn added the consensus-breaking breaking change to execution of on-chain data label Jun 24, 2024
@cratelyn
Contributor

added the consensus-breaking label, per the description.

Contributor

@cratelyn cratelyn left a comment

✔️ this looks good. approved, pending #4653 (comment) being addressed

@erwanor erwanor merged commit 67511d8 into main Jun 24, 2024
13 checks passed
@erwanor erwanor deleted the erwan/cnidarium_prefix_fix branch June 24, 2024 21:56
@erwanor erwanor self-assigned this Jun 24, 2024
Labels
A-node Area: System design and implementation for node software C-bug Category: a bug consensus-breaking breaking change to execution of on-chain data
Projects
Status: Done

4 participants