Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a minimal but extensible indexer #4573

Closed
wants to merge 18 commits into from
Closed

Add a minimal but extensible indexer #4573

wants to merge 18 commits into from

Conversation

hdevalence
Copy link
Member

@hdevalence hdevalence commented Jun 7, 2024

Describe your changes

Adds the skeleton of a minimal but extensible indexer. The indexer consumes events from a source (raw event) database and writes to a destination database with compiled / processed tables. The indexer is a library that other binaries import and slot their indexes into.

The cometindex library is intended to be generic over any cometbft events. The pindexer binary is for penumbra specifically and we can add all our own code to it. This structure mirrors the existing distinction between comet (generic) and the app. The code is intended to be shaped so that it's easy for users to plug in additional indexes. In particular, all of the "meat" of the binary is in the cometindex::Indexer type, which builds an indexer and handles all arg parsing. This way, the external interface of the program, how it's invoked, etc, is the same no matter what app views are present, and orchestration code like containers doesn't need to be concerned with them.

Before merging:

  • Rename the Index trait to AppView (it's a view of some application data)
  • Rename the create_tables method to init_chain
  • Add polling for ingesting / processing new events after the initial catch-up (500ms?)
  • Make sure the pindexer AppView traits are public so that pindexer can be pulled in as a library
  • Add an extension trait in the pindexer crate that adds an Indexer::with_default_penumbra_app_views() method that hooks in all the AppView impls, so we can add to it as we fill in impls

Followup:

  • Change init_chain to take a serde_json::Value and add plumbing to the Indexer builder to take a genesis.json file in a standardized way so that every instance of the binary doesn't need to implement that -- ideally this is the exact genesis.json that was used by comet, and the plumbing in the Indexer builder pulls out the app_state key and saves it to be able to pass it into each AppView. Longer term we could explore other options but for now I think it's simplest if we just make it a required parameter every time.

Closes #4604

Checklist before requesting a review

  • If this code contains consensus-breaking changes, I have added the "consensus-breaking" label. Otherwise, I declare my belief that there are not consensus-breaking changes, for the following reason:

    New tool outside of consensus

@hdevalence
Copy link
Member Author

Example:

cargo run --bin pindexer -- -s "postgresql://localhost:5432/testnet_raw?sslmode=disable" -d "postgresql://localhost:5432/testnet_compiled?sslmode=disable"

psql

testnet_compiled=# SELECT COUNT(*) FROM shielded_pool_fmd_clue_set;
 count
-------
 27198
(1 row)

testnet_compiled=# SELECT * FROM shielded_pool_fmd_clue_set LIMIT 5;
 id |                                                                 clue_bytes                                                                 |                              tx_hash
----+--------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------
  1 | \x10caf9cfe8554fcdbe15ed817bd24c2ac8cd5ee7aca4dfd0262149e5196e4a05dc64809bba86fc45fd4a43e3e19fb1a7055bdf24459762d44755a16976b36e0100000000 | \xcf47123f5180a041b7500871a1b7a7f21dd86cfe0d51e35d3aad7ea12f22a07c
  2 | \xa8384ea099f9b485955658faaa5be55070f55410d229f7511c94a1e0d778eb0be08a2d5cde1a1d472fc5ed9eb3d26f410ac0b0ef721f5f604b20f89d3cced10100000000 | \xcf47123f5180a041b7500871a1b7a7f21dd86cfe0d51e35d3aad7ea12f22a07c
  3 | \x6a123e795f69f6a1d9af4c9f1c9db560704570decb4b9897abd18097c016e001e1e95f0f50c597a11ca2b397c8a5e367744488c3d9a558b22c8eeb582e7c8a0100000000 | \x87adfcd7959c6409c863e05fa396230b27a28bc1ce9651a4b8f1cc064554a91b
  4 | \x5abd8f40e7bff719098cbb1d1f64893158e839fc531362de44877663494d3808df54394cf2d7be48c4a1d3b3e8695f7ad6dd461bfdf5beeaebee2e00ef23d00000000000 | \x87adfcd7959c6409c863e05fa396230b27a28bc1ce9651a4b8f1cc064554a91b
  5 | \xb67786a00010a6ddc3ccbd29fa293cc2d7b5810d28639a8f902a5ea8885835105cf5bcd1105d36c1b2ddbb3f4d78985c7c6b0ffb5fc4c7bf3074fd9e9a3a6c0200000000 | \x8aa895767b2121b60cfbea8566a98517435a33b2125d769f1a92e4982aac8f54
(5 rows)

@hdevalence
Copy link
Member Author

cc @plaidfinch @philipjames44

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/cometbft/cometbft/blob/main/state/indexer/sink/psql/schema.sql

it'd be helpful to add a link noting that this is not a scheme we've come up with ourselves, but is drawn from cometbft itself, here. i imagine when we upgrade to newer versions of cometbft, this might someday need to be updated.

Copy link
Contributor

@cratelyn cratelyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd like @plaidfinch to have a chance to sign off on this before i give it a formal approval, but this looks good to me. it looks like a great foundation for us to build additional indexing facilities with.

✔️ ✔️ ✔️

@cratelyn cratelyn added the A-indexing Area: Relates to event indexing. label Jun 10, 2024
@aubrika aubrika mentioned this pull request Jun 12, 2024
@cratelyn cratelyn self-assigned this Jun 13, 2024
@cratelyn cratelyn added this to the Sprint 8 milestone Jun 13, 2024
@cratelyn
Copy link
Contributor

i have implemented the changes outlined in the to-do list, and confirmed that polling works correctly by running this against a local devnet. pending CI, this should now be ready for a review!

@cratelyn
Copy link
Contributor

closing in favor of #4610 ➡️

@cratelyn
Copy link
Contributor

reset the cometindex branch on my way out, with #4610 now open 🫡

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-indexing Area: Relates to event indexing.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Deliver minimal indexer
3 participants