Goal: Validator Monitoring #443
Comments
Hey, @zolotokrylin, can you also verify and include this goal with the correct priority on the roadmap? Please share if you think we need more information to evaluate the business value for it correctly 🙏 |
@brennanjl - Can you confirm that CometBFT block data is available via the Indexer? |
I'm actually not sure what this means. It is mostly presumed that a node operator supports the network 100% of the time; if at any point 1/3 or more of the validating power is not running (CometBFT needs more than 2/3 of it to commit a block), the network will halt. Are you simply looking to track how long a certain validator has been a validator?
The full block data is not (this can be read from a node directly), but indexed block metadata such as proposer can be queried from the indexer. |
If I understand correctly, the scenario is that there will soon be 12 node operators running TSN. Even if all of them are registered as validators, 2 of them (less than 1/3) could stay disconnected for days in a month, or be inconsistent, so we should have an easy way to track that. |
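To make the threshold in this scenario concrete, here is a minimal sketch of the quorum rule discussed above, assuming CometBFT's requirement that strictly more than 2/3 of the total voting power signs a block. The function name `can_commit` and the equal-power setup are illustrative assumptions, not part of any existing tool.

```python
def can_commit(online_power: int, total_power: int) -> bool:
    """Return True if the online voting power is strictly more than 2/3 of the total.

    Integer arithmetic avoids floating-point rounding at the exact 2/3 boundary.
    """
    return 3 * online_power > 2 * total_power


# Hypothetical setup: 12 validators with equal voting power of 1 each.
TOTAL = 12
two_offline = can_commit(TOTAL - 2, TOTAL)   # 10 of 12 online
four_offline = can_commit(TOTAL - 4, TOTAL)  # 8 of 12 online, exactly 2/3
```

With 2 of 12 equal-power validators offline the network keeps producing blocks, which is why their absence can go unnoticed without explicit monitoring; with 4 offline, exactly 2/3 remains, which is not enough.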
@outerlook - basically clarified. We want to index the CometBFT blocks to determine which Validators are participating on each block.
In this case, we could create (or use an existing) CometBFT indexer - correct? |
Confirmed by @brennanjl: the indexer does not support this. How high a priority is this? @zolotokrylin to determine the priority, especially as it relates to #415. |
@rsoury, if you are not busy with anything else, please define the Spec document for this Goal. |
@zolotokrylin - The Spec for this has been merged with #415, and is already established. Observability, whether for internal or external analysis, is essentially a single Goal. |
@rsoury, could you please remove (or merge into the Specs doc if still relevant) everything from the description of this task and attach the relevant Spec file here? |
Yes, it's referenced under Validator Monitoring: https://docs.google.com/document/d/1-yxCyunqLhIHqLGJrIqScqRduo_lB3Ee6LGyeyY4B3A/edit#heading=h.bjrhx35jayz0 It's distinguished quite clearly. Wherever blockchain consensus data is a source for observability and reliability, the Spec covers this Validator Monitoring issue. |
@outerlook is this goal a duplicate of |
@markholdex no. This Goal is about understanding validators' performance. |
@zolotokrylin but in Reliability, there are problems and specs around the performance of validators and penalties for those that perform poorly. So it's confusing to me, or maybe I'm missing something. |
@markholdex, per #443 (comment), I see it was merged along the way. I previously saw #415 as an individual-level reliability issue (are our nodes operating well? are they contributing to the network?) and this goal as network-level monitoring (which nodes aren't contributing?). They are closely related and overlap in some ways. We can:
|
@markholdex, feel free to optimise the naming if you need it. |
@outerlook I believe that:
|
Partially. I initially thought of #415 as the simpler step: each node emits its own data about its contribution, which is already available. That goal would answer "how is my node contributing to the network?" and have alarms for it, since it's our own responsibility to maintain our nodes and to know when we messed up. That seems simpler (almost free) than the next question, "how are all nodes contributing to the network?", which is how I understood this goal (#443). It needs a little more setup because it requires indexer-like behavior: collecting blocks, getting the list of nodes that were supposed to contribute, and emitting a metric for each one showing whether it contributed. Maybe this will get easier or less relevant after #415.
Again, this is what I understood, and it made sense to me. However, I'm OK with evaluating the need again after #415. If it turns out to be really easy to assess other nodes' contribution within those tasks, I'll certainly do that to avoid more effort here.
Another point of view: would #443 already cover what we ask about validation in #415? Yes, but #443 is harder, and #415 only needs what is already exposed by the available CometBFT metrics. |
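The last step described in the comment above (compare the nodes that were supposed to contribute against those that did, and emit one value per node) could look roughly like the sketch below. It is only an illustration: `participation_metrics` is a hypothetical helper, the addresses are made up, and how the result is exported (Prometheus, logs, a database) is left open.

```python
def participation_metrics(expected_validators, signers):
    """Map each expected validator address to 1 if it signed the block, else 0.

    expected_validators: addresses in the validator set for the block's height.
    signers: addresses whose signature was included in the block's commit.
    """
    signers = set(signers)
    return {address: int(address in signers) for address in expected_validators}


# Made-up addresses for illustration only.
expected = ["VAL_A", "VAL_B", "VAL_C"]
signed = {"VAL_A", "VAL_C"}
metrics = participation_metrics(expected, signed)
missing = [address for address, value in metrics.items() if value == 0]
```

Keeping this as a pure function means the block-collection part can be swapped out (a node's RPC, an indexer, or a test fixture) without touching the comparison logic.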
Objective
Assess the participation of validators (node operators) in the network during normal operations.
Description
For example, a visual representation of this could be a graph showing how many blocks per day a given operator helped validate.
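As a rough sketch of the data behind such a graph, the snippet below counts signed blocks per operator per day. It assumes each block has already been reduced to a pair of its UTC timestamp (RFC 3339, as in CometBFT block headers) and the set of validator addresses that signed it; `blocks_signed_per_day` and the sample values are hypothetical.

```python
from collections import Counter, defaultdict


def blocks_signed_per_day(blocks):
    """Count, for each UTC day, how many blocks each validator signed.

    blocks: iterable of (block_time, signers), where block_time is an
    RFC 3339 UTC timestamp string and signers is a set of addresses.
    """
    per_day = defaultdict(Counter)
    for block_time, signers in blocks:
        day = block_time[:10]  # "YYYY-MM-DD" prefix; assumes UTC timestamps
        per_day[day].update(signers)  # each signer counted once per block
    return {day: dict(counts) for day, counts in per_day.items()}


# Made-up sample: three blocks across two days.
sample = [
    ("2024-05-01T10:00:00.123456789Z", {"VAL_A", "VAL_B"}),
    ("2024-05-01T10:00:06.000000000Z", {"VAL_A"}),
    ("2024-05-02T00:00:01.500000000Z", {"VAL_A", "VAL_B"}),
]
daily = blocks_signed_per_day(sample)
```

Each inner dictionary is one day's series for the graph; a validator missing from a day's dictionary signed no blocks that day.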
The mechanism relies on each node's signature being validated during consensus; the indexer would then expose the signer's public key in the block information once consensus is reached.
The internal CometBFT endpoint probably already supports that: https://github.com/cometbft/cometbft/blob/v0.38.x/spec/core/data_structures.md
What, then, is the easiest path to expose this data from the kwil indexer? Should we expose a node's CometBFT API endpoint?
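If the CometBFT RPC route is taken, the signers can be read from a block's `last_commit`. The sketch below assumes the JSON shape returned by the RPC `/block` endpoint (`result.block.last_commit.signatures`, each entry carrying `block_id_flag` and `validator_address`, as in the data structures spec linked above). The function name is hypothetical, the sample response is made up and trimmed, and the exact JSON encoding of `block_id_flag` should be checked against the node version in use.

```python
# Flag values meaning "this validator signed the committed block".
# Assumption: the RPC encodes the flag as the integer 2; the enum name is
# accepted as well in case a different JSON encoding is in use.
COMMIT_FLAGS = {2, "BLOCK_ID_FLAG_COMMIT"}


def signers_of_last_commit(block_response):
    """Return (height, signer_addresses) for the commit embedded in a block.

    The last_commit inside block H contains the signatures for block H-1,
    so the returned height is the height of the block that was signed.
    """
    last_commit = block_response["result"]["block"]["last_commit"]
    signed_height = int(last_commit["height"])
    signers = {
        sig["validator_address"]
        for sig in last_commit["signatures"]
        if sig.get("block_id_flag") in COMMIT_FLAGS
    }
    return signed_height, signers


# Made-up, trimmed response for illustration.
response = {
    "result": {
        "block": {
            "header": {"height": "101"},
            "last_commit": {
                "height": "100",
                "signatures": [
                    {"block_id_flag": 2, "validator_address": "AAA"},
                    {"block_id_flag": 1, "validator_address": ""},
                    {"block_id_flag": 3, "validator_address": "CCC"},
                ],
            },
        }
    }
}
height, signers = signers_of_last_commit(response)
```

Entries flagged as absent or nil are skipped, so only validators that actually voted for the committed block are counted as participating.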
To Do
Define what is the scope of this goal:
Problems
Blocked By
Instrumentation
Validator monitoring