Expose Prometheus metrics around block production #508

Open
@ruuda

Description

There are a few things about block production that would be interesting to have per-validator metrics on:

  • The skip rate (the fraction of slots assigned in the leader schedule for which no block was produced)
  • Transaction fees received per block (so we have a better view of whether validators are profitable)

I’m thinking we could add a separate thread to the maintainer daemon that:

  • Keeps per-validator counters: slots_assigned_total, blocks_produced_total, transaction_fee_lamports_total.
  • Fetches the leader schedule.
  • In the main loop, if the current slot has advanced past slots where one of the Lido validators was the leader:
    • Increment slots_assigned_total for each of those slots.
    • Call getBlock for each of those slots to see if the block was produced. If so, increment blocks_produced_total and add the block’s transaction fees to transaction_fee_lamports_total.
  • Exposes that info as per-validator Prometheus metrics.
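The bookkeeping described above could be sketched roughly as follows. This is a minimal illustration, not the daemon's implementation: `ValidatorCounters`, `update_counters`, `render_metrics`, the injected `get_block` callable, and the `validator` label name are all hypothetical. The only thing assumed about the RPC is that a `getBlock` result carries a `transactions` list whose entries have a `meta.fee` value in lamports, and that a skipped slot is represented here as `None`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ValidatorCounters:
    """Monotonic counters kept per validator (names taken from the proposal)."""
    slots_assigned_total: int = 0
    blocks_produced_total: int = 0
    transaction_fee_lamports_total: int = 0


def update_counters(
    counters: Dict[str, ValidatorCounters],
    leader_by_slot: Dict[int, str],
    first_slot: int,
    last_slot: int,
    get_block: Callable[[int], Optional[dict]],
) -> None:
    """Account for every slot in [first_slot, last_slot] led by a tracked validator.

    `get_block` is a hypothetical wrapper around the getBlock RPC call that
    returns None when no block was produced for the slot.
    """
    for slot in range(first_slot, last_slot + 1):
        validator = leader_by_slot.get(slot)
        if validator is None or validator not in counters:
            continue  # Slot was led by a validator we do not track.
        entry = counters[validator]
        entry.slots_assigned_total += 1
        block = get_block(slot)
        if block is None:
            continue  # Skipped slot: assigned, but no block was produced.
        entry.blocks_produced_total += 1
        # Sum the fee of every transaction in the block; tolerate missing meta.
        entry.transaction_fee_lamports_total += sum(
            (tx.get("meta") or {}).get("fee", 0) for tx in block["transactions"]
        )


def render_metrics(counters: Dict[str, ValidatorCounters]) -> str:
    """Render the counters in the Prometheus text exposition format."""
    names = (
        "slots_assigned_total",
        "blocks_produced_total",
        "transaction_fee_lamports_total",
    )
    lines = []
    for name in names:
        lines.append(f"# TYPE {name} counter")
        for validator, entry in sorted(counters.items()):
            lines.append(f'{name}{{validator="{validator}"}} {getattr(entry, name)}')
    return "\n".join(lines) + "\n"
```

Passing `get_block` in as a parameter keeps the sketch independent of any particular RPC client, so the loop can be exercised with a plain dictionary lookup in tests.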

Then we can compute:

  • The skip rate over any given time period: 1 - rate(blocks_produced_total[30d]) / rate(slots_assigned_total[30d])
  • Average tx fee per block: sum(rate(transaction_fee_lamports_total[30d])) / sum(rate(blocks_produced_total[30d]))
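As a worked example of the arithmetic behind these two queries, the sketch below applies the same ratios to counter increases over some window. The function names and the sample numbers are made up for illustration. Note that the average fee here is divided by blocks produced; dividing by slots assigned would instead give the fee per assigned slot, which is lower whenever slots are skipped.

```python
from typing import Optional


def skip_rate(slots_assigned: int, blocks_produced: int) -> Optional[float]:
    """Fraction of assigned slots for which no block was produced."""
    if slots_assigned == 0:
        return None  # No slots assigned in the window, so the rate is undefined.
    return 1.0 - blocks_produced / slots_assigned


def average_fee_per_block(fee_lamports: int, blocks_produced: int) -> Optional[float]:
    """Average transaction fees (in lamports) per produced block."""
    if blocks_produced == 0:
        return None  # No blocks produced in the window, so the average is undefined.
    return fee_lamports / blocks_produced


# Hypothetical counter increases over a window:
# 400 slots assigned, 300 blocks produced, 1_500_000 lamports in fees.
example_skip_rate = skip_rate(400, 300)                 # 1 - 300/400 = 0.25
example_avg_fee = average_fee_per_block(1_500_000, 300)  # 5000.0 lamports per block
```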

This will miss blocks that were produced while the daemon was not running. We could go further back in history, but I don’t see a way to reconcile that with a stateless daemon. For one, the RPC can only return the leader schedule for the current epoch, so we’d need to save the schedule to have access to it later. We would also need to persist the counter values and expose them as gauges, to avoid double-counting after a restart.
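To make the restart concern concrete, here is a rough sketch of what persisting state might involve if the daemon were not stateless. Everything in it is an assumption for illustration: the JSON file layout, the `last_processed_slot` field, and the helper names. The idea is that the saved slot marks where accounting stopped, so a restarted process resumes after it instead of counting the same slots twice.

```python
import json
import os
from typing import Dict, Tuple


def load_state(path: str) -> Tuple[int, Dict[str, Dict[str, int]]]:
    """Return (last_processed_slot, counters), or an empty state if no file exists."""
    if not os.path.exists(path):
        return -1, {}
    with open(path) as f:
        state = json.load(f)
    return state["last_processed_slot"], state["counters"]


def save_state(
    path: str, last_processed_slot: int, counters: Dict[str, Dict[str, int]]
) -> None:
    """Write the state to a temporary file, then rename it over the old one."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(
            {"last_processed_slot": last_processed_slot, "counters": counters}, f
        )
    # os.replace swaps the file in one step, so a crash mid-write
    # leaves the previous state file intact.
    os.replace(tmp_path, path)


def slots_to_process(last_processed_slot: int, current_slot: int) -> range:
    """Slots not yet accounted for: everything after the last saved slot."""
    return range(last_processed_slot + 1, current_slot + 1)
```

This does not solve the leader schedule problem: the schedule for each epoch would have to be saved alongside the counters in the same way.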

Labels: enhancement (New feature or request)
