Randao and Builder Proposer Failures #266

Closed
4 tasks
boulder225 opened this issue May 5, 2024 · 2 comments

boulder225 commented May 5, 2024

🎯 Problem to be solved

Three Lido clusters on mainnet (Duck, Elk, and Kinkajou) are experiencing failures in randao duties and builder proposer duties. One cluster behaves slightly differently from the other two: it marks the randao duty as failed, while the other two only report that the builder proposer duty failed.

https://grafana.monitoring.gcp.obol.tech/d/d6qujIJVk/cluster-overview-mainnet?from=now-2d&orgId=8&to=now&var-cluster_hash=0131dd1&var-cluster_name=Lido+x+Obol%3A+Knightly+Kinkajou&var-cluster_network=mainnet&var-cluster_peer=alive-fish&var-duty=All&var-interval=%24__auto_interval_interval&var-job=charon&var-peer=All

[Three screenshots attached: image.png]

🛠️ Proposed solution

  • Investigate the logs and metrics to identify the root cause of the randao and builder proposer failures.
  • Check for any recent updates or configuration changes that might be causing the issue.
  • Verify the connectivity and health of the clusters' nodes.
  • Review the randao and proposer settings and configurations.

Update

The issue appears to be related to the produce block v3 endpoint, which needs to be turned off until v1 is released.

github-actions bot added the protocol (protocol team tickets) label May 5, 2024
@boulder225 (Author)

ObolNetwork#63

pinebit (Contributor) commented May 5, 2024

A quick investigation by @OisinKyne, @KaloyanTanev, and @pinebit concluded that, because of the Lodestar version bump (1.16 => 1.17), we have to enforce the --useProduceBlockV3=false flag; otherwise, according to the code, it will be true (for deneb+).
Because we don't receive logs from the affected clusters, we could not prove this 100%.
We all agreed that the fix proposed by @OisinKyne in ObolNetwork#63 is legitimate and should help resolve the issue.
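
For anyone who wants to pin the flag in their own launch wrapper, here is a minimal Python sketch of the idea. It is not the change made in ObolNetwork#63. The helper name `enforce_flag_false` and the example argument list are hypothetical; only the `--useProduceBlockV3=false` flag itself comes from this thread. The helper strips any existing occurrence of the flag from a validator argument list and appends the explicit `false` form, so the result no longer depends on the client's default.

```python
FLAG = "--useProduceBlockV3"  # flag name taken from this thread


def enforce_flag_false(args):
    """Return a copy of args with --useProduceBlockV3 pinned to false.

    Hypothetical helper: it handles the "--flag=value" form, the bare
    "--flag" form, and the space-separated "--flag value" form.
    """
    cleaned = []
    skip_next = False
    for i, arg in enumerate(args):
        if skip_next:
            # This element was the value of a space-separated flag; drop it.
            skip_next = False
            continue
        if arg == FLAG:
            # Bare or space-separated form: drop the flag, and its value if present.
            if i + 1 < len(args) and args[i + 1].lower() in ("true", "false"):
                skip_next = True
            continue
        if arg.startswith(FLAG + "="):
            # "--flag=value" form: drop it entirely.
            continue
        cleaned.append(arg)
    # Always end with the explicit setting so it cannot fall back to a default.
    cleaned.append(FLAG + "=false")
    return cleaned
```

For example, `enforce_flag_false(["validator", "--useProduceBlockV3=true"])` returns `["validator", "--useProduceBlockV3=false"]`. Stating the value explicitly, rather than relying on the default, means a later version bump that changes the default should not silently change behaviour.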

pinebit self-assigned this May 5, 2024