erigon attestations reward computation still problematic with latest patches on top of 3.0.0-alpha3 #12093

Open
errge opened this issue Sep 25, 2024 · 0 comments

errge commented Sep 25, 2024

Even with current patches from PR #12073 and PR #12076, I saw this today:

+--------+------+--------+--------+------------+
| epoch  | head | source | target | inactivity |
+--------+------+--------+--------+------------+
| 313669 | 2309 | 2408   | 4473   | 0          |
+--------+------+--------+--------+------------+

The real rewards (e.g. from beaconcha.in) are head 2309, target 4474, source 2408, so we are off by one in the target value. Most probably the situation is again that the target calculation has a big number in the numerator of a fraction (e.g. the current epoch), and the whole fraction is then converted to int, so the bug only happens rarely.
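
A minimal sketch of how such a bug can hide, assuming the culprit is a float64 round trip before the conversion to int (a guess, not something confirmed in the caplin code). The operands are synthetic and were picked so that the exact quotient is 4474; they are not values from epoch 313669. Once the numerator exceeds 2^53 it no longer fits a float64 exactly, and the truncated quotient can land one below the integer result.

```python
def reward_via_float(numerator: int, denominator: int) -> int:
    """Convert both operands to float64, divide, then truncate to int."""
    return int(float(numerator) / float(denominator))


def reward_via_int(numerator: int, denominator: int) -> int:
    """Pure integer floor division, with no float64 round trip."""
    return numerator // denominator


# Synthetic operands (assumption: chosen only to trigger the effect).
DENOMINATOR = 257_700_000_000_043   # below 2**53, exact as a float64
NUMERATOR = 4474 * DENOMINATOR      # above 2**60, rounded down as a float64

if __name__ == "__main__":
    print("integer path:", reward_via_int(NUMERATOR, DENOMINATOR))
    print("float path:  ", reward_via_float(NUMERATOR, DENOMINATOR))
```

With these operands the integer path yields 4474 and the float path yields 4473. Most operand pairs give identical results on both paths, which would fit a bug that only shows up rarely.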

What makes this situation much more sinister this time around is that I no longer see the error: when I sat down to investigate, caplin's current reports were correct for this epoch.

My thinking regarding what's happening is this:

  • in caplin there are two calculation paths for these RPCs: one for recent epochs (finalized and close to finalized) and one for historical data,
  • my assumption is that my PRs fixed the historical path, but not the recent one.

I have a system that continuously downloads these values into an sqlite DB, so in the coming days I will know whether this issue recurs and whether it always goes away once the epochs are antiquated. At that point, I will be able to unpatch my historical fixes and conclude whether the buggy epochs are exactly the same between the two codepaths. If they are, then I can try to check why my fix in #12076 doesn't apply to recent epochs, because at first glance it does apply.
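
A rough sketch of how such a comparison could look, assuming every fetch is stored as its own row so that early and late answers for the same epoch can be compared. The table name, columns, and helper functions are hypothetical, not the actual schema of the system described here.

```python
import sqlite3

# Hypothetical schema: one row per (epoch, fetch time).
SCHEMA = """
CREATE TABLE IF NOT EXISTS reward_samples (
    epoch      INTEGER NOT NULL,
    fetched_at INTEGER NOT NULL,
    head       INTEGER NOT NULL,
    source     INTEGER NOT NULL,
    target     INTEGER NOT NULL,
    inactivity INTEGER NOT NULL,
    PRIMARY KEY (epoch, fetched_at)
)
"""


def record_sample(conn, epoch, fetched_at, head, source, target, inactivity):
    """Store one fetched reward report."""
    conn.execute(
        "INSERT INTO reward_samples VALUES (?, ?, ?, ?, ?, ?)",
        (epoch, fetched_at, head, source, target, inactivity),
    )


def epochs_that_changed(conn):
    """Return epochs whose reported rewards differ between fetches."""
    rows = conn.execute(
        """
        SELECT epoch FROM reward_samples
        GROUP BY epoch
        HAVING COUNT(DISTINCT head || ',' || source || ',' || target
                     || ',' || inactivity) > 1
        ORDER BY epoch
        """
    ).fetchall()
    return [row[0] for row in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    # Epoch 313669 as first reported, then as reported later
    # (fetched_at values are placeholder timestamps).
    record_sample(conn, 313669, 1, 2309, 2408, 4473, 0)
    record_sample(conn, 313669, 2, 2309, 2408, 4474, 0)
    return epochs_that_changed(conn)


if __name__ == "__main__":
    print(demo())
```

Running the same query against a DB filled by a patched and an unpatched build would give two epoch lists that can be compared directly.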

In the meantime, I will log any occurrences here, so others can help out too.

Anyway, at least the issue is not critical anymore, because there is an easy fix once the epochs are antiquated: just delete them from my DB and refetch them from caplin, since by then the reports are correct.
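
The delete-and-refetch repair could be sketched like this, assuming a hypothetical `rewards` table with one row per epoch and a caller-supplied `fetch` function that queries caplin; neither name comes from the real setup.

```python
import sqlite3
from typing import Callable, Tuple

Rewards = Tuple[int, int, int, int]  # head, source, target, inactivity


def refetch_epoch(conn: sqlite3.Connection, epoch: int,
                  fetch: Callable[[int], Rewards]) -> None:
    """Replace the stored row for `epoch` with a freshly fetched one."""
    rewards = fetch(epoch)  # fetch first, so a failed request keeps the old row
    with conn:  # delete and insert commit together or not at all
        conn.execute("DELETE FROM rewards WHERE epoch = ?", (epoch,))
        conn.execute(
            "INSERT INTO rewards (epoch, head, source, target, inactivity) "
            "VALUES (?, ?, ?, ?, ?)",
            (epoch, *rewards),
        )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rewards (epoch INTEGER PRIMARY KEY, head INTEGER, "
        "source INTEGER, target INTEGER, inactivity INTEGER)"
    )
    # The row as first reported for epoch 313669.
    conn.execute("INSERT INTO rewards VALUES (313669, 2309, 2408, 4473, 0)")

    def fake_fetch(epoch: int) -> Rewards:
        # Stand-in for the real RPC call; returns the corrected values.
        return (2309, 2408, 4474, 0)

    refetch_epoch(conn, 313669, fake_fetch)
    return conn.execute("SELECT * FROM rewards").fetchall()


if __name__ == "__main__":
    print(demo())
```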
