The Rococo⇄Sepolia bridge is currently stalled since it's not receiving any new commitments from relayers.
https://sepolia.etherscan.io//address/0xe6e799ebb05ac563f36037f9538d13c4e2649f8b
The last successful `submitFinal` call was https://sepolia.etherscan.io/tx/0xc8dcf52dcddfd3157e6d49ff2f43f4daa860a3a10becda36f6877dfabdaeda1e on Jan 19th, containing a commitment to block 8810076.
However, Snowfork's relayers subsequently went down with the following commitment processing errors:
https://github.com/snowfork/snowbridge/blob/13db09317fad428af4e2bb8faf590cb3f17ad97c/relayer/relays/beefy/polkadot-listener.go#L85-L91
The reason for these errors is that BEEFY finality had stalled (gossip messages were likely lost during a prior reset) and tranches of Rococo validators were reverted to a state snapshot before 17h00 UTC. When restoring from the snapshot, the validators resumed voting, but only voted on mandatory blocks to catch up with GRANDPA finality, and hence only produced finality proofs for the mandatory blocks. Ergo: the relayers would only relay commitments to mandatory blocks, which all have the off-by-one bug described in polkadot-fellows/runtimes#160.
Details on the issue's cause
(see also https://hackmd.io/w48qUMd8TUiYvFxH9Vtcgg)
The structure of a `Commitment` relayed to Ethereum is the following:
while the current structure of an `MMRLeaf` (also see polkadot-fellows/runtimes#160, paritytech/substrate#11797, paritytech/polkadot#6577) is:
For the mandatory block 8810076, the payload for `Commitment` and `MMRLeaf` was thus:
since, given that `<N-1>` was still in the prior session, `next_auth_set_of<N-1>` referred to the current auth set `15087`.
On the relayer, this payload then fails the check that `auth_set == next_auth_set - 1`, since the two are currently in fact equal on mandatory blocks:
https://github.com/snowfork/snowbridge/blob/13db09317fad428af4e2bb8faf590cb3f17ad97c/relayer/relays/beefy/polkadot-listener.go#L113
Had the relayer sent it to the bridge, it would still have failed on
https://github.com/snowfork/snowbridge/blob/13db09317fad428af4e2bb8faf590cb3f17ad97c/contracts/src/BeefyClient.sol#L362-L364
Hence, the Ethereum bridge never handled any mandatory blocks.
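The relayer-side check described above can be illustrated with a minimal sketch. This is not the relayer's actual Go code: the `Payload` type, the `relayer_accepts` helper, and the non-mandatory block number are hypothetical and exist only for illustration. Only the condition `auth_set == next_auth_set - 1` and the values for block 8810076 come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Payload:
    """Hypothetical container for the two ids the relayer compares."""
    block_number: int
    auth_set: int       # validator set id carried by the Commitment
    next_auth_set: int  # next authority set id carried by the MMRLeaf


def relayer_accepts(p: Payload) -> bool:
    # The check quoted in the issue: auth_set == next_auth_set - 1
    return p.auth_set == p.next_auth_set - 1


# Mandatory block from the issue: the leaf is built from the parent block
# <N-1>, which is still in the prior session, so both ids are 15087.
mandatory = Payload(block_number=8810076, auth_set=15087, next_auth_set=15087)

# Hypothetical non-mandatory block later in the same session: the leaf's
# next authority set is one ahead of the commitment's validator set.
in_session = Payload(block_number=8810080, auth_set=15087, next_auth_set=15088)

print(relayer_accepts(mandatory))   # rejected by the check
print(relayer_accepts(in_session))  # accepted by the check
```

Under these assumptions, any commitment to a mandatory block is rejected, while commitments to blocks strictly inside a session pass, which is why the bug stayed hidden as long as non-mandatory blocks were being finalized.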
The above off-by-one issue had been masked so far because, under normal operating circumstances, validators vote not only on the mandatory blocks but also on blocks in between (provided BEEFY finality is not too far behind GRANDPA finality). Hence it was safe for the bridge to skip the mandatory blocks that live on session boundaries:
New BEEFY rounds will commence prior to `next_session_start` only if `best_beefy + NEXT_POWER_OF_TWO((best_grandpa - best_beefy + 1) / 2) < next_session_start` (see https://spec.polkadot.network/sect-finality#defn-beefy-round-number).
But given the large lag of BEEFY finality behind GRANDPA at the time, this condition was never satisfied, hence only mandatory blocks were voted on and no new BEEFY rounds were started in session `15088` (corresponding to `auth_set 15087`).
Consequently, the earliest block voted on after `8810108` is `8810707`, which is already in session `15089` (corresponding to `auth_set 15088`), hence the relayer threw error 3 above (and the Solidity contract would otherwise have errored on https://github.com/snowfork/snowbridge/blob/13db09317fad428af4e2bb8faf590cb3f17ad97c/contracts/src/BeefyClient.sol#L353-L355 since `auth_set 15088` exceeds both its `currentValidatorSet` and `nextValidatorSet`).
Solution
The long-term fix is to address the off-by-one error (polkadot-fellows/runtimes#160), but this will still leave the current deployment bricked, since the fix will only take effect in a future runtime.
If BEEFY finality hadn't progressed yet, we could also have deployed a modified client that would initiate voting rounds on an offset from the current definition (https://spec.polkadot.network/sect-finality#defn-beefy-round-number), but now this would regress BEEFY finality and would only have been a temporary solution if we revert paritytech/polkadot#6577.
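The round-number definition referenced here decides which block validators vote on next, and therefore whether any non-mandatory block gets finalized before the session ends. A minimal sketch of that condition follows, assuming integer division and that `NEXT_POWER_OF_TWO(n)` is the smallest power of two greater than or equal to `n`. The function names and the block numbers in the example are hypothetical, and the actual client may apply additional bounds not modelled here.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two >= n (assumed semantics; n <= 1 maps to 1)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def next_round_target(best_beefy: int, best_grandpa: int) -> int:
    # best_beefy + NEXT_POWER_OF_TWO((best_grandpa - best_beefy + 1) / 2)
    return best_beefy + next_power_of_two((best_grandpa - best_beefy + 1) // 2)


def new_round_before_session_end(best_beefy: int, best_grandpa: int,
                                 next_session_start: int) -> bool:
    """True if a non-mandatory round would start before the next session."""
    return next_round_target(best_beefy, best_grandpa) < next_session_start


# Hypothetical numbers: BEEFY only slightly behind GRANDPA.
print(new_round_before_session_end(1000, 1004, 1600))

# Hypothetical numbers: BEEFY far behind GRANDPA, so the target overshoots
# the session boundary and only the mandatory block gets voted on.
print(new_round_before_session_end(1000, 3000, 1600))
```

With a small lag the target lands inside the current session. With a large lag the power-of-two step jumps past `next_session_start`, which matches the situation described above where only mandatory blocks were finalized.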
As such, to address the stalled deployment, our current tally of options is the following:
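- reset the bridge, i.e. redeploy contracts on Sepolia with an initial block higher than the gap without non-mandatory BEEFY finalizations, reset nonces in the BridgeHub runtime, and clear assets registered on AssetHub
- manually aggregate (i.e. outside the voting protocol) a signed commitment for any block in session `15087` (other than the mandatory block already finalized), vis-à-vis any later sessions that only had commitments to the mandatory blocks.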