Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

upgrades: make sure that relaying works across upgrades #3577

Closed
Tracked by #1804 ...
erwanor opened this issue Jan 4, 2024 · 7 comments
Closed
Tracked by #1804 ...

upgrades: make sure that relaying works across upgrades #3577

erwanor opened this issue Jan 4, 2024 · 7 comments
Assignees
Labels
A-IBC Area: IBC integration with Penumbra A-upgrades Area: Relates to chain upgrades _P-high High priority _P-V1 Priority: slated for V1 release
Milestone

Comments

@erwanor
Copy link
Member

erwanor commented Jan 4, 2024

Is your feature request related to a problem? Please describe.
We should test that ordinary chain state upgrades are not light-client breaking. This should be tested by pointing a relayer to a chain that undergoes a simple upgrade and make sure that there are no errors across the upgrade boundary.

Context: Upgrades that are light-client breaking should be rare, and require a special IBC mechanism (#3356). We want to make sure that we haven't missed something that would make ICS07 clients incapable of verifying our chain state post-upgrade.

@erwanor erwanor added P-blocked _P-V1 Priority: slated for V1 release labels Jan 4, 2024
@erwanor erwanor self-assigned this Jan 4, 2024
@erwanor erwanor added this to Testnets Jan 4, 2024
@aubrika aubrika added this to Penumbra Feb 8, 2024
@github-project-automation github-project-automation bot moved this to 🗄️ Backlog in Penumbra Feb 8, 2024
@aubrika aubrika removed this from Testnets Feb 8, 2024
@hdevalence hdevalence moved this from 🗄️ Backlog to Blocked in Penumbra Feb 8, 2024
@hdevalence hdevalence added _P-medium Medium priority and removed P-blocked labels Feb 8, 2024
@aubrika aubrika added this to the Sprint 2 milestone Mar 6, 2024
@cratelyn cratelyn added the A-upgrades Area: Relates to chain upgrades label Mar 12, 2024
@cratelyn cratelyn modified the milestones: Sprint 2, Sprint 3 Mar 18, 2024
@aubrika aubrika added _P-high High priority A-IBC Area: IBC integration with Penumbra and removed _P-medium Medium priority labels Mar 25, 2024
@aubrika aubrika modified the milestones: Sprint 2, Sprint 3 Mar 25, 2024
@conorsch
Copy link
Contributor

See #4087: we plan to exercise a chain upgrade Soon™. As long as we were careful to ensure that Hermes is running on the fresh chain pre-upgrade, we'll get clarity on whether we already have upgrade support for IBC by testing transfers after the upgrade, without making new channels.

@aubrika aubrika moved this from Blocked to Todo in Penumbra Mar 25, 2024
@conorsch conorsch modified the milestones: Sprint 3, v1 Mar 25, 2024
@aubrika aubrika added this to the Sprint 3 milestone Mar 25, 2024
@erwanor erwanor moved this from Todo to In progress in Penumbra Apr 1, 2024
@erwanor
Copy link
Member Author

erwanor commented Apr 1, 2024

Great news, last friday, @avahowell and I paired on this. After a couple tries, we were able to perform a state breaking upgrade and relay UpdateClient messages across the migration boundary. This is an important milestone because it establish that our "ordinary" upgrade path is not IBC breaking (like we conjectured).

Hermes is able to relay across upgrades using a genesis-restart mode that requires running two full nodes: one for the legacy chain, and another for the upgraded chain: the procedure is outlined in this document and implemented via an internal load balancer for lightblock queries via AnyIo.

I will close this ticket in favor of two new ones:

  • pd: implement a "stasis" mode to only serve RPCs
  • upgrades: explore folding a lightblock window into the app state

The first one is about adding the ability to pause block production while keeping both pd and comet running (to serve RPCs). The second one is about brainstorming ways to reduce the operational overhead on relayers and make our upgrade process leaner.

@erwanor erwanor closed this as completed Apr 1, 2024
@github-project-automation github-project-automation bot moved this from In progress to Done in Penumbra Apr 1, 2024
@hdevalence
Copy link
Member

pd: implement a "stasis" mode to only serve RPCs

This is already the case, no? pd will happily sit there and serve RPC requests without block production, we don't need to do anything?

upgrades: explore folding a lightblock window into the app state

I don't think we should do this, the cost/benefit of adding this additional scope is unclear. I think we can be comfortable with the status quo for other cosmos chains.

@avahowell
Copy link
Contributor

This is already the case, no? pd will happily sit there and serve RPC requests without block production, we don't need to do anything?

Hermes needs the comet RPC for getting raw block headers, I don't think there's a way currently to run both comet and pd without producing blocks

@erwanor
Copy link
Member Author

erwanor commented Apr 1, 2024

I think this can be done completely via comet config options, or so I hope. I tagged it as a pd matter as a loose category. Ideally, we would have pointers to safe config options that completely disable things like peer exchange, mempool, and producing/gossiping new blocks, but keeping the RPC and maybe the Info requests.

@hdevalence
Copy link
Member

Ideally, we would have pointers to safe config options that completely disable things like peer exchange, mempool, and producing/gossiping new blocks,

But in the event of an upgrade, we know there won't be new blocks, or else the chain has faulted, so this seems like low benefit vs cost to configure/support.

@erwanor
Copy link
Member Author

erwanor commented Apr 2, 2024

That's a good point, it wasn't the case in our testbed setup where we were effectively running a contentious fork to test this. Another reason to do this was to minimize network layer bandwidth churn, because for some reason, the p2p shell in comet does not penalize peers that repeatedly broadcast invalid blocks. But this hasn't been pinned to be an actual problem yet, so I agree we should default to not do this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-IBC Area: IBC integration with Penumbra A-upgrades Area: Relates to chain upgrades _P-high High priority _P-V1 Priority: slated for V1 release
Projects
Archived in project
Development

No branches or pull requests

6 participants