Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(sequencer): implement retry logic for connecting to the oracle sidecar #1788

Merged
merged 117 commits into from
Nov 7, 2024

Conversation

Fraser999
Copy link
Contributor

Summary

This changes sequencer initialization to be blocked while attempting to connect to the oracle sidecar (if configured to use the sidecar), and to fail if repeated connection attempts fail.

Background

This was a todo in the code.

Changes

  • Added blocking retry logic around creating a connected oracle client.

Testing

Added a unit test to confirm the process waits for the expected duration. We could probably do with adding gRPC mock tests for the sidecar, but I think that's outside the scope of this PR.

Changelogs

No updates required as this is (effectively) a feature branch which will have all relevant changelogs updated as one of the final steps of completing the feature.

Related Issues

Closes #1787

@Fraser999 Fraser999 requested a review from a team as a code owner November 5, 2024 15:28
@github-actions github-actions bot added the sequencer pertaining to the astria-sequencer crate label Nov 5, 2024
@Fraser999 Fraser999 requested review from noot and removed request for SuperFluffy November 5, 2024 16:01
Base automatically changed from noot/slinky to feat/oracle November 6, 2024 00:15
@noot noot requested review from a team and joroshiba as code owners November 6, 2024 00:15
@github-actions github-actions bot added proto pertaining to the Astria Protobuf spec cd labels Nov 6, 2024
Comment on lines +346 to +349
let _ = oracle_client
.prices(QueryPricesRequest::default())
.await
.wrap_err("oracle sidecar responded with error to query for prices");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we should move this out of the retried function and call it after we successfully connect to the client, as if the grpc call for prices fails, it indicates a more serious error that probably isn't fixable upon retry. thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is maybe a general problem with most or even all of our retry loops like this. We probably have some error cases which are permanent/unrecoverable, and others which are transient. However we always treat them as transient and keep retrying.

In this specific case, I guess we could also get a transient failure of the gRPC (e.g. timeout or dropped connection)? We should probably look to abandon the retry loop if the error indicates a permanent failure, but keep trying if not. However, that's another can of worms, since matching on error variants isn't a lot of fun when using eyre/anyhow.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, maybe we can do a larger revisit of the retry functions. this is fine for now, but something to keep in mind

@noot noot merged commit 1b162c2 into feat/oracle Nov 7, 2024
45 checks passed
@noot noot deleted the fraser/ENG-996/retry-connect branch November 7, 2024 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cd proto pertaining to the Astria Protobuf spec sequencer pertaining to the astria-sequencer crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants