Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sending to many peers can be tolerant of failures #160

Merged
merged 2 commits into from
Aug 30, 2024

Conversation

Maelkum
Copy link
Collaborator

@Maelkum Maelkum commented Jul 22, 2024

This PR makes a slight change to how sendToMany() operates.

Before, we were strict in how we treat errors - failing to send message even to a single peer meant we treated the whole operation as a failure. Also, if send to n-th peer failed - we did not even try peers that are after it in the list.

This caused problems in cases where we might have flaky peers that do reply to a roll call but might drop off afterwards, pulling down the whole operation with it. Additionally, we were sending messages sequentially.

Now, we parallelize sends, and also tolerate partial send failures, similar to how we treat waiting for execution responses - we might not get everyone. This goes for non-consensus executions at the moment. For consensus we still require all peers get the message.

Also - if ALL sends fail - we do treat that as an error.

@Maelkum Maelkum requested a review from dmikey July 22, 2024 13:38
@Maelkum Maelkum self-assigned this Jul 22, 2024
@dmikey dmikey merged commit b1a1ab4 into main Aug 30, 2024
5 checks passed
@dmikey dmikey deleted the send-to-many-best-effort branch August 30, 2024 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants