Round restart #288
- also make verifier more robust, upon failures it now resets its tasks
…n coordinator_state module
…y weren't being used consistently and were only used for testing, shouldn't be part of the public API. Removed ban DropParticipantData field
@kellpossible are we separating the unfinished items into new issues?
…round during drop
@apruden2008 I think so, yes. I'll do that today. I will also properly report, in new issues, the problems that I encountered while testing the manual round restart.
/// An enum containing a [Locator] or [LocatorPath].
///
/// **Note:** This can probably be refactored out in the future so
/// that we only use [Locator].
I will open an issue for this
LGTM!
Pending a second review and #308, hopefully this PR is good to go.
.map(|(bucket_index, (participant, mut participant_info))| {
    let bucket_id = bucket_index as u64;
    let tasks = initialize_tasks(bucket_id, number_of_chunks, number_of_contributors as u64)?;
    participant_info.restart_tasks(tasks, time)?;
This looks like one of the key parts of the change, is that right? It resets every contributor who is still in the round to the initial state, similar to what we have at round start.
yep that's right
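For readers following along, here is a minimal sketch of the reset being discussed. It is not the coordinator's actual code: `Task`, `ParticipantInfo`, `initial_tasks_for_bucket`, `reset_tasks` and `restart_round` are hypothetical, simplified names, and the task ordering (each contributor starts at the chunk matching its bucket and wraps around) is an assumption made only for illustration.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a coordinator task (assumption, not the real type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Task {
    chunk_id: u64,
}

/// Hypothetical, simplified per-contributor state.
#[derive(Debug, Default)]
struct ParticipantInfo {
    pending: VecDeque<Task>,
    completed: Vec<Task>,
}

impl ParticipantInfo {
    /// Discard all progress and start again from a fresh task list.
    fn reset_tasks(&mut self, tasks: VecDeque<Task>) {
        self.completed.clear();
        self.pending = tasks;
    }
}

/// Assumed layout: the contributor in bucket `bucket_id` starts at that
/// chunk and then visits the remaining chunks in order, wrapping around.
fn initial_tasks_for_bucket(bucket_id: u64, number_of_chunks: u64) -> VecDeque<Task> {
    (0..number_of_chunks)
        .map(|offset| Task {
            chunk_id: (bucket_id + offset) % number_of_chunks,
        })
        .collect()
}

/// Put every remaining contributor back into its round-start state.
fn restart_round(contributors: &mut [ParticipantInfo], number_of_chunks: u64) {
    for (bucket_index, info) in contributors.iter_mut().enumerate() {
        let bucket_id = bucket_index as u64;
        info.reset_tasks(initial_tasks_for_bucket(bucket_id, number_of_chunks));
    }
}
```

The point of the sketch is only that a restart recomputes each remaining contributor's task list from its bucket index and throws away completed work, rather than trying to patch up partial progress.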
}));
// Restart the round because there are no verifiers
// left for this round.
let reset_round = self.current_verifiers.is_empty() && self.finished_verifiers.is_empty();
Is it enough to have self.current_verifiers.is_empty()? Can you tell us more about the finished verifiers check here?
The verifiers in the round get moved from the current_verifiers list to the finished_verifiers list when all their tasks are complete. If we only checked current_verifiers, and all the verifiers were considered complete for whatever reason, then the round would still be reset.
at least that's how I was understanding it
Is this case possible: one verifier finished, another one dropped? In that case this check will produce false.
Thanks for raising questions about this line, I think it deserves more consideration.
I referenced it in #305
Yes, it will produce false, under the assumption that the remaining verifiers would be able to pick up the slack. Perhaps this isn't working yet, though, and will be fixed in #305.
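To make the case in this thread concrete, here is a small sketch of the condition in isolation. `VerifierState` and its methods are hypothetical simplifications (verifiers are just strings here), not the coordinator's real types; only the boolean expression mirrors the line under review.

```rust
/// Hypothetical, simplified view of the verifier bookkeeping (assumption).
#[derive(Debug, Default)]
struct VerifierState {
    current_verifiers: Vec<String>,
    finished_verifiers: Vec<String>,
}

impl VerifierState {
    /// Mirrors the reviewed condition: reset only when both lists are empty.
    fn should_reset_round(&self) -> bool {
        self.current_verifiers.is_empty() && self.finished_verifiers.is_empty()
    }

    /// Move a verifier from the current list to the finished list.
    fn finish_verifier(&mut self, id: &str) {
        if let Some(pos) = self.current_verifiers.iter().position(|v| v.as_str() == id) {
            let verifier = self.current_verifiers.remove(pos);
            self.finished_verifiers.push(verifier);
        }
    }

    /// Remove a verifier from the current list without marking it finished.
    fn drop_verifier(&mut self, id: &str) {
        self.current_verifiers.retain(|v| v.as_str() != id);
    }
}
```

With two verifiers, finishing one and dropping the other leaves `current_verifiers` empty but `finished_verifiers` non-empty, so `should_reset_round()` returns false even though nobody is left to verify. Dropping the only verifier before it finishes leaves both lists empty and returns true.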
@@ -783,13 +794,38 @@ impl Round {
        }
    }

    /// Remove a contributor from the round.
    pub(crate) fn remove_contributor_unsafe(
I hope we will remove "unsafe" and "try" from the function names one day. In the case of remove_something_unsafe it's misleading, because it's a safe function. And try_do_something() -> Result<...> doesn't need the "try_" prefix, because it returns Result and it's clear that it can fail. Here the name is justified by its similarity to the other remove_*_unsafe functions, which is ok for now.
I agree, I was just following convention in this case. I'm not sure how we would express the intended meaning. The function may not fail with a result, but my guess is the original authors were trying to say that calling this function might be dangerous in the wrong contexts (without being "unsafe" according to the strict Rust definition).
Edit: updated after re-reading your comment
Currently a work in progress.
TODO:
- Check what happens when a verifier drops (do the remaining verifiers pick up the dropped verifier's tasks successfully?). We can do this in "Handle accepted task handoff for verifiers" #305.
- Handle situation where there are no verifiers left in the current round. Or at least document this in an issue. We can do this in "Handle accepted task handoff for verifiers" #305.
- Handle situation where there are no replacement contributors left (or at least document it for future implementation of the replacement contributor logic). See "Test the situation where contributor is dropped, and all replacement contributors are already occupied" #309.

A number of refactors in this PR are related to #267.