Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add IOop::Barrier #1494

Merged
merged 2 commits into from
Oct 14, 2024
Merged

Add IOop::Barrier #1494

merged 2 commits into from
Oct 14, 2024

Conversation

mkeeter
Copy link
Contributor

@mkeeter mkeeter commented Oct 7, 2024

(staged on top of #1493)

The Barrier op does no work, but serves as a full dependency barrier (like a flush) so that the Downstairs can reset its list of completed jobs. Note that it does not update last_flush or retire jobs in the Upstairs, because it doesn't actually persist data to disk.

This is a building block for removing automatic flushes (see RFD 518).

@faithanalog
Copy link
Contributor

faithanalog commented Oct 10, 2024

It looks like this just adds the IOop, but the operation won't be submitted until a future PR, and doesn't change replay behavior currently, right? Seems fine to me, as long as that CI failure was a flake and not something real

Copy link
Contributor

@faithanalog faithanalog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i need to fill this text field to make octocat happy

Base automatically changed from mkeeter/simplify-complete-jobs to main October 11, 2024 14:06
@mkeeter
Copy link
Contributor Author

mkeeter commented Oct 11, 2024

It looks like this just adds the IOop, but the operation won't be submitted until a future PR, and doesn't change replay behavior currently, right? Seems fine to me, as long as that CI failure was a flake and not something real

Yup, this is just adding the IOop, but doesn't use it (yet)

downstairs/src/lib.rs Show resolved Hide resolved
assert!(read_data.data.is_empty());
assert!(extent_info.is_none());

if jobs_completed_ok == 2 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the barrier, do we want jobs_completed_ok == 3?

And, we are not acking this to anyone, but do you plan to use ackable later to indicate that future action can move forward?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ambivalent! Like you said, we're not actually acking this to anyone; I also don't plan to use presence in ackable_work as an indication for anything here (unlike how it's used in live-repair).

This means the only thing that happens when the job is acked is that it moves from GuestWork::active to GuestWork::completed. Setting the ack to happen at jobs_completed_ok == 3 seems reasonable to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are not using ackable for anything, then I would vote for not setting it at all.
Otherwise I'll come back here and see it set and wonder what was consuming that and be confused when I found nothing.

I do care about the dtrace probe though, that one we might actually want to know when this operation has finised on all three downstairs. I don't think I would ever care to know that the barrier had finished on 2/3 downstairs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do need to use ackable, because the job is stored in GuestWork so we need to "ack" it to remove it from that map (even though nothing else happens in the ack).

I changed the threshold to 3 in ed52bf5

@leftwo
Copy link
Contributor

leftwo commented Oct 11, 2024

Ignore the failure in test-up-2region-encrypted. I'm working on a fix and I have another PR out to disable it till I get it fixed.

@mkeeter mkeeter merged commit bdfa096 into main Oct 14, 2024
16 of 17 checks passed
@mkeeter mkeeter deleted the mkeeter/barrier-op branch October 14, 2024 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants