Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support and handle optional "final" flag in .free RPC #1266

Merged
merged 5 commits into from
Aug 13, 2024

Conversation

milroy
Copy link
Member

@milroy milroy commented Aug 10, 2024

Recently, a large job on a large system was considered allocated by Fluxion, but was complete and released in flux-core (flux-framework/flux-core#6179). The proposed solution was to amend RFC 27 to include an optional "final" boolean flag in the .free RPC. That flag can be used by Fluxion to determine if there is an allocation state discrepancy between flux-core and sched and take action based on that information.

This PR adds support to handle the optional "final" flag in the .free RPC. The current implementation only logs an error when a state discrepancy is detected, but doing a full cancellation of the job would be a straightforward addition to this PR. After further thought, executing a full cancel when a discrepancy exists between flux-core and -sched seems like a better approach since this state may go unnoticed by administrators.

@milroy
Copy link
Member Author

milroy commented Aug 10, 2024

@trws, this PR is ready for review. There are still two issues I'm trying to fix:

  1. See the note about .get() hanging for the .free RPC in the testsuite test
  2. clang-format is changing one or more files in this PR, and I can't reproduce it in the container in my dev environment

@milroy milroy requested a review from trws August 10, 2024 22:11
@milroy
Copy link
Member Author

milroy commented Aug 10, 2024

@garlick or @grondo, it may be good for one of you to take a look at this, too.

Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just adding a few suggestions to the test, since I 'm not qualified to fully review the other parts of the PR.

t/issues/t6179-flux-core-housekeeping.sh Outdated Show resolved Hide resolved
t/issues/t6179-flux-core-housekeeping.sh Outdated Show resolved Hide resolved
t/issues/t6179-flux-core-housekeeping.sh Outdated Show resolved Hide resolved
t/issues/t6179-flux-core-housekeeping.sh Show resolved Hide resolved
@trws
Copy link
Member

trws commented Aug 12, 2024

2. clang-format is changing one or more files in this PR, and I can't reproduce it in the container in my dev environment

For this one, two things:

  1. if you aren't, try using pre-commit since it will fetch the exact same clang-format version we use in CI for you
  2. Take a look at pre-commit: show diff on failure #1267 and maybe rebase on there, I didn't realize we wouldn't get the format change on failure, and we definitely should so that adds it.

Copy link
Member

@trws trws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple bits to think about in the control flow, but this is a great improvement. I think the .get() issue is just the lack of response as @grondo mentioned, so that should be fine.

qmanager/policies/base/queue_policy_base.hpp Outdated Show resolved Hide resolved
qmanager/policies/base/queue_policy_base.hpp Show resolved Hide resolved
qmanager/policies/base/queue_policy_base.hpp Outdated Show resolved Hide resolved
qmanager/policies/base/queue_policy_base.hpp Outdated Show resolved Hide resolved
static_cast<intmax_t> (id));
errno = EPROTO;
goto out;
}
}
set_schedulability (true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line now needs to move up, maybe even to the start of this case? We should run the sched loop after any clearing of resources, even if something went slightly wrong, otherwise we might block for lack of incoming events.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about moving set_schedulability (true); right before the return?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would work too, as long as we're doing it in every case where something actually changes. If we can avoid calling it when nothing changes, that keeps resource usage down, but setting it too often should still execute correctly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guideline of setting schedulability only if the resource state changes does suggest movingset_schedulability (true); is the best approach with the understanding that cancel isn't atomic.

@milroy milroy force-pushed the final-free branch 2 times, most recently from 5a10e53 to 04d59ec Compare August 12, 2024 04:15
@milroy
Copy link
Member Author

milroy commented Aug 12, 2024

Take a look at #1267 and maybe rebase on there, I didn't realize we wouldn't get the format change on failure, and we definitely should so that adds it.

Yeah, that PR identified the formatting issue. Thanks!

@milroy
Copy link
Member Author

milroy commented Aug 13, 2024

@trws and @grondo this PR is ready for another look.

Copy link
Member

@trws trws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm happy with this pending the move of the set_schedulability call so we still start a loop on a successful partial release. Good stuff @milroy!

Problem: recently, a large job on a large system was considered
allocated by Fluxion, but was complete and released in flux-core
(flux-framework/flux-core#6179). The proposed
solution was to amend RFC 27 to include an optional "final" boolean
flag in the `.free` RPC. That flag can be used by Fluxion to determine
if there is an allocation state discrepancy between flux-core and
sched.

Add support to unpack the "final" boolean and send it to the qmanager
policy for handling.
Problem: if the "final" flag from the `.free` RPC disagrees with the
`full_removal` flag returned by partial cancel there is a discrepancy
between flux-core and Fluxion.

Run a full cancel if there is a discrepancy between flux-core and -sched
allocation state and log errors.
Problem: while running tests for this PR, a full cancellation failed but
did not output the traverser error.

Add logging to output traverser errors during cancellation.
Problem: if a partial cancellation does not fully remove an allocation
but the "final" flag is set by the .free RPC a full cancellation is now
run. However, the current traverser check for a missing exclusive span
(x_span) considers this invalid for a full cancellation and returns an
error.

Update the check to only return an error for this condition if the
cancellation is of type VTX_CANCEL.
Problem: there is no test for flux-core issue
flux-framework/flux-core#6179.

Add tests in the Fluxion issues directory.
Copy link

codecov bot commented Aug 13, 2024

Codecov Report

Attention: Patch coverage is 70.37037% with 8 lines in your changes missing coverage. Please review.

Project coverage is 75.5%. Comparing base (fdf1038) to head (507d368).
Report is 1 commits behind head on master.

Files Patch % Lines
qmanager/policies/base/queue_policy_base.hpp 64.7% 6 Missing ⚠️
resource/modules/resource_match.cpp 0.0% 2 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##           master   #1266   +/-   ##
======================================
  Coverage    75.5%   75.5%           
======================================
  Files         111     111           
  Lines       15331   15350   +19     
======================================
+ Hits        11587   11604   +17     
- Misses       3744    3746    +2     
Files Coverage Δ
qmanager/modules/qmanager_callbacks.cpp 74.0% <100.0%> (+2.0%) ⬆️
qmanager/policies/queue_policy_bf_base_impl.hpp 81.3% <100.0%> (+0.3%) ⬆️
qmanager/policies/queue_policy_fcfs.hpp 100.0% <ø> (ø)
qmanager/policies/queue_policy_fcfs_impl.hpp 73.5% <100.0%> (+0.6%) ⬆️
resource/traversers/dfu_impl_update.cpp 76.2% <100.0%> (ø)
resource/modules/resource_match.cpp 68.7% <0.0%> (-0.1%) ⬇️
qmanager/policies/base/queue_policy_base.hpp 78.9% <64.7%> (-0.8%) ⬇️

@milroy
Copy link
Member Author

milroy commented Aug 13, 2024

Thanks for the feedback @trws and @grondo! Setting MWP.

@milroy milroy added the merge-when-passing mergify.io - merge PR automatically once CI passes label Aug 13, 2024
@mergify mergify bot merged commit ed3572e into flux-framework:master Aug 13, 2024
19 of 20 checks passed
@milroy milroy deleted the final-free branch August 13, 2024 18:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-when-passing mergify.io - merge PR automatically once CI passes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants