Fix master kill #2366

Merged
merged 8 commits into from
Jun 25, 2024

Conversation

mesmith75
Contributor

Fixes #2358

Member

@egede egede left a comment

Is this related to the problem of killing a master job that has some failed subjobs? If so, I do not see how.

@mesmith75
Contributor Author

This loops over the subjobs. When it reaches the last one, it also updates the master job status. Two lines later there is another updateStatus call for the master job, which tries to update the already-updated master status and raises the error. Making this change is better than removing the second call, because that call is also needed for jobs without subjobs.
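
The flow described in this comment can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyJob`, `update_status`, and `kill` are hypothetical names invented for illustration and are not Ganga's actual classes or methods. It shows how a master with subjobs could have its status updated twice (once via the last subjob, once explicitly), while a job without subjobs relies on the explicit call alone.

```python
# Toy model only: class and method names are hypothetical, not Ganga's API.

class ToyJob:
    def __init__(self, name, master=None):
        self.name = name
        self.master = master
        self.subjobs = []
        self.status = "running"
        self.update_count = 0  # number of times update_status ran on this job
        if master is not None:
            master.subjobs.append(self)

    def update_status(self, new_status):
        self.status = new_status
        self.update_count += 1
        # Assumption: once every subjob has reached the new status,
        # the master job is updated as a side effect.
        if self.master is not None and all(
            sj.status == new_status for sj in self.master.subjobs
        ):
            self.master.update_status(new_status)


def kill(job):
    # Loop over the subjobs; the last one also updates the master.
    for sj in job.subjobs:
        sj.update_status("killed")
    # Explicit update of the job itself, needed when there are no subjobs.
    job.update_status("killed")


master = ToyJob("master")
for i in range(3):
    ToyJob(f"sub{i}", master=master)
kill(master)

single = ToyJob("single")
kill(single)

print(master.status, master.update_count)
print(single.status, single.update_count)
```

In this toy model the master with subjobs is updated twice and the job without subjobs only once, which is why simply deleting the explicit call would leave the second case unhandled.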

egede previously approved these changes Jun 20, 2024
@mesmith75
Contributor Author

You are right though. I'll check again to be sure it is doing exactly what I think it should.

@mesmith75
Contributor Author

mesmith75 commented Jun 20, 2024

You are right - this wasn't the real issue. The real problem is that the master job ends up in failed once the last subjob is killed, and the following master job update then tries to move it to killed, which causes the error.

I am not sure which is best: have the master job switch to killed, or let it stay at failed.
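
The conflict and the two options can be sketched with a toy state machine. Everything here is an assumption made for illustration: the `ALLOWED` transition table, `TransitionError`, `master_status_from`, and `kill_master` are hypothetical and do not reproduce Ganga's real status rules. The sketch assumes that a failed job may not move to killed, and that a master with at least one failed subjob settles in failed once nothing is running.

```python
# Toy model only: statuses and transition rules are illustrative assumptions.

ALLOWED = {
    "running": {"killed", "failed"},
    "failed": set(),  # assumption: a failed job cannot be moved to killed
    "killed": set(),
}


class TransitionError(Exception):
    pass


def transition(current, new):
    """Return the new status, or raise if the move is not allowed."""
    if new == current:
        return current
    if new not in ALLOWED[current]:
        raise TransitionError(f"cannot go from {current} to {new}")
    return new


def master_status_from(subjob_statuses):
    # Assumption: any failed subjob makes a finished master "failed".
    if any(s == "running" for s in subjob_statuses):
        return "running"
    return "failed" if "failed" in subjob_statuses else "killed"


def kill_master(subjob_statuses, keep_failed):
    # Kill whatever is still running; failed subjobs stay failed.
    after_kill = ["killed" if s == "running" else s for s in subjob_statuses]
    # The last subjob update moves the master to its derived status.
    master = transition("running", master_status_from(after_kill))
    if keep_failed and master == "failed":
        return master  # option: leave the master at failed
    # Unconditional second update, as in the behaviour described above.
    return transition(master, "killed")
```

With `keep_failed=False`, a master that has one failed and one running subjob raises `TransitionError` in this model, mirroring the reported error. With `keep_failed=True` the master stays at failed. Allowing the switch to killed instead would amount to adding `"killed"` to the `"failed"` entry of the hypothetical `ALLOWED` table.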

@mesmith75 mesmith75 dismissed stale reviews from alexanderrichards and egede via cd18860 June 24, 2024 11:01
@mesmith75
Contributor Author

This is slightly less elegant, but more precise for what we want.

@mesmith75 mesmith75 merged commit ee9290d into develop Jun 25, 2024
10 checks passed
@mesmith75 mesmith75 deleted the kill_fix branch June 25, 2024 08:18
Development

Successfully merging this pull request may close these issues.

Job master kill error with failed subjobs
3 participants