Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix job runner stdout missing newline #6081

Open
wants to merge 6 commits into
base: 8.3.x
Choose a base branch
from

Conversation

jennan
Copy link

@jennan jennan commented Apr 23, 2024

Kia ora,

While implementing a new job runner, I realised that the if the job runner submit method returns its stdout text without a newline, then a workflow with more than one task will fail. After further investigations with @hjoliver , we discovered that this is due to text handling in JobRunnerManager.jobs_submit, that doesn't properly add a newline to the text message if it is missing. This PR implements a fix.

You can reproduce the bug by removing the newline of the second output of the background job runner submit method (see https://github.com/cylc/cylc-flow/blob/master/cylc/flow/job_runner_handlers/background.py#L92) and try to run a workflow with at least 2 tasks.

I am not sure how to turn this into a test to reproduce the bug. Any guidance on this would be welcome 🙂.


[UPDATE (HO)] Short explanation: without this fix, if a job-runner submit method returns job-ID without a newline, it causes the cylc jobs-submit stdout to be misformatted (two lines concatenated) such that the scheduler can't parse it correctly - for one out of the multiple jobs submitted.


Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • No dependency changes
  • Tests are included (or explain why tests are not needed).
  • Changelog fragment included if this is a change that can affect users
  • No cylc-doc PR needed
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

@jennan jennan changed the base branch from 8.2.x to master April 23, 2024 23:30
@hjoliver
Copy link
Member

hjoliver commented Apr 24, 2024

Thanks @jennan !

@oliver-sanders (UK team) noted overnight that he ran into this some time ago, but wasn't sure it was a bug so added a note on the need for newline in the handler API documentation: https://github.com/cylc/cylc-flow/blob/master/cylc/flow/job_runner_handlers/documentation.py#L426-L428

However, now that you've tracked down the offending code, by inspection (and also by comparison with similar in jobs_kill(...), it is definitely a bug.

Most of the current job-runners don't provide an explicit submit method, they just use the command template as a shortcut. However, the background handler does, and it deliberately adds a newline to its return value. https://github.com/cylc/cylc-flow/blob/master/cylc/flow/job_runner_handlers/background.py#L92

@jennan - can you tweak this PR to:

  • not add the newline in the background handler (above), as it's not necessary now
  • and remove the newline advice in the docs (above)

@hjoliver
Copy link
Member

@jennan - no cylc-doc PR is needed for this. It could do with a change log entry though, just in case anyone else is working on job runners at the moment (unlikely but not impossible!). Just add a file changes.d/6081.fix.md with a short description (see other md files in there for example).

@MetRonnie MetRonnie added this to the 8.3.1 milestone Apr 30, 2024
@MetRonnie MetRonnie changed the base branch from master to 8.3.x June 19, 2024 12:34
@oliver-sanders oliver-sanders modified the milestones: 8.3.1, 8.3.x Jun 27, 2024
@hjoliver hjoliver self-assigned this Oct 23, 2024
@hjoliver hjoliver added the bug Something is wrong :( label Oct 23, 2024
jennan and others added 5 commits October 23, 2024 14:40
When the stdout returned by the job runner submit method doesn't end with
a newline, the `sys.stdout.write` call in `jobs_submit` do not add it and
this cases submission to fails (at least to be recorded as failed by the
calling process when running a workflow).

Looking at how newlines are handled in the JobRunnerManager.job_kill method,
it looks like this a typo in the jobs_submit method.
@@ -89,7 +89,7 @@ def submit(cls, job_file_path, submit_opts):
exc.filename = "nohup"
return (1, None, str(exc))
else:
return (0, "%d\n" % (proc.pid), None)
return (0, str(proc.pid), None)
Copy link
Member

@hjoliver hjoliver Oct 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This newline was compensating for the bug, pre-fix).

@@ -424,8 +424,7 @@ def submit(
ret_code:
Subprocess return code.
out:
Subprocess standard output, note this should be newline
terminated.
Subprocess standard output.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(We now add the newline automatically)

@hjoliver
Copy link
Member

@oliver-sanders - I've finished this off, and simplified the affected code a bit.

Not sure how to go about testing it.

@hjoliver hjoliver marked this pull request as ready for review October 23, 2024 01:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is wrong :(
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants