-
Notifications
You must be signed in to change notification settings - Fork 241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify WDL Toil job graphs #4524
Conversation
…ng you can do a bunch of in a row.
Looks like this failed the Giraffe WDL test with:
Maybe I managed to override the recursion limit bump code to not get called? |
@DailyDreaming Would you be able to review this before the Toil meeting tomorrow? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks well done. I like the organization and the comments are invaluable. Mostly minor comments, except I wonder if there should be a test for if something like coalesce_nodes
running IDs together that we would expect it to run together.
src/toil/wdl/wdltoil.py
Outdated
@@ -885,6 +890,11 @@ def __init__(self, **kwargs: Any) -> None: | |||
# TODO: Make sure C-level stack size is also big enough for this. | |||
sys.setrecursionlimit(10000) | |||
|
|||
# We need an ordered list of postprocessing steps to apply, because we | |||
# may ahve coalesced postprocessing steps deferred by several levels of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahve
src/toil/wdl/wdltoil.py
Outdated
@@ -885,6 +890,11 @@ def __init__(self, **kwargs: Any) -> None: | |||
# TODO: Make sure C-level stack size is also big enough for this. | |||
sys.setrecursionlimit(10000) | |||
|
|||
# We need an ordered list of postprocessing steps to apply, because we | |||
# may ahve coalesced postprocessing steps deferred by several levels of | |||
# jobs returing other jobs' promised RVs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
returing
# A conditional might only appear to depend on the variables in the | ||
# conditional expression, but its body can depend on other stuff, and | ||
# we need to make sure that that stuff has finished and updated the | ||
# environment before the conditional body runs. TODO: This is because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make a ticket for this.
""" | ||
Given a topological order of WDL workflow node IDs, produce a list of | ||
lists of IDs, still in topological order, where each list of IDs can be | ||
run under a single Toil job. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well stated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thanks for adding the (sub)tests.
Flatcar has failed me for the last time
This will fix #4465 by allowing multiple WDL workflow nodes to (at least sometimes) live in one Toil job.
It also removes some Toil jobs that don't really need to be separate jobs, by baking the transformations they would do into the
WDLBaseJob
.I ran a version of the conformance tests (2b91928600bfb1397c190a9900ac323db5111d1b) against this and I got:
Before:
After:
So it's maybe twice as fast and no less conformant.
Changelog Entry
To be copied to the draft changelog by merger:
Reviewer Checklist
issues/XXXX-fix-the-thing
in the Toil repo, or from an external repo.camelCase
that want to be insnake_case
.docs/running/{cliOptions,cwl,wdl}.rst
Merger Checklist