Introduction of user defined post steps #166

FrankD412 · 2018-12-14T05:46:34Z

This PR relates to issue #98 -- the addition of pre and post steps. Some initial exploration has revealed that the addition of post steps will be a non-trivial effort so I'm making this PR the base PR for other branches that will iteratively refactor and add the necessary functionality to build this up. So far here's what's holding up the feature:

The ScriptAdapter and SchedulerScriptAdapter abstract interfaces are cumbersome. The get_scheduler_command method, for example, takes in a step as its only argument and returns a Boolean with a path to the script that executes the command and a path to the restart script (if the step specifies a restart). There is no general way to simply generate a script currently other than refactoring or adding more returns to the get_scheduler_command method.
Script generation is too coupled with higher level information. The LocalScriptAdapter has the same underlying structure as any other SchedulerScriptAdapter and because of this requires things like a header. A little more generalization is needed so that an adapter's functionality should get more specific information passed to it instead of a whole step.
The SchedulerScriptAdapter really should be split up into a batch interface and a script interface and would help with the first bullet. The division of responsibility would open up the ability to generate simple scripts without the need of extra information and could be used to generate scripts as needed without having monolithic generation methods.

The split of responsibilities is already being started in a new branch post_steps/adapter_refactor, which will lead to tweaks to all the adapters and then the addition of general script generation to support post steps.

* Additional arguments can be passed through batch. * Correction of the use of extend to append. * Removal of OMPI vars from env. * Reversal of env altering. * Addition of mpi exe to batch. * Removal of -gpu flag * Addition of "allocated" to PENDING set. * Correction of the check_status method call. * Correction of the Flux import * Addition of the EnvironmentError to try/catch for check_status. * Addition of jobid to submission INFO logging.

* Fixed a workflow setup issue caused by spaces in the spec name * Additional dirname formatting

* Minor docstring correction. * Change to study setup API to break out workspace creation. * Update Maestro frontend to use new Study method. * Renamed Study.setup to be better communicate its functionality. * Correction to style for configure_study method. * Moved environment application to add_step. * Split out acquiring environment elements to its own method. * Updates to staging checks. * Renamed _setup_linear and _setup_parameterized from setup to stage for clarity.

…ing the repo (#121)

* Addition of method and tweaks to support metadata. * Addition of call to the method for generating metadata. * Correction to encode YAML string for py3 compatibility. * Another attempt at py2-py3 compatible encoding. * Addition of load_metadata method.

* Additional logging for debugging of the adapter. * Additional logging.

* #107 Added StudyStatus enumeration Changed DAG execute_ready_steps to return StudyStatus enum instead of a boolean conductor and maestro also accept StudyStatus and convert to return value * #128 fixed issues found in pull request * #107 added inline comments per PC * #107 pulling out _check_study_completion to avoid initial implementaion copy and paste error * #107 fixed line length issue

…ed. (#131)

* Command line parameter for custom ParameterGenerators. * Addition of sample custom parameter generation. * Correction to missing parameters in SIZE. * Addition of utility method for importing custom gen py files. * Correction of syntax. * Conversion of DOS newlines to Unix. * Addition of call to check for custom code. * Correction of a bad docstring. * Addition of parameter metadata to Maestro. * Addition of parameters to study metadata. * Addition of copying of the parameter file to study workspace.

…cified. (#133) * Initial fix for steps that cannot restart. * Comments to clarify logic.

…on. (#137) * Tweaks to how the ExecutionGraph adds descriptions. Added logging. * Updated study to use the new API and some minor refactor. * Fixed #134

Fixes #123

* Updates to the README. * Added initial getting started example documentation * Addition of Quick Start documentation. (#130) * Addition of Quick Start. * Addition of cancellation status example. * Corrections to quick start and fix for build errors. * Addition of a LULESH study specification with heavier comments. (#132) * Start of a LULESH spec that's commented. * Addition of explanation of step structure. * Additional comments about steps and dependencies. * Minor tweaks to comments. * Addition of parameter block comments. * Addition of reference to launcher_token YAML * Tweaks and edits based on feedback.

…136) * Moved 'mark_submitted' into ExecutionGraph 'execute' * Conversion of ready_steps deque to use step names. * Capping the available slots to the length of ready_steps.

* Addition of a shell util to centralize process creation settings. * Switch to start_process method. * Addition of docstring for start_process. * Update to fix universal newline from false to true. * Addition of a check for a list and a hard set of shell to False. * Made conductor launch command a string. * Fix behavior of start_process for localscriptadapter. The shell=False flag was not being passed, nor were the env or cwd arguments being used in the actual function. * Clean up of start_process

* Addition of a ping method. * Tweaks to error logging. * Moved specific existence exception out of return code check. * Cleanup of some merge characters. * Tweak for connectivity checking. * Addition of info to the error message. * Removal of git return codes since they don't mean anything.

* Addition of hashing to Study parameterization. * Addition of the hashws option to argparse. * Addition of a warning note for users who use labels in steps.

* Addition of a more general flux ScriptAdapter. * Addition of some casting from int to str * Corrected "gpus" to "ngpus" * Rework jobspec construction to make a valid jobspec. * Check for empty value for cores per task.

Signed-off-by: Peter Robinson <[email protected]>

* Changes to make workspaces reflect relative pathing based on step names. * Addition of an alternative output format based on step combinations.

Fixes #167 * added pytest to requirements added Pipfile and pipenv settings * Added property key to Abstract.ScriptAdapter (#167) Also added impementation and tests to verify that existing functionality isn't changed * updated factory to use key when registering adapters(#167) * cleanedup linelength * cleaned up imports to be specific to module (#167) * added tests to verify exception for unknown adapter * moved adapters tests to individual files * added test to verify scriptadapter functionality (#167) updated gitignore to have testing and pycharm ignores testing existing adapters in factory (#167) added test to verify factories.keys matches get_valid_adapters (#167) added copyright to file * updated __init__ modules to do dynamic includes * removed unneeeded imports * updated dependency versions * fixed all flake8 errors * updated to run flake8 and pytest when run locally * updated tests to have documentation about purpose and function as requested in #170 * fixed line length * Removal of nose from requirements. * updated to remove nose from the requirements

* Locking the version of PyYAML to be above 2.1 because of an arbitrary code execution vulnerability. * Addition of a version condition to pyyaml to patch a vulnerability. * Update of Pipfile.lock to match Pipefile.

Fixes #173 * Addition of a loader to the yaml load call. * Addition of a catch if the loader attribute is missing.

* Moved enum34 to condition dependent on Python<3.4. * Addition of conditional enum34 install for requirements.txt. * Correction of requirements.txt syntax for python version.

* Addition of a Dockerfile for quick tutorials. * Tweaks for Docker and addition of git. * Tweak to Docker file for caching. * Addition of Docker documentation. * Tweaks to Docker documentation. * Removal of markdown ##

…ten. (#181) * Take out shebang from shell definiton and at it when script is written. * Include shebang in cmd and fix format of string written to file.

* Extension of shebang feature to allow users to specify shells. * Addition of debug message to print kwargs. * Addition of kwargs. * Addition of basic batch settings to LULESH sample. * Addition of kwargs to Flux adapters. * Docstring tweaks. * Docstring update.

* Docstring correction for LocalAdapter. * Correction to addition of exec line at top of scripts.

* Addition of a Record class for storing general data. * Addition of SubmissionRecord type. * Update to the order of for record parameters. * Changes to StepRecord to expect SubmissionRecord returns. * Updates to SLURM and local adapters to use SubmissionRecords. * Slight tweak to LocalAdapter docstring. * Tweak to have SubmissionRecord initialize its base. * Addition of CancellationRecord class. * Changes to CancellationRecord to map based on status. * Additional interface additions and tweaks. * Changes to have cancel use CancellationRecords. * Update to ExecutionGraph to use records. * Updates to SLURM and local adapters to use SubmissionRecords. * Slight tweak to LocalAdapter docstring. * Addition of CancellationRecord class. * Additional interface additions and tweaks. * Changes to have cancel use CancellationRecords. * Cherry pick of execution commit. * Removal of redundant "get" definiton.

FrankD412 and others added 30 commits June 3, 2018 19:24

Update setup.py to 1.1.3dev

3ba67fd

Updated the getting started documentation

fe3b1ec

Added examples for running both lulesh specs

bc47d46

Workflow setup fix (#117)

b05d4c8

* Fixed a workflow setup issue caused by spaces in the spec name * Additional dirname formatting

Tweaks and addition of starting to SpectrumFluxAdapter. (#119)

6fbe0b1

Updated the LULESH examples to use the LULESH git repository for clon…

aa28222

…ing the repo (#121)

Additional debug logging for the Flux Spectrum Adapter (#122)

c144d63

* Additional logging for debugging of the adapter. * Additional logging.

Updated Exception.message references to Exception.args (#125)

ceb9f9b

Fix status.csv 'State' column writeout. (#127)

8b25484

Bugfix that would cause the ExecutionGraph to not update when cancell…

bb3b49d

…ed. (#131)

Correction to a bug for attempting a restart when a restart isn't spe…

786d523

…cified. (#133) * Initial fix for steps that cannot restart. * Comments to clarify logic.

More flexible ExecutionGraph description API and logging of descripti…

fceeffa

…on. (#137) * Tweaks to how the ExecutionGraph adds descriptions. Added logging. * Updated study to use the new API and some minor refactor. * Fixed #134

Fix to Variable verification. (#138)

8b7977b

Fixes #123

Throttled workflows push steps into the ready queue multiple times. (#…

fa988bf

…136) * Moved 'mark_submitted' into ExecutionGraph 'execute' * Conversion of ready_steps deque to use step names. * Capping the available slots to the length of ready_steps.

Update version number to 1.1.3.

0b0d333

Addition of user enabled workspace hashing (#145)

fd990ce

* Addition of hashing to Study parameterization. * Addition of the hashws option to argparse. * Addition of a warning note for users who use labels in steps.

Update setup.py to 1.1.4dev

9c401aa

More generalized FluxScriptAdapter (#149)

31ebde8

* Addition of a more general flux ScriptAdapter. * Addition of some casting from int to str * Corrected "gpus" to "ngpus" * Rework jobspec construction to make a valid jobspec. * Check for empty value for cores per task.

README tweak to update quickstart link. (#139)

3c26982

typos. fixes #141 (#154)

3857ce2

Correction of flake8 style errors [new version of flake8].

bd9624e

Update to setup.py to reflect dev version 1.0

91dbe65

robinson96 and others added 3 commits November 13, 2018 13:53

do not overwrite log file. (#162)

ff39d1e

Signed-off-by: Peter Robinson <[email protected]>

Added confirmation message after launching a study (#163)

c52540e

Enhancements to store relative pathing in the metadata file. (#165)

1b5c55c

* Changes to make workspaces reflect relative pathing based on step names. * Addition of an alternative output format based on step combinations.

FrankD412 added In Progress Issue or PR that is currently in active development. Feature labels Dec 14, 2018

FrankD412 self-assigned this Dec 14, 2018

FrankD412 and others added 6 commits January 8, 2019 09:45

Addition of tag to LULESH git dependency. (#169)

4bacb28

PyYAML vulnerability fix (#171)

b411a72

* Locking the version of PyYAML to be above 2.1 because of an arbitrary code execution vulnerability. * Addition of a version condition to pyyaml to patch a vulnerability. * Update of Pipfile.lock to match Pipefile.

Minor tweak to indentation for flake8 failure.

f477dd4

fixed pyyaml to requirements (#172)

37311fd

Addition of a loader to the yaml load call. (#174)

b318d06

Fixes #173 * Addition of a loader to the yaml load call. * Addition of a catch if the loader attribute is missing.

FrankD412 added Blocked An issue that is blocked by another issue. and removed In Progress Issue or PR that is currently in active development. labels Mar 19, 2019

FrankD412 and others added 8 commits March 19, 2019 11:30

Correction to install enum34 for Python versions < 3.4 (#176)

ad6e1cd

* Moved enum34 to condition dependent on Python<3.4. * Addition of conditional enum34 install for requirements.txt. * Correction of requirements.txt syntax for python version.

Take out shebang from shell definition and add it when script is writ…

cc5278d

…ten. (#181) * Take out shebang from shell definiton and at it when script is written. * Include shebang in cmd and fix format of string written to file.

Tweaks to fix malformed log statements. (#182)

8e10b8f

Correction to message when stating no to launch.

be8536a

Addition of enumeration docstrings.

494b873

Addition of NOOP state for workflow steps.

f9c5a54

Addition of post step return enums.

db97000

FrankD412 force-pushed the feature/post_steps branch from fd0878f to db97000 Compare April 19, 2019 22:37

Francesco Di Natale and others added 6 commits April 25, 2019 14:33

Fixes the addition of the shebang header for SLURM (#184)

e55ac03

* Docstring correction for LocalAdapter. * Correction to addition of exec line at top of scripts.

Correction to an accidental reassignment of cmd.

c2e46c8

Removal of an assignment of self._exec in SLURM adapter.

2f5a47e

Merge branch 'develop' into feature/post_steps

35cfc39

FrankD412 force-pushed the develop branch from 01a8894 to 5ab8da0 Compare July 14, 2019 20:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduction of user defined post steps #166

Introduction of user defined post steps #166

FrankD412 commented Dec 14, 2018

Introduction of user defined post steps #166

Are you sure you want to change the base?

Introduction of user defined post steps #166

Conversation

FrankD412 commented Dec 14, 2018