Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduction of user defined post steps #166

Open
wants to merge 57 commits into
base: develop
Choose a base branch
from

Conversation

FrankD412
Copy link
Member

This PR relates to issue #98 -- the addition of pre and post steps. Some initial exploration has revealed that the addition of post steps will be a non-trivial effort so I'm making this PR the base PR for other branches that will iteratively refactor and add the necessary functionality to build this up. So far here's what's holding up the feature:

  • The ScriptAdapter and SchedulerScriptAdapter abstract interfaces are cumbersome. The get_scheduler_command method, for example, takes in a step as its only argument and returns a Boolean with a path to the script that executes the command and a path to the restart script (if the step specifies a restart). There is no general way to simply generate a script currently other than refactoring or adding more returns to the get_scheduler_command method.
  • Script generation is too coupled with higher level information. The LocalScriptAdapter has the same underlying structure as any other SchedulerScriptAdapter and because of this requires things like a header. A little more generalization is needed so that an adapter's functionality should get more specific information passed to it instead of a whole step.
  • The SchedulerScriptAdapter really should be split up into a batch interface and a script interface and would help with the first bullet. The division of responsibility would open up the ability to generate simple scripts without the need of extra information and could be used to generate scripts as needed without having monolithic generation methods.

The split of responsibilities is already being started in a new branch post_steps/adapter_refactor, which will lead to tweaks to all the adapters and then the addition of general script generation to support post steps.

FrankD412 and others added 30 commits June 3, 2018 19:24
* Additional arguments can be passed through batch.

* Correction of the use of extend to append.

* Removal of OMPI vars from env.

* Reversal of env altering.

* Addition of mpi exe to batch.

* Removal of -gpu flag

* Addition of "allocated" to PENDING set.

* Correction of the check_status method call.

* Correction of the Flux import

* Addition of the EnvironmentError to try/catch for check_status.

* Addition of jobid to submission INFO logging.
* Fixed a workflow setup issue caused by spaces in the spec name

* Additional dirname formatting
* Minor docstring correction.

* Change to study setup API to break out workspace creation.

* Update Maestro frontend to use new Study method.

* Renamed Study.setup to be better communicate its functionality.

* Correction to style for configure_study method.

* Moved environment application to add_step.

* Split out acquiring environment elements to its own method.

* Updates to staging checks.

* Renamed _setup_linear and _setup_parameterized from setup to stage for clarity.
* Addition of method and tweaks to support metadata.

* Addition of call to the method for generating metadata.

* Correction to encode YAML string for py3 compatibility.

* Another attempt at py2-py3 compatible encoding.

* Addition of load_metadata method.
* Additional logging for debugging of the adapter.

* Additional logging.
* #107 Added StudyStatus enumeration
Changed DAG execute_ready_steps to return StudyStatus enum instead of a boolean
conductor and maestro also accept StudyStatus and convert to return value

* #128 fixed issues found in pull request

* #107 added inline comments per PC

* #107 pulling out _check_study_completion
to avoid initial implementaion copy and paste error

* #107 fixed line length issue
* Command line parameter for custom ParameterGenerators.

* Addition of sample custom parameter generation.

* Correction to missing parameters in SIZE.

* Addition of utility method for importing custom gen py files.

* Correction of syntax.

* Conversion of DOS newlines to Unix.

* Addition of call to check for custom code.

* Correction of a bad docstring.

* Addition of parameter metadata to Maestro.

* Addition of parameters to study metadata.

* Addition of copying of the parameter file to study workspace.
…cified. (#133)

* Initial fix for steps that cannot restart.

* Comments to clarify logic.
…on. (#137)

* Tweaks to how the ExecutionGraph adds descriptions. Added logging.

* Updated study to use the new API and some minor refactor.

* Fixed #134
* Updates to the README.

* Added initial getting started example documentation

* Addition of Quick Start documentation. (#130)

* Addition of Quick Start.

* Addition of cancellation status example.

* Corrections to quick start and fix for build errors.

* Addition of a LULESH study specification with heavier comments. (#132)

* Start of a LULESH spec that's commented.

* Addition of explanation of step structure.

* Additional comments about steps and dependencies.

* Minor tweaks to comments.

* Addition of parameter block comments.

* Addition of reference to launcher_token YAML

* Tweaks and edits based on feedback.
…136)

* Moved 'mark_submitted' into ExecutionGraph 'execute'

* Conversion of ready_steps deque to use step names.

* Capping the available slots to the length of ready_steps.
* Addition of a shell util to centralize process creation settings.

* Switch to start_process method.

* Addition of docstring for start_process.

* Update to fix universal newline from false to true.

* Addition of a check for a list and a hard set of shell to False.

* Made conductor launch command a string.

* Fix behavior of start_process for localscriptadapter.

The shell=False flag was not being passed, nor were the env or cwd
arguments being used in the actual function.

* Clean up of start_process
* Addition of a ping method.

* Tweaks to error logging.

* Moved specific existence exception out of return code check.

* Cleanup of some merge characters.

* Tweak for connectivity checking.

* Addition of info to the error message.

* Removal of git return codes since they don't mean anything.
* Addition of hashing to Study parameterization.

* Addition of the hashws option to argparse.

* Addition of a warning note for users who use labels in steps.
* Addition of a more general flux ScriptAdapter.

* Addition of some casting from int to str

* Corrected "gpus" to "ngpus"

* Rework jobspec construction to make a valid jobspec.

* Check for empty value for cores per task.
robinson96 and others added 3 commits November 13, 2018 13:53
* Changes to make workspaces reflect relative pathing based on step names.

* Addition of an alternative output format based on step combinations.
@FrankD412 FrankD412 added In Progress Issue or PR that is currently in active development. Feature labels Dec 14, 2018
@FrankD412 FrankD412 self-assigned this Dec 14, 2018
FrankD412 and others added 6 commits January 8, 2019 09:45
Fixes #167 

* added pytest to requirements

added Pipfile and pipenv settings

* Added property key to Abstract.ScriptAdapter (#167)
Also added impementation and tests to verify that existing functionality isn't changed

* updated factory to use key when registering adapters(#167)

* cleanedup linelength

* cleaned up imports to be specific to module (#167)

* added tests to verify exception for unknown adapter

* moved adapters tests to individual files

* added test to verify scriptadapter functionality (#167)

updated gitignore to have testing and pycharm ignores

testing existing adapters in factory (#167)

added test to verify factories.keys matches get_valid_adapters (#167)

added copyright to file

* updated __init__ modules to do dynamic includes

* removed unneeeded imports

* updated dependency versions

* fixed all flake8 errors

* updated to run flake8 and pytest when run locally

* updated tests to have documentation about purpose and function as requested in #170

* fixed line length

* Removal of nose from requirements.

* updated to remove nose from the requirements
* Locking the version of PyYAML to be above 2.1 because of an arbitrary code execution vulnerability.

* Addition of a version condition to pyyaml to patch a vulnerability.

* Update of Pipfile.lock to match Pipefile.
Fixes #173 

* Addition of a loader to the yaml load call.

* Addition of a catch if the loader attribute is missing.
@FrankD412 FrankD412 added Blocked An issue that is blocked by another issue. and removed In Progress Issue or PR that is currently in active development. labels Mar 19, 2019
FrankD412 and others added 8 commits March 19, 2019 11:30
* Moved enum34 to condition dependent on Python<3.4.

* Addition of conditional enum34 install for requirements.txt.

* Correction of requirements.txt syntax for python version.
* Addition of a Dockerfile for quick tutorials.

* Tweaks for Docker and addition of git.

* Tweak to Docker file for caching.

* Addition of Docker documentation.

* Tweaks to Docker documentation.

* Removal of markdown ##
…ten. (#181)

* Take out shebang from shell definiton and at it when script is written.

* Include shebang in cmd and fix format of string written to file.
@FrankD412 FrankD412 force-pushed the feature/post_steps branch from fd0878f to db97000 Compare April 19, 2019 22:37
Francesco Di Natale and others added 6 commits April 25, 2019 14:33
* Extension of shebang feature to allow users to specify shells.

* Addition of debug message to print kwargs.

* Addition of kwargs.

* Addition of basic batch settings to LULESH sample.

* Addition of kwargs to Flux adapters.

* Docstring tweaks.

* Docstring update.
* Docstring correction for LocalAdapter.

* Correction to addition of exec line at top of scripts.
* Addition of a Record class for storing general data.

* Addition of SubmissionRecord type.

* Update to the order of for record parameters.

* Changes to StepRecord to expect SubmissionRecord returns.

* Updates to SLURM and local adapters to use SubmissionRecords.

* Slight tweak to LocalAdapter docstring.

* Tweak to have SubmissionRecord initialize its base.

* Addition of CancellationRecord class.

* Changes to CancellationRecord to map based on status.

* Additional interface additions and tweaks.

* Changes to have cancel use CancellationRecords.

* Update to ExecutionGraph to use records.

* Updates to SLURM and local adapters to use SubmissionRecords.

* Slight tweak to LocalAdapter docstring.

* Addition of CancellationRecord class.

* Additional interface additions and tweaks.

* Changes to have cancel use CancellationRecords.

* Cherry pick of execution commit.

* Removal of redundant "get" definiton.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Blocked An issue that is blocked by another issue. Feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants