App Runner B-Fabric integration #126

Draft
wants to merge 202 commits into
base: main

@leoschwarz commented Jan 16, 2025

  • separate the submitter-related parameters out of the raw_parameters into a dedicated WorkunitDefinition field
  • workunit.mk should set the correct bfabric instance into an environment variable
  • probably redundant call (workunit save status processing)
  • check whether the registered data can be organized a bit better (actual executable rather than YAML, and wrapper creator <-> submitter split); I merged the two components
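
To illustrate the first item in the list above, here is a minimal sketch of how submitter-related parameters could be split out of raw_parameters. It assumes such parameters can be recognized by a key prefix; the `submitter_` prefix and the `split_submitter_params` helper are hypothetical and are not taken from this PR or from the actual WorkunitDefinition model.

```python
def split_submitter_params(
    raw_parameters: dict[str, str],
    prefix: str = "submitter_",  # hypothetical naming convention
) -> tuple[dict[str, str], dict[str, str]]:
    """Split raw parameters into (submitter parameters, application parameters)."""
    submitter_params: dict[str, str] = {}
    app_params: dict[str, str] = {}
    for key, value in raw_parameters.items():
        if key.startswith(prefix):
            # Strip the prefix so the submitter sees plain parameter names.
            submitter_params[key[len(prefix):]] = value
        else:
            app_params[key] = value
    return submitter_params, app_params


if __name__ == "__main__":
    # Example values are made up for illustration.
    raw = {"submitter_partition": "general", "threshold": "0.5"}
    submitter_params, app_params = split_submitter_params(raw)
    print(submitter_params)
    print(app_params)
```

Keeping the two groups in separate fields would let the submitter read its settings without having to filter the application parameters each time.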

Essentially, this PR implements a new submitter/wrapper creator pair for bfabric-app-runner conformant apps. It makes it easy to integrate any app defined by an app_definition.yml into B-Fabric and execute it.

The main benefits are the following:

  • Full support for Slurm configuration at the submitter, application, and workunit levels. -> This will allow us to use our resources more efficiently.
  • Configuration is almost fully contained in a YAML file, which allows deploying this on the test system without conflicts. We even "fake" the output staging to an SSH host and directory, without requiring a configuration change in B-Fabric test. -> We can now run our apps on the bfabric-test instance.
  • Workunits can automatically be linked with a viewer for live log updates. -> This will simplify logging in the future, because only one log source needs to be handled.
  • Application folders in /scratch are created automatically from the configuration alone. -> This is a further step towards more ephemeral scratch directories, which will allow us to scale our data processing further.
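
The layered Slurm configuration from the first point can be pictured roughly as follows. This is only a sketch under assumptions: the precedence order (workunit overrides application, which overrides submitter) is assumed rather than confirmed by this PR, and the function names and example values are hypothetical. The option names are standard `sbatch` flags.

```python
from collections import ChainMap


def resolve_slurm_params(submitter: dict, application: dict, workunit: dict) -> dict:
    """Merge Slurm parameters from the three levels into one mapping."""
    # ChainMap looks keys up from left to right, so the workunit level
    # takes precedence over the application level, which takes precedence
    # over the submitter defaults (assumed precedence order).
    return dict(ChainMap(workunit, application, submitter))


def to_sbatch_args(params: dict) -> list[str]:
    """Render the merged parameters as sbatch command-line arguments."""
    return [f"{key}={value}" for key, value in sorted(params.items())]


if __name__ == "__main__":
    # Example values are made up for illustration.
    merged = resolve_slurm_params(
        submitter={"--partition": "general", "--mem": "8G"},
        application={"--mem": "32G", "--cpus-per-task": "4"},
        workunit={"--cpus-per-task": "8"},
    )
    print(to_sbatch_args(merged))
```

With this kind of layering, the submitter only needs site-wide defaults, and an application or a single workunit can request more resources without touching the shared configuration.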

Caveats:

  • This new functionality is not available for legacy apps.
    • Testing the legacy apps on the test system, at least, will be problematic: they manage their own input/output staging and scratch folder allocation, so it cannot easily be avoided that they write to the production stores.
    • We could implement a compatibility layer with the above caveat, but apart from avoiding duplicated effort, I don't really see much to be gained.
  • We do not create resource entities for the logs directly, but rather links.
  • Some specifics are still embedded in the autogenerated shell script, though it is arguably a lot more generic than it was before.
