App Runner B-Fabric integration #126

leoschwarz · 2025-01-16T15:47:51Z

Separate the submitter-related parameters out of raw_parameters into a dedicated WorkunitDefinition field
workunit.mk should set the correct bfabric instance in an environment variable
Probably redundant call (workunit save status processing)
~~Check if we can organize the registered data a bit better (actual executable rather than YAML, and wrapper creator <-> submitter split)~~ I merged the two components
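
As a rough illustration of the first item, the split could look like the sketch below. It assumes that submitter-related keys can be recognized by a common prefix; the prefix `submitter_`, the `SplitParameters` class, and the `split_parameters` function are all hypothetical names and not part of this PR.

```python
from dataclasses import dataclass, field

# Hypothetical prefix: the PR does not state how submitter keys are identified.
SUBMITTER_PREFIX = "submitter_"


@dataclass
class SplitParameters:
    """Application parameters and submitter parameters kept apart."""

    app: dict[str, str] = field(default_factory=dict)
    submitter: dict[str, str] = field(default_factory=dict)


def split_parameters(raw_parameters: dict[str, str]) -> SplitParameters:
    """Move every key starting with SUBMITTER_PREFIX into its own mapping."""
    result = SplitParameters()
    for key, value in raw_parameters.items():
        if key.startswith(SUBMITTER_PREFIX):
            # Strip the prefix so the submitter sees plain option names.
            result.submitter[key.removeprefix(SUBMITTER_PREFIX)] = value
        else:
            result.app[key] = value
    return result
```

With a dedicated field, the app would no longer need to filter out keys it does not understand.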

Essentially, this PR implements a new submitter/wrapper creator pair for bfabric-app-runner conformant apps. It makes it easy to run any app defined by an app_definition.yml through B-Fabric.

The main benefits are the following:

- Full support for Slurm configuration at the submitter, application, and workunit levels. -> This will allow us to use our resources more efficiently.
- Configuration is almost fully contained in a YAML file, which allows deploying this on the test system without conflicts. We even "fake" the output staging to an SSH host and directory, without requiring a configuration change in B-Fabric test. -> We can now run our apps on the bfabric-test instance.
- Workunits can automatically be linked with a viewer for live log updates. -> This will simplify logging in the future, because only one log source needs to be handled.
- Application folders in /scratch are created automatically from the configuration alone. -> This is a further step towards more ephemeral scratch directories, which will allow us to scale our data processing further.
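
To make the three configuration levels concrete, here is a minimal sketch of how layered Slurm options could be combined. It assumes that later levels override earlier ones (submitter, then application, then workunit); that precedence order and the helper names `merge_slurm_params` and `to_sbatch_args` are assumptions for illustration, not the code in this PR.

```python
def merge_slurm_params(*layers: dict[str, str]) -> dict[str, str]:
    """Merge option mappings; later layers override earlier ones."""
    merged: dict[str, str] = {}
    for layer in layers:
        merged.update(layer)
    return merged


def to_sbatch_args(params: dict[str, str]) -> list[str]:
    """Render the merged options as sbatch arguments in a stable order."""
    return [f"{key}={value}" for key, value in sorted(params.items())]


# Assumed precedence: submitter defaults < application < workunit.
submitter_level = {"--partition": "prx", "--mem": "8G"}
application_level = {"--mem": "32G"}
workunit_level = {"--time": "04:00:00"}

merged = merge_slurm_params(submitter_level, application_level, workunit_level)
print(to_sbatch_args(merged))
```

Here the application-level `--mem` replaces the submitter default, while the partition is inherited unchanged.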

Caveats:

- This new functionality is not available for legacy apps.
  - Testing the legacy apps on the test system would be problematic at the very least: they manage their own input/output staging and scratch folder allocation, so it cannot easily be avoided that they write to the production stores.
  - We could implement a compatibility layer with the above caveat, but apart from avoiding duplicated effort I don't see much to be gained.
- We do not create resource entities for the logs directly, but rather links.
- Some specifics are still embedded in the autogenerated shell script, though it is arguably a lot more generic than before.

leoschwarz mentioned this pull request Jan 16, 2025

[draft] Configurable submitter #123

Closed

leoschwarz added 29 commits January 28, 2025 10:53

initial code

13ce417

rename

a2cd886

add some implementation for the submitters_spec.py

e4c6fac

create submitter module

6d2f997

be a bit more pydantic

30778ee

also ensure the user params get validated

6a0328c

more explicit names

86c407a

specify the slurm root

9057229

implement the submitter

7bd6b38

use StringConstraints

a10f3b0

add input classes for the submitter

b70c952

add some very early draft on the submitter design

d536125

implement app_runner wrapper creator

8613af2

integrate initial entrypoint

f4bd898

initial executable definition

3347b08

directly set masterexecutableid

a8de5ef

necessary flags

b5df77e

update

54e6865

update

7cfe16f

argparse

bd9500a

use the executable

7a885b4

define default

75b4bcc

don't pass parameters

21a4e4f

pass workunitid

352e0cd

give it a name

25dfa2b

add a new line

d07767d

add definition

029681d

add skeleton submitter

5f47cd0

start implementing

206ae62

leoschwarz added 30 commits February 24, 2025 09:09

Merge branch 'main' into refactor-submitter

aa528b2

Merge branch 'main' into refactor-submitter

eb3835a

create resource instead of link

01ab962

Merge branch 'main' into refactor-submitter

ee9bec8

missing workunit_id

8974a9b

improve state logic

154175a

Handle problematic characters in Workunit.store_output_folder

dad2d02

try again

531b198

Merge branch 'main' into refactor-submitter

ecd6e36

it's still good enough to use half of it

779caf3

fix

0e16fa8

start preparing for new impl

d8f71de

setup slurm_submitter package

9910f58

better template

1841dfe

basically merge the two components

5e2c2b1

deactivate

8720fc7

fix imports

c514bff

clean

551f5f8

cleaning

6b75950

integrate

78de0d2

wrong arg

cf49f2f

fix

466bc95

fix

a31e95e

path

74ec001

path

b484541

typo

b0a550d

fix param

0e70db2

move

38be212

refactor

8dd0ef3

configurable name

d14585b
