Hydra Slurm Rich Launcher

A rich, visual interface for easily starting and monitoring your Hydra applications on SLURM clusters.

  • Ease of Use: Streamline your workflow with a simplified process for submitting jobs to SLURM
  • Rich Visualization: A clear and beautiful visual overview of your jobs
  • Integration: Seamlessly integrates with Hydra-powered CLIs
  • Real-Time Updates: Monitor the status of your jobs in real-time

Installation

The Hydra Slurm Rich Launcher can be installed via pip:

```
pip install hydra-slurm-rich-launcher --upgrade
```

Alternative installation methods

Locally

```
git clone git@github.com:creinders/hydra-slurm-rich-launcher.git
cd hydra-slurm-rich-launcher
poetry install
```

Quick Start

Define your configuration in config.yaml:

```
defaults:
  - override hydra/launcher: slurm_rich
hydra:
  launcher:
    partition: <SLURM_PARTITION>

task: 1
```

Implement your Hydra app in my_app.py:

```
import hydra

@hydra.main(config_path=".", config_name="config", version_base="1.3")
def my_app(cfg) -> None:
    print(f"Task: {cfg.task}")

if __name__ == "__main__":
    my_app()
```

Starting the app with task=1,2,4 will launch three jobs with different configurations:

```
python my_app.py task=1,2,4 --multirun
```

Please see the Hydra documentation for details regarding the configuration and multi-run.
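
To make the multi-run expansion concrete, here is a minimal sketch of how a comma-separated override such as task=1,2,4 turns into one configuration per job. The function expand_sweep is a hypothetical helper written for illustration only. It is not part of Hydra or of this launcher, and Hydra's own sweeper supports a richer syntax than the plain comma lists assumed here.

```python
from itertools import product


def expand_sweep(overrides):
    """Expand overrides like 'task=1,2,4' into one override list per job.

    Simplified illustration: only plain comma-separated values are handled.
    """
    keys, choices = [], []
    for item in overrides:
        key, _, values = item.partition("=")
        keys.append(key)
        choices.append(values.split(","))
    # One job per element of the Cartesian product of all value lists.
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*choices)
    ]


if __name__ == "__main__":
    for index, job in enumerate(expand_sweep(["task=1,2,4"])):
        print(index, job)
```

With a single swept parameter this yields three jobs. Sweeping two parameters at once would yield every combination of their values.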

Scalability

Lots of runs? No problem! Hydra Slurm Rich Launcher smartly organizes all of your runs.

Restarts

Easily monitor the status of your jobs and swiftly restart any failed runs.
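
Restart behavior is controlled by the retry_strategy and max_retries parameters listed under Parameters. The sketch below shows one way such a decision could be made. It is an assumption-laden illustration, not the launcher's actual implementation: the function should_retry, its arguments, and the choice to apply max_retries to every strategy are all hypothetical.

```python
def should_retry(strategy, attempt, max_retries, ask=input):
    """Decide whether a failed run should be resubmitted (illustrative only).

    strategy:    'prompt', 'never', or 'always'
    attempt:     number of retries already made for this run
    max_retries: upper bound on retries (assumed to apply to all strategies)
    ask:         callable used to query the user in 'prompt' mode
    """
    if attempt >= max_retries:
        return False
    if strategy == "never":
        return False
    if strategy == "always":
        return True
    if strategy == "prompt":
        answer = ask("Restart failed run? [y/N] ")
        return answer.strip().lower() in ("y", "yes")
    raise ValueError(f"unknown retry_strategy: {strategy!r}")
```

Passing the prompt function as an argument keeps the decision logic testable without a terminal.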

Parameters

The Hydra Slurm Rich Launcher supports the following parameters, listed with their default values:

```
slurm_query_interval_s: 15  # Interval in seconds between status queries to the SLURM controller
filter_job_ids: null  # Comma-separated job IDs from the job array (e.g., "1,4") that should not be executed
retry_strategy: 'prompt'  # Job retry strategy. 'prompt': ask the user, 'never': never restart, 'always': restart failed runs automatically
max_retries: 3  # Maximum number of retry attempts
le_mode: 'auto'  # Low energy mode: disables all animations to minimize CPU usage. Values: 'on', 'off', 'auto'. 'auto' enables low energy mode if the environment variable HYDRA_SLURM_PROGRESS_LE_MODE is set.
```
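
The three le_mode values can be read as a small decision rule. The following sketch illustrates that rule under one assumption: in 'auto' mode, any presence of HYDRA_SLURM_PROGRESS_LE_MODE in the environment counts as "set", regardless of its value. The function low_energy_enabled is a hypothetical name and does not come from the launcher's code.

```python
import os


def low_energy_enabled(le_mode, environ=None):
    """Resolve an le_mode setting to True/False (illustrative only).

    environ defaults to os.environ; pass a dict to test without
    touching the real environment.
    """
    environ = os.environ if environ is None else environ
    if le_mode == "on":
        return True
    if le_mode == "off":
        return False
    if le_mode == "auto":
        # Assumption: the variable only needs to exist, its value is ignored.
        return "HYDRA_SLURM_PROGRESS_LE_MODE" in environ
    raise ValueError(f"unknown le_mode: {le_mode!r}")
```

Under this reading, exporting the variable once in a shell profile on the cluster login node would switch every 'auto' launch to low energy mode.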

```
submitit_folder: ${hydra.sweep.dir}/.submitit/%j
timeout_min: 60
cpus_per_task: null
gpus_per_node: null
tasks_per_node: 1
mem_gb: null
nodes: 1
name: ${hydra.job.name}
partition: null
qos: null
comment: null
constraint: null
exclude: null
gres: null
cpus_per_gpu: null
gpus_per_task: null
mem_per_gpu: null
mem_per_cpu: null
account: null
signal_delay_s: 120
max_num_timeout: 0
additional_parameters: {}
array_parallelism: 256
setup: null
```
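
Besides setting these values in config.yaml, they can usually be overridden per invocation with Hydra's standard hydra.launcher.<key>=<value> command-line syntax. The sketch below builds such override strings from a dictionary. The helper launcher_overrides is hypothetical, the example values are arbitrary, and values containing special characters would need extra quoting that is not handled here.

```python
def launcher_overrides(params):
    """Turn a dict of launcher parameters into Hydra command-line overrides.

    Illustrative helper: None becomes 'null', booleans become
    'true'/'false', everything else is passed through str().
    """
    overrides = []
    for key, value in params.items():
        if value is None:
            text = "null"
        elif isinstance(value, bool):
            text = "true" if value else "false"
        else:
            text = str(value)
        overrides.append(f"hydra.launcher.{key}={text}")
    return overrides


if __name__ == "__main__":
    # Example values only; pick settings that match your cluster.
    extra = launcher_overrides({"timeout_min": 120, "mem_gb": 16, "qos": None})
    print(" ".join(["python", "my_app.py", "task=1,2,4", *extra, "--multirun"]))
```

Generating the overrides from a dictionary is mainly useful when a wrapper script launches many sweeps with different resource requests.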

License

Hydra Slurm Rich Launcher is licensed under the MIT License.

Credits

This package was inspired by and extends the capabilities of the hydra-submitit-launcher. We gratefully acknowledge the developers of hydra-submitit-launcher and Hydra for their contributions to the open-source community.
