Singularity pull before job starts #36

Open
matbun opened this issue Nov 11, 2024 · 0 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

matbun commented Nov 11, 2024

Short Description of the issue

Currently, Docker container images are pulled and converted to SIF through the slurm-job.vk.io/pre-exec annotation, whose content is injected into the SLURM job script that represents the k8s pod.

IMO, this has two disadvantages:

  1. Error-prone: the user has to remember to pull the right image and update the pod accordingly. The user also has to remember to pull the image when it is missing, and to skip the pull when it is already present (to avoid unnecessary pull and conversion overhead).
  2. Potential waste of compute time: downloading and converting a Docker image to a SIF file takes minutes (around 10 minutes, if I remember correctly). Because this happens inside the job, it is compute time the HPC center charges you for while the allocated resources sit unused. The waste grows with the size of the resource allocation (i.e., the number of nodes).
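
To put a rough number on the second point, here is a back-of-the-envelope sketch. It assumes that every node in the allocation is billed for the whole time spent pulling and converting, and it uses the approximate 10-minute figure mentioned above as a default. The function name `wasted_node_hours` is hypothetical.

```python
def wasted_node_hours(nodes: int, pull_minutes: float = 10.0) -> float:
    """Node-hours billed while the job only pulls and converts the image.

    Assumption: all nodes in the allocation are charged for the full pull time.
    """
    if nodes < 0 or pull_minutes < 0:
        raise ValueError("nodes and pull_minutes must be non-negative")
    return nodes * pull_minutes / 60.0


if __name__ == "__main__":
    # Compare a few allocation sizes (illustrative values only).
    for n in (1, 16, 128):
        print(f"{n:>4} nodes: {wasted_node_hours(n):.2f} node-hours idle")
```

Under these assumptions the cost scales linearly with the number of nodes, which is why moving the pull out of the job matters most for large allocations.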

Summary of proposed changes

Create a dedicated annotation to manage the container image (e.g., pull and convert it if it is not already there) before the SLURM job starts.
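
The "pull and convert only if not already there" logic could look roughly like the sketch below, run on a host outside the allocation before the job is submitted. This is only an illustration: the helper names, the cache directory, and the file-naming scheme are assumptions, not part of the project. The only external command it relies on is `singularity pull <output.sif> docker://<image>`.

```python
import re
import subprocess
from pathlib import Path
from typing import List, Optional


def sif_path_for(image: str, cache_dir: Path) -> Path:
    """Map a Docker image reference to a SIF file name (hypothetical scheme)."""
    safe = re.sub(r"[^A-Za-z0-9_.-]+", "_", image)
    return cache_dir / f"{safe}.sif"


def pull_command(image: str, cache_dir: Path) -> Optional[List[str]]:
    """Return the pull command, or None if the SIF file already exists."""
    target = sif_path_for(image, cache_dir)
    if target.exists():
        return None  # already converted: skip the pull and conversion
    return ["singularity", "pull", str(target), f"docker://{image}"]


def ensure_image(image: str, cache_dir: Path) -> Path:
    """Pull and convert the image, if needed, before the SLURM job is submitted."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    cmd = pull_command(image, cache_dir)
    if cmd is not None:
        subprocess.run(cmd, check=True)  # requires singularity on this host
    return sif_path_for(image, cache_dir)
```

Keeping the existence check in `pull_command` separate from the actual `subprocess.run` call addresses both disadvantages listed above: the user no longer decides manually whether to pull, and the conversion happens before any compute time is billed.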

@dciangot dciangot added enhancement New feature or request good first issue Good for newcomers labels Dec 11, 2024