RAM / CPU cores partitioning for multiple agents on the same machine #176

mkurczew · 2023-11-20T10:18:28Z

Hi,
I have several workstations with many CPU cores and a lot of RAM. All agents run Ubuntu.
I would like to run multiple ClearML agents (barebones, no k8s) on each of the workstations.

Can I somehow prevent one agent running a job from hoarding all of the resources (e.g. CPU cores) and guarantee each agent a minimum quota (preferably dynamic, e.g. at least 8 cores and 1/3 of the RAM, but more if available)?

Or, alternatively, can I prevent another job from being accidentally scheduled on a machine that is already bogged down by other tasks?

Is there any way to achieve that with ClearML?
The documentation suggests that GPUs are the only resource agents "manage".

EDIT: Just started to wonder, should I file this here or under the ClearML project?

ainoam · 2023-11-23T17:42:51Z

This is probably the right place @mkurczew :)

To police the resources available to an agent, you can make use of extra_docker_arguments, for example:

extra_docker_arguments=["--memory=1g", "--cpus=2"]

(see https://docs.docker.com/config/containers/resource_constraints/)
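For several agents on one workstation, the per-agent values could be derived from the machine's totals. The following is a minimal sketch, not part of ClearML: the helper name docker_resource_args and the even split are assumptions for illustration, and it presumes each agent runs its tasks in Docker mode with its own extra_docker_arguments setting. It uses the Docker flags --cpuset-cpus (pin to specific cores) and --memory (hard RAM cap), both of which are covered on the page linked above.

```python
def docker_resource_args(total_cores: int, total_ram_mib: int, n_agents: int) -> list[list[str]]:
    """Split cores and RAM evenly and return one list of Docker flags per agent.

    Hypothetical helper: the even split is an assumption, not ClearML behavior.
    """
    if n_agents < 1 or total_cores < n_agents:
        raise ValueError("need at least one agent and one core per agent")

    cores_per_agent = total_cores // n_agents
    ram_per_agent_mib = total_ram_mib // n_agents

    per_agent = []
    for i in range(n_agents):
        first = i * cores_per_agent
        last = first + cores_per_agent - 1
        # A single core is written as "3", a range as "0-7".
        cpuset = str(first) if first == last else f"{first}-{last}"
        per_agent.append([
            f"--cpuset-cpus={cpuset}",
            f"--memory={ram_per_agent_mib}m",
        ])
    return per_agent


# Example: a 24-core workstation with 96 GiB of RAM shared by 3 agents.
for flags in docker_resource_args(24, 96 * 1024, 3):
    print(flags)
```

Each resulting list would go into the extra_docker_arguments of one agent. Note that these are hard limits. For something closer to "a minimum, but more if available", Docker's --cpu-shares (a relative weight that only matters under contention) and --memory-reservation (a soft limit) may be a better fit, though neither gives a strict guarantee.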
