ClearML Agent as a systemd service #210

Open
niko-zvt opened this issue Jun 21, 2024 · 0 comments

Hello!

Is there an example of a systemd service configuration for running clearml-agent? The unit file I wrote is unstable: the agent often dies and cannot be restarted, even though the same clearml-agent daemon ... commands work perfectly when run by hand from the CLI.

  1. Could this be due to the fact that I explicitly specify the daemon sub-command?
  2. What options are there for managing/serving agents other than manually?

I have to run the agent as a service for two reasons:
a. After the server reboots, the agent does not start on its own; it has to be started manually or by a command scheduled to run after boot (which is not good practice).
b. I still haven't figured out whether I can run the agent inside a Docker container (Docker-in-Docker), since the agent itself uses Docker to create isolated containers for tasks, based on nvidia-cuda images.

clearml-agent-gpu.service

[Unit]
Description=ClearML Agent Service
After=docker.target

[Service]
Type=forking
User=ml-worker
WorkingDirectory=/home/ml-worker/clearml-agent-virtualenv
ExecStart=/home/ml-worker/clearml-agent-virtualenv/bin/clearml-agent daemon --detached --queue default --gpus all
ExecStop=/home/ml-worker/clearml-agent-virtualenv/bin/clearml-agent daemon --detached --queue default --gpus all --stop
Restart=always
Environment="PATH=/home/ml-worker/clearml-agent-virtualenv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

[Install]
WantedBy=multi-user.target
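
For comparison, here is a foreground variant of the same unit. It is a sketch under stated assumptions and not a confirmed fix. It drops --detached so that the agent stays attached as the service's main process (hence Type=simple), and it orders the unit after docker.service, since Docker packages normally ship a service unit and not a docker.target. The paths, user, queue, and GPU flag are copied unchanged from the unit above. RestartSec=10 is an arbitrary illustrative value.

```ini
[Unit]
Description=ClearML Agent Service (foreground variant, untested sketch)
# Assumption: the Docker daemon is provided by docker.service on this host.
After=docker.service
Wants=docker.service

[Service]
# Without --detached the agent does not fork, so systemd can track it directly.
Type=simple
User=ml-worker
WorkingDirectory=/home/ml-worker/clearml-agent-virtualenv
ExecStart=/home/ml-worker/clearml-agent-virtualenv/bin/clearml-agent daemon --queue default --gpus all
# No ExecStop: systemd sends SIGTERM to the main process by default.
Restart=on-failure
# Illustrative delay between restart attempts; tune as needed.
RestartSec=10
Environment="PATH=/home/ml-worker/clearml-agent-virtualenv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

[Install]
WantedBy=multi-user.target
```

After editing the unit, running sudo systemctl daemon-reload followed by sudo systemctl enable --now clearml-agent-gpu would reload it and start it at boot. journalctl -u clearml-agent-gpu shows whatever the service writes to its journal.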

systemctl output for sudo systemctl start clearml-agent-gpu followed by sudo systemctl status clearml-agent-gpu: [image]
