When executing tasks using the clearml-agent within a Docker container, we encounter a failure during operations that attempt to write to the .gitconfig file. Specifically, the command git config --global --replace-all safe.directory '*' fails with the error message could not write config file /root/.gitconfig: Device or resource busy. This issue persists even though manual tests for file access, read, and write operations to /root/.gitconfig succeed when performed within the container.
The failure to write to .gitconfig seems to occur only during the execution of automated tasks by clearml-agent, suggesting a possible issue with how file access or locking is managed in the context of Docker containers orchestrated by clearml-agent.
Steps to Reproduce
Execute a clearml-agent task within a Docker container that requires Git operations.
The task fails when the agent tries to mark all directories as safe globally by running git config --global --replace-all safe.directory '*'.
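To compare the agent's behavior with a manual run, the failing command can be tried in isolation with tracing enabled. The following is a minimal sketch, not part of the original report: `try_safe_directory` is a hypothetical helper name, and pointing `HOME` and `XDG_CONFIG_HOME` at a scratch directory is an assumption made so the example does not touch a real `.gitconfig`.

```shell
# Hypothetical helper: run the failing command against a chosen home directory.
try_safe_directory() {
    home_dir="${1:?usage: try_safe_directory <home-dir>}"
    # GIT_TRACE=1 makes git print its internal steps to stderr.
    # XDG_CONFIG_HOME is redirected so git writes to "$home_dir/.gitconfig".
    HOME="$home_dir" XDG_CONFIG_HOME="$home_dir/.config" GIT_TRACE=1 \
        git config --global --replace-all safe.directory '*'
}

if command -v git >/dev/null 2>&1; then
    scratch_home=$(mktemp -d)
    if try_safe_directory "$scratch_home"; then
        echo "write succeeded in $scratch_home"
    else
        echo "write failed in $scratch_home"
    fi
    rm -rf -- "$scratch_home"
fi
```

Inside the agent's container, calling `try_safe_directory /root` instead targets the same `/root/.gitconfig` that appears in the error message.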
Additional Context
We have enabled GIT_TRACE=1 for more detailed output on Git operations.
The issue appears to be related to the clearml-agent's interaction with the .gitconfig file within Docker containers, particularly concerning file locking or access permissions.
Deleting the vcs_cache directory allows the task to proceed successfully, suggesting the problem may be linked to the caching mechanism or file access within this cache.
This behavior raises concerns about potential issues with file locking, .gitconfig access, or interactions between Docker, the clearml-agent, and Git within the containerized environment.
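The cache-clearing workaround mentioned above can be sketched as a small helper. This is an illustration only: `clear_vcs_cache` is a hypothetical name, and the path `~/.clearml/vcs-cache` in the example call is an assumption about where the agent keeps its cache on the host; check the `vcs_cache` setting in your own agent configuration before using it.

```shell
# Hypothetical helper: remove a VCS cache directory if it exists.
clear_vcs_cache() {
    cache_dir="${1:?usage: clear_vcs_cache <dir>}"
    if [ ! -d "$cache_dir" ]; then
        echo "no cache directory at $cache_dir"
        return 0
    fi
    rm -rf -- "$cache_dir"
    echo "removed $cache_dir"
}

# Example call (assumed path, adjust to your setup):
# clear_vcs_cache "$HOME/.clearml/vcs-cache"
```

Removing the directory only works around the failure; it does not explain why the cached clone interferes with the write to .gitconfig.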
The agent is running on an EC2 instance and we are using environment variables to configure the agent:
export CLEARML_AGENT_GIT_USER=<user_name>
export CLEARML_AGENT_GIT_PASS=<github_pat_our_pat_token>
export CLEARML_EXTRA_PIP_INSTALL_FLAGS="--extra-index-url=https://<aws_account_id>.d.codeartifact.eu-central-1.amazonaws.com/pypi/st-python-packages/simple/"
export CLEARML_API_HOST="https://api.clearml.<address>.com"
export CLEARML_WEB_HOST="https://app.clearml.<address>.com"
export CLEARML_FILES_HOST="https://files.clearml.<address>"
export CLEARML_API_ACCESS_KEY=<access_key>
export CLEARML_API_SECRET_KEY=<secret_key>
export CLEARML_DEFAULT_OUTPUT_URI="s3://our_bucket"
export CLEARML_DOCKER_IMAGE="<aws_account_id>.dkr.ecr.eu-central-1.amazonaws.com/python-secure:3.10-slim"

# Collect all environment variables starting with CLEARML and join them with a comma
CLEARML_ENV_VARS=$(env | grep ^CLEARML | cut -d '=' -f 1 | tr '\n' ',' | sed 's/,$//')

# Set the CLEARML_AGENT_DOCKER_ARGS_HIDE_ENV variable with the collected names
export CLEARML_AGENT_DOCKER_ARGS_HIDE_ENV=$CLEARML_ENV_VARS
export CLEARML_WORKER_NAME=""
export CLEARML_WORKER_ID=""
export CLEARML_AGENT_EXTRA_DOCKER_ARGS=""
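To make the intent of the CLEARML_ENV_VARS pipeline easier to check, here is a small sketch that runs the same stages on fixed input instead of the live environment. The function name `join_clearml_names` and the sample values are made up for illustration. Note that `tr` needs its two arguments as separate words (`'\n'` and `','`).

```shell
# Hypothetical helper: read NAME=value lines on stdin and print the names
# that start with CLEARML, joined by commas.
join_clearml_names() {
    grep '^CLEARML' |   # keep only CLEARML* entries
    cut -d '=' -f 1 |   # drop the values, keep the names
    tr '\n' ',' |       # newline-separated -> comma-separated
    sed 's/,$//'        # strip the trailing comma
}

# Sample input standing in for the output of `env`.
names=$(printf '%s\n' \
    'CLEARML_API_HOST=https://example.invalid' \
    'PATH=/usr/bin' \
    'CLEARML_WEB_HOST=https://example.invalid' | join_clearml_names)
echo "$names"
```

A value that spans several lines in the real `env` output could confuse the line-based `grep`, so this approach assumes single-line values.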
We pass the PAT to the agent through the CLEARML_AGENT_GIT_PASS environment variable.
Potential Areas for Investigation
Interactions between Docker volume mounts (especially for .gitconfig and vcs_cache) and the clearml-agent's file handling.
How the clearml-agent manages Git configurations and operations within Docker containers, particularly regarding global settings and cached environments.
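One way to probe the volume-mount angle is to check whether /root/.gitconfig is itself a mount target inside the task container. The report does not confirm this, so treat it as a hypothesis: git config writes a lock file and renames it over the target, and rename() onto a mount point fails with "Device or resource busy", even though reading and writing the file in place still works. The sketch below uses a hypothetical helper, `is_mount_target`, that reads the Linux mountinfo table.

```shell
# Hypothetical helper: succeed if the given path is a mount point.
# The optional second argument (a mountinfo-format file) exists for testing.
is_mount_target() {
    path="$1"
    mountinfo="${2:-/proc/self/mountinfo}"
    # Field 5 of each mountinfo line is the mount point as seen by this process.
    awk -v p="$path" '$5 == p { found = 1 } END { exit !found }' "$mountinfo"
}

if is_mount_target /root/.gitconfig; then
    echo "/root/.gitconfig is a mount target"
else
    echo "/root/.gitconfig is not a mount target"
fi
```

If the check reports a mount target only during agent-run tasks and not during manual tests, that would fit the difference in behavior described above.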
Environment
python:3.10-slim