As a temporary workaround, I manually installed the older version of nvidia-container-toolkit on the broken runner i-0d3ed1ff3ccbeec77 with apt-get install nvidia-container-toolkit=1.16.2-1 nvidia-container-toolkit-base=1.16.2-1 to get the runner working again.
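For anyone who needs to repeat the workaround on another runner, here is a minimal sketch of the same downgrade. The package names and version come from the command above. The `DRY_RUN` switch, the function names, the `--allow-downgrades` flag, and the `apt-mark hold` step are my additions and were not part of the original fix; the hold is only there on the assumption that a later upgrade would otherwise pull the newer toolkit back in.

```shell
#!/usr/bin/env bash
# Sketch of the manual workaround: pin nvidia-container-toolkit to 1.16.2-1.
# TOOLKIT_VERSION is the version from the issue; DRY_RUN and the function
# names are illustrative and not part of any existing runner tooling.
TOOLKIT_VERSION="${TOOLKIT_VERSION:-1.16.2-1}"
DRY_RUN="${DRY_RUN:-1}"

run() {
  # In dry-run mode only print the command, otherwise execute it.
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

pin_nvidia_container_toolkit() {
  # Install the older toolkit packages (same versions as the manual fix).
  run sudo apt-get install -y --allow-downgrades \
    "nvidia-container-toolkit=${TOOLKIT_VERSION}" \
    "nvidia-container-toolkit-base=${TOOLKIT_VERSION}"
  # Assumption: hold the packages so a later upgrade does not undo the pin.
  run sudo apt-mark hold nvidia-container-toolkit nvidia-container-toolkit-base
}

pin_nvidia_container_toolkit
```

By default this only prints the commands. Running it with `DRY_RUN=0` on the affected runner would actually apply them.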
description
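When the AO tests run on H100 runners, the Linux test job does not behave consistently: some runs pass and others fail. The failures occur when the container runtime mode is auto-detected as 'legacy', as shown in the error below.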
Error Peak
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: error parsing IMEX info: unsupported IMEX channel value: all: unknown.
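To tell this failure apart from other container start errors, a runner health check could look for the IMEX message in the captured stderr. The helper below is a hypothetical sketch, not existing tooling; the function name is made up, and the match string is copied from the error above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when the text on stdin contains the
# IMEX parse error reported by nvidia-container-cli in this issue.
is_imex_channel_error() {
  grep -qF "error parsing IMEX info: unsupported IMEX channel value"
}

# Example: feed it the stderr captured from a failed `docker run`.
sample="nvidia-container-cli: error parsing IMEX info: unsupported IMEX channel value: all: unknown."
if printf '%s\n' "$sample" | is_imex_channel_error; then
  echo "IMEX channel error detected"
fi
```

Matching on a fixed substring keeps the check independent of the surrounding runc and shim wording, which may differ between Docker versions.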
example
AO test
failure example: https://github.com/pytorch/ao/actions/runs/13042725250/job/36387841392
success example: https://github.com/pytorch/ao/actions/runs/12999348107/job/36254475921