hami will not schedule Pods that are in Pending #812

Open
kebe7jun opened this issue Jan 15, 2025 · 1 comment
Labels
kind/bug Something isn't working

Comments

@kebe7jun

What happened:

After a node is rebooted, HAMi may be unable to schedule Pods because of the status the node reports. Even after I manually restart the scheduler and the device plugin, Pods that are already Pending are not picked up: I have to delete the Pending Pods before they can be scheduled again, otherwise they stay in the Pending state indefinitely.
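
For readers who need the manual workaround in the meantime, here is a minimal sketch of finding the Pods to delete. It is an illustration only, not part of HAMi: the function name `pending_pods` and the sample data are hypothetical, the input is assumed to have the shape of `kubectl get pods -A -o json` output, and it assumes the Pods are owned by a controller (for example a Deployment or Job) that recreates them after deletion.

```python
def pending_pods(pod_list):
    """Return (namespace, name) for every Pod whose status.phase is Pending.

    `pod_list` is assumed to look like the parsed output of
    `kubectl get pods -A -o json` (a dict with an "items" list).
    """
    result = []
    for pod in pod_list.get("items", []):
        if pod.get("status", {}).get("phase") == "Pending":
            meta = pod.get("metadata", {})
            result.append((meta.get("namespace", "default"), meta.get("name")))
    return result


# Hypothetical sample data; the Pod names are made up for illustration.
sample = {
    "items": [
        {"metadata": {"namespace": "default", "name": "gpu-job-1"},
         "status": {"phase": "Pending"}},
        {"metadata": {"namespace": "default", "name": "gpu-job-2"},
         "status": {"phase": "Running"}},
    ]
}

# Print the delete commands instead of running them, so they can be reviewed first.
for namespace, name in pending_pods(sample):
    print(f"kubectl delete pod -n {namespace} {name}")
```

Printing the commands rather than executing them keeps the sketch free of side effects. Note that this selects every Pending Pod, not only those waiting on HAMi, so the list should be checked before anything is deleted.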

What you expected to happen:

Once HAMi is healthy again, it should automatically retry scheduling the Pods that are Pending instead of leaving them in the Pending state.

How to reproduce it (as minimally and precisely as possible):

See above.

Anything else we need to know?:

  • The output of nvidia-smi -a on your host
  • Your docker or containerd configuration file (e.g., /etc/docker/daemon.json)
  • The hami-device-plugin container logs
  • The hami-scheduler container logs
  • The kubelet logs on the node (e.g., sudo journalctl -r -u kubelet)
  • Any relevant kernel output lines from dmesg

Environment:

  • HAMi version: v2.4.0
  • nvidia driver or other AI device driver version:
  • Docker version from docker version
  • Docker command, image and tag used
  • Kernel version from uname -a
  • Others:

@kebe7jun added the kind/bug label on Jan 15, 2025
@lengrongfu
Member

let me see
