@rpieczon just to clarify, are you saying that if any pod (even one unassociated with Akri) is unready, it causes this slot reconciliation error? From what I remember, slot reconciliation should only check pods with an expected annotation.
I lose track a little, but I think the annotations are on the container, not the pod ... and it might be that an unready pod is considered a potential place where an annotated container could eventually exist. It might be worth looking at the resource requests to limit where this early exit happens.
It might be hard to check for the resource, though. If the pod isn't ready and the container doesn't exist, there isn't much context to check the instances against.
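To make the resource-request idea concrete, here is a minimal sketch, not the agent's actual code. `PodInfo`, `ContainerInfo`, and `unready_pod_blocks_cleanup` are hypothetical stand-ins for the Kubernetes pod objects and the early-exit check, and the `akri.sh/` resource-name prefix is an assumption. It treats an unready container as a reason to postpone slot cleanup only when that container requests an Akri resource. The requests are assumed to come from the pod spec, so they would be known even before the container exists.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for one container of a pod.
#[derive(Debug, Clone)]
pub struct ContainerInfo {
    /// Assumed to be false when no container status has been reported yet.
    pub ready: bool,
    /// Resource limits by resource name, e.g. "akri.sh/akri-usb-abc123" -> "1".
    pub resource_limits: HashMap<String, String>,
}

/// Hypothetical, simplified stand-in for a pod scheduled on this node.
#[derive(Debug, Clone)]
pub struct PodInfo {
    pub name: String,
    pub containers: Vec<ContainerInfo>,
}

/// Assumed prefix of resource names advertised by Akri.
pub const AKRI_RESOURCE_PREFIX: &str = "akri.sh/";

/// Returns true if the container asks for at least one Akri resource.
fn requests_akri_resource(container: &ContainerInfo) -> bool {
    container
        .resource_limits
        .keys()
        .any(|name| name.starts_with(AKRI_RESOURCE_PREFIX))
}

/// Returns true only if some unready container could still be assigned
/// an Akri slot; unready containers with no Akri request are ignored.
pub fn unready_pod_blocks_cleanup(pods: &[PodInfo]) -> bool {
    pods.iter()
        .flat_map(|pod| pod.containers.iter())
        .any(|container| !container.ready && requests_akri_resource(container))
}
```

With a check like this, an unready pod that requests no Akri resources would not hold up reconciliation, while an unready broker pod that does request one still would.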
@rpieczon just to clarify, are you saying that if any pod (even one unassociated with Akri) is unready, it causes this slot reconciliation error? From what I remember, slot reconciliation should only check pods with an expected annotation.
Exactly. In my case I have a failing Prometheus pod that has no requirements related to USB allocation.
Describe the bug
The Akri agent DaemonSet keeps reporting the following error whenever any pod running on the cluster is not ready.
2023-11-16T13:44:46Z TRACE agent::util::slot_reconciliation] reconcile - Pods with unready Containers exist on this node, we can't clean the slots yet
In my case, the failing pod doesn't use USB resources.
Output of
kubectl get pods,akrii,akric -o wide
Kubernetes Version: [e.g. Native Kubernetes 1.19, MicroK8s 1.19, Minikube 1.19, K3s]
Expected behavior
I would expect the reconciliation process to continue if the failing pod is outside the USB usage/management context.