Installing Crane-scheduler via Helm as a second scheduler: pods from the official example are never scheduled and stay stuck in "Pending" #50
Comments
Please check the status of the crane-scheduler pods and confirm they are running:
kubectl get pods -n crane-system

I don't see anything abnormal in the logs.

I tested it; the default scheduler has no problem and schedules pods normally.

Could you post the complete logs, including those from crane-scheduler-controller-6987688d8d-6wr7c and crane-scheduler-b84489958-6jdj6?

I ran into the same problem on Kubernetes 1.27. Could someone please advise? Thanks.

It is most likely a compatibility issue with newer versions: clusters below 1.25 currently work fine, while newer clusters may need additional support.

OK, thanks.
My Kubernetes version is 1.20.7 and I'm using crane-scheduler image version 0.0.20 as a second scheduler. The nodes' annotations already contain the aggregated metrics. When I create a new pod to test scheduling, it stays in Pending state.
crane-scheduler logs:
crane-scheduler-controller logs:
It might be that leader election was not disabled for the second scheduler.
It was indeed caused by leader election not being disabled for the second scheduler. But it wasn't the leaderelection setting in scheduler-deployment.yaml; it was the leaderElection in scheduler-configmap.yaml that hadn't been turned off.
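For reference, a minimal sketch of the fix (assuming the ConfigMap layout pasted later in this thread; names may differ across Helm chart versions): in scheduler-configmap.yaml, the embedded KubeSchedulerConfiguration must disable leader election, otherwise the second scheduler blocks trying to acquire the lease already held by the default kube-scheduler.

  scheduler-config.yaml: |
    apiVersion: kubescheduler.config.k8s.io/v1beta2
    kind: KubeSchedulerConfiguration
    leaderElection:
      leaderElect: false    # must be false for a second scheduler
    profiles:
    - schedulerName: crane-scheduler

After editing the ConfigMap, restart the scheduler pod so the change is picked up, e.g. kubectl -n crane-system rollout restart deployment/crane-scheduler.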
My Kubernetes version is 1.22.12 and I'm using crane-scheduler image scheduler-0.2.2 as a second scheduler. The nodes' annotations already contain the aggregated metrics, and I have already changed leaderElection to false, but when I create a new pod to test scheduling, it still stays in Pending state.
Events:
  Type     Reason            Age   From             Message
  ----     ------            ----  ----             -------
  Warning  FailedScheduling  15s   crane-scheduler  0/1 nodes are available: 1 Insufficient cpu.
leaderElection config (pasted from kubectl edit):
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  scheduler-config.yaml: |
    apiVersion: kubescheduler.config.k8s.io/v1beta2
    kind: KubeSchedulerConfiguration
    leaderElection:
      leaderElect: false
    profiles:
    - schedulerName: crane-scheduler
      plugins:
        filter:
          enabled:
          - name: Dynamic
        score:
          enabled:
          - name: Dynamic
            weight: 3
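A note on the FailedScheduling event above: the warning is reported by crane-scheduler itself, which suggests the second scheduler is now evaluating the pod and the remaining problem is resources, i.e. the 1-CPU request does not fit in the single node's free allocatable CPU. The node's headroom can be inspected with (the node name placeholder is illustrative):

  kubectl describe node <node-name> | grep -A 10 "Allocated resources"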
crane-scheduler logs:
I1226 09:47:56.595597 1 serving.go:348] Generated self-signed cert in-memory
W1226 09:47:57.035592 1 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I1226 09:47:57.041561 1 server.go:139] "Starting Kubernetes Scheduler" version="v0.0.0-master+$Format:%H$"
I1226 09:47:57.044642 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1226 09:47:57.044658 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1226 09:47:57.044666 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I1226 09:47:57.044679 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1226 09:47:57.044699 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I1226 09:47:57.044715 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1226 09:47:57.045160 1 secure_serving.go:200] Serving securely on [::]:10259
I1226 09:47:57.045218 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I1226 09:47:57.145093 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1226 09:47:57.145152 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I1226 09:47:57.145100 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file

crane-scheduler-controller logs:
root@master:/home/ubuntu/kube-prometheus/manifests# kubectl logs -n crane-system crane-scheduler-controller-6f6b94c8f7-79vff
I1226 17:47:56.187263 1 server.go:61] Starting Controller version v0.0.0-master+$Format:%H$
I1226 17:47:56.188316 1 leaderelection.go:248] attempting to acquire leader lease crane-system/crane-scheduler-controller...
I1226 17:48:12.646241 1 leaderelection.go:258] successfully acquired lease crane-system/crane-scheduler-controller
I1226 17:48:12.747072 1 controller.go:72] Caches are synced for controller
I1226 17:48:12.747174 1 node.go:46] Start to reconcile node events
I1226 17:48:12.747208 1 event.go:30] Start to reconcile EVENT events
I1226 17:48:12.773420 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (26.154965ms)
I1226 17:48:12.794854 1 node.go:75] Finished syncing node event "master/cpu_usage_max_avg_1h" (21.278461ms)
I1226 17:48:12.818035 1 node.go:75] Finished syncing node event "master/cpu_usage_max_avg_1d" (23.146517ms)
I1226 17:48:12.837222 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (19.151134ms)
I1226 17:48:13.055018 1 node.go:75] Finished syncing node event "master/mem_usage_max_avg_1h" (217.762678ms)
I1226 17:48:13.455442 1 node.go:75] Finished syncing node event "master/mem_usage_max_avg_1d" (400.366453ms)
I1226 17:51:12.788539 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (41.092765ms)
I1226 17:51:12.810824 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (22.248821ms)
I1226 17:54:12.771140 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (22.840662ms)
I1226 17:54:12.789918 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (18.740179ms)
I1226 17:57:12.773735 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (26.395777ms)
I1226 17:57:12.792897 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (19.124323ms)
I1226 18:00:12.772243 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (24.369461ms)
I1226 18:00:12.804297 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (32.008004ms)
I1226 18:03:12.774690 1 node.go:75] Finished syncing node event "master/mem_usage_max_avg_1h" (27.291591ms)
I1226 18:03:12.795145 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (20.350165ms)
I1226 18:03:12.813508 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (18.32638ms)
I1226 18:03:12.833109 1 node.go:75] Finished syncing node event "master/cpu_usage_max_avg_1h" (19.549029ms)
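The controller logs above show the per-node metrics being synced; the results are written to the node annotations, which can be inspected directly. A quick check (node name from this thread; the annotation keys correspond to the metric names in the log lines, e.g. cpu_usage_avg_5m):

  kubectl describe node master | grep usage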
Installed Crane-scheduler via Helm as a second scheduler; testing with the official example, the pod is never scheduled and stays stuck in "Pending" state:
1. Deployment YAML:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-stress
spec:
  selector:
    matchLabels:
      app: cpu-stress
  replicas: 1
  template:
    metadata:
      labels:
        app: cpu-stress
    spec:
      schedulerName: crane-scheduler
      hostNetwork: true
      tolerations:
      - key: node.kubernetes.io/network-unavailable
        operator: Exists
        effect: NoSchedule
      containers:
      - name: stress
        image: docker.io/gocrane/stress:latest
        command: ["stress", "-c", "1"]
        resources:
          requests:
            memory: "1Gi"
            cpu: "1"
          limits:
            memory: "1Gi"
            cpu: "1"
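For anyone reproducing this, the manifest can be applied and the scheduling outcome inspected as follows (the file name is illustrative):

  kubectl apply -f cpu-stress.yaml
  kubectl describe pod -l app=cpu-stress    # the Events section shows whether crane-scheduler picked up the pod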
2. Pod details:
Name:           cpu-stress-cc8656b6c-b5hhz
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=cpu-stress
                pod-template-hash=cc8656b6c
Annotations:    <none>
Status:         Pending
IP:
IPs:
Controlled By:  ReplicaSet/cpu-stress-cc8656b6c
Containers:
  stress:
    Image:      docker.io/gocrane/stress:latest
    Port:       <none>
    Host Port:  <none>
    Command:
      stress
      -c
      1
    Limits:
      cpu:     1
      memory:  1Gi
    Requests:
      cpu:     1
      memory:  1Gi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9nwd5 (ro)
Volumes:
  kube-api-access-9nwd5:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Guaranteed
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
3. crane-scheduler logs:
I0824 00:50:47.247851 1 serving.go:331] Generated self-signed cert in-memory
W0824 00:50:48.025758 1 options.go:330] Neither --kubeconfig nor --master was specified. Using default API client. This might not work.
W0824 00:50:48.073470 1 authorization.go:47] Authorization is disabled
W0824 00:50:48.073495 1 authentication.go:40] Authentication is disabled
I0824 00:50:48.073517 1 deprecated_insecure_serving.go:51] Serving healthz insecurely on [::]:10251
I0824 00:50:48.080823 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0824 00:50:48.080862 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0824 00:50:48.080915 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0824 00:50:48.080927 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0824 00:50:48.080957 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0824 00:50:48.080968 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0824 00:50:48.081199 1 secure_serving.go:197] Serving securely on [::]:10259
I0824 00:50:48.081270 1 tlsconfig.go:240] Starting DynamicServingCertificateController
W0824 00:50:48.091287 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0824 00:50:48.146624 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
I0824 00:50:48.182865 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I0824 00:50:48.183903 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0824 00:50:48.184059 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0824 00:50:48.284088 1 leaderelection.go:243] attempting to acquire leader lease kube-system/kube-scheduler...
W0824 00:57:30.128689 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0824 01:02:45.130884 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0824 01:08:48.133483 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0824 01:14:31.135801 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0824 01:20:24.138959 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0824 01:30:10.141873 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
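The telling line above is the last informational one: the scheduler is stuck "attempting to acquire leader lease kube-system/kube-scheduler", a lease the default scheduler already holds, so it waits indefinitely instead of scheduling. This matches the leader-election diagnosis in the comments. Assuming the cluster uses Lease objects for the lock (the common default), the current holder can be checked with:

  kubectl -n kube-system get lease kube-scheduler -o jsonpath='{.spec.holderIdentity}'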
4. crane-scheduler-controller logs:
I0824 08:46:16.647776 1 server.go:61] Starting Controller version v0.0.0-master+$Format:%H$
I0824 08:46:16.648237 1 leaderelection.go:248] attempting to acquire leader lease crane-system/crane-scheduler-controller...
I0824 08:46:16.706891 1 leaderelection.go:258] successfully acquired lease crane-system/crane-scheduler-controller
I0824 08:46:16.807546 1 controller.go:72] Caches are synced for controller
I0824 08:46:16.807631 1 node.go:46] Start to reconcile node events
I0824 08:46:16.807653 1 event.go:30] Start to reconcile EVENT events
I0824 08:46:16.885698 1 node.go:75] Finished syncing node event "node6/cpu_usage_avg_5m" (77.952416ms)
I0824 08:46:16.973162 1 node.go:75] Finished syncing node event "node4/cpu_usage_avg_5m" (87.371252ms)
I0824 08:46:17.045250 1 node.go:75] Finished syncing node event "master2/cpu_usage_avg_5m" (72.023298ms)
I0824 08:46:17.109260 1 node.go:75] Finished syncing node event "master3/cpu_usage_avg_5m" (63.673389ms)
I0824 08:46:17.192332 1 node.go:75] Finished syncing node event "node1/cpu_usage_avg_5m" (83.005155ms)
I0824 08:46:17.529495 1 node.go:75] Finished syncing node event "node2/cpu_usage_avg_5m" (337.099052ms)
I0824 08:46:17.927163 1 node.go:75] Finished syncing node event "node3/cpu_usage_avg_5m" (397.603044ms)
I0824 08:46:18.327978 1 node.go:75] Finished syncing node event "node5/cpu_usage_avg_5m" (400.749476ms)
I0824 08:46:18.746391 1 node.go:75] Finished syncing node event "master1/cpu_usage_avg_5m" (418.360885ms)
I0824 08:46:19.129081 1 node.go:75] Finished syncing node event "node6/cpu_usage_max_avg_1h" (382.635495ms)
I0824 08:46:19.524508 1 node.go:75] Finished syncing node event "node4/cpu_usage_max_avg_1h" (395.361539ms)
I0824 08:46:19.948035 1 node.go:75] Finished syncing node event "master2/cpu_usage_max_avg_1h" (423.453672ms)
I0824 08:46:20.332014 1 node.go:75] Finished syncing node event "master3/cpu_usage_max_avg_1h" (383.909395ms)
I0824 08:46:20.737296 1 node.go:75] Finished syncing node event "node1/cpu_usage_max_avg_1h" (405.102002ms)
I0824 08:46:21.245055 1 node.go:75] Finished syncing node event "node2/cpu_usage_max_avg_1h" (507.697871ms)
I0824 08:46:21.573490 1 node.go:75] Finished syncing node event "node3/cpu_usage_max_avg_1h" (328.368489ms)
I0824 08:46:21.937814 1 node.go:75] Finished syncing node event "node5/cpu_usage_max_avg_1h" (364.254837ms)
I0824 08:46:22.335988 1 node.go:75] Finished syncing node event "master1/cpu_usage_max_avg_1h" (397.952357ms)
I0824 08:46:22.724851 1 node.go:75] Finished syncing node event "master2/cpu_usage_max_avg_1d" (388.771915ms)
I0824 08:46:23.126059 1 node.go:75] Finished syncing node event "master3/cpu_usage_max_avg_1d" (401.156708ms)
I0824 08:46:23.528329 1 node.go:75] Finished syncing node event "node6/cpu_usage_max_avg_1d" (402.208827ms)
I0824 08:46:23.937560 1 node.go:75] Finished syncing node event "node4/cpu_usage_max_avg_1d" (409.165081ms)
I0824 08:46:24.331730 1 node.go:75] Finished syncing node event "node5/cpu_usage_max_avg_1d" (394.024206ms)
I0824 08:46:24.730137 1 node.go:75] Finished syncing node event "master1/cpu_usage_max_avg_1d" (398.33551ms)
I0824 08:46:25.127074 1 node.go:75] Finished syncing node event "node1/cpu_usage_max_avg_1d" (396.798913ms)
I0824 08:46:25.528844 1 node.go:75] Finished syncing node event "node2/cpu_usage_max_avg_1d" (401.701104ms)
I0824 08:46:25.932684 1 node.go:75] Finished syncing node event "node3/cpu_usage_max_avg_1d" (403.762529ms)
I0824 08:46:26.330458 1 node.go:75] Finished syncing node event "node4/mem_usage_avg_5m" (397.710372ms)
I0824 08:46:26.736576 1 node.go:75] Finished syncing node event "master2/mem_usage_avg_5m" (406.060927ms)