
kubeadm function failed: unable to create ClusterRoleBinding: client rate limiter Wait returned an error: context deadline exceeded #678

Closed
kvaps opened this issue Jan 28, 2025 · 7 comments

Comments

@kvaps
Contributor

kvaps commented Jan 28, 2025

Just noticed that Kamaji stopped processing new clusters. The readiness probe reported OK, while the logs were looping over:

2025-01-28T19:19:59Z	INFO	soot_tenant-k8s2user19549_kubernetes-k8s2user19549.PhaseClusterAdminRBAC	reconciliation completed
2025-01-28T19:20:31Z	ERROR	soot_tenant-k8suser16735_kubernetes-1k8s16735	kubeadm function failed	{"controller": "clusterrolebinding", "controllerGroup": "rbac.authorization.k8s.io", "controllerKind": "ClusterRoleBinding", "ClusterRoleBinding": {"name":"kubeadm:get-nodes"}, "namespace": "", "name": "kubeadm:get-nodes", "reconcileID": "b7e33a3d-b36f-428b-94a8-05ba12a53383", "resource": "PhaseClusterAdminRBAC", "phase": "PhaseClusterAdminRBAC", "error": "unable to create ClusterRoleBinding: client rate limiter Wait returned an error: context deadline exceeded", "errorVerbose": "client rate limiter Wait returned an error: context deadline exceeded\nunable to create ClusterRoleBinding\nk8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.EnsureAdminClusterRoleBindingImpl.func1\n\t/go/pkg/mod/k8s.io/[email protected]/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:714\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/loop.go:87\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/loop.go:88\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextTimeout\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/poll.go:48\nk8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.EnsureAdminClusterRoleBindingImpl\n\t/go/pkg/mod/k8s.io/[email protected]/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:692\nk8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.EnsureAdminClusterRoleBinding\n\t/go/pkg/mod/k8s.io/[email 
protected]/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:652\ngithub.com/clastix/kamaji/internal/resources.(*KubeadmPhase).GetKubeadmFunction.func2\n\t/workspace/internal/resources/kubeadm_phases.go:167\ngithub.com/clastix/kamaji/internal/resources.KubeadmPhaseCreate\n\t/workspace/internal/resources/kubeadm_utils.go:151\ngithub.com/clastix/kamaji/internal/resources.(*KubeadmPhase).CreateOrUpdate\n\t/workspace/internal/resources/kubeadm_phases.go:241\ngithub.com/clastix/kamaji/internal/resources.createOrUpdate\n\t/workspace/internal/resources/resource.go:92\ngithub.com/clastix/kamaji/internal/resources.Handle\n\t/workspace/internal/resources/resource.go:67\ngithub.com/clastix/kamaji/controllers/soot/controllers.(*KubeadmPhase).Reconcile\n\t/workspace/controllers/soot/controllers/kubeadm_phase.go:42\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:303\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
github.com/clastix/kamaji/internal/resources.KubeadmPhaseCreate
	/workspace/internal/resources/kubeadm_utils.go:152
github.com/clastix/kamaji/internal/resources.(*KubeadmPhase).CreateOrUpdate
	/workspace/internal/resources/kubeadm_phases.go:241
github.com/clastix/kamaji/internal/resources.createOrUpdate
	/workspace/internal/resources/resource.go:92
github.com/clastix/kamaji/internal/resources.Handle
	/workspace/internal/resources/resource.go:67
github.com/clastix/kamaji/controllers/soot/controllers.(*KubeadmPhase).Reconcile
	/workspace/controllers/soot/controllers/kubeadm_phase.go:42
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:303
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224

@prometherion
Member

I saw a similar issue when the connectivity between the Kamaji Operator and the Tenant Control Planes was broken.

I know it could sound silly, but could you check that connectivity works as expected using the admin.svc kubeconfig key from the Kamaji Operator pod?
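A quick way to run that check could look like the following; the secret and namespace names here are hypothetical (the thread later mentions an `admin-kubeconfig` secret per tenant control plane, and the `admin.svc` key is the one referenced above):

```
# Hypothetical names: tenant control plane "k8suser16735" in namespace "tenant-k8suser16735"
kubectl get secret k8suser16735-admin-kubeconfig -n tenant-k8suser16735 \
  -o jsonpath='{.data.admin\.svc}' | base64 -d > /tmp/admin-svc.kubeconfig

# If connectivity is healthy, this should answer quickly with "ok"
kubectl --kubeconfig /tmp/admin-svc.kubeconfig get --raw='/healthz'
```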

Furthermore, which version or git commit SHA are you running?

@kvaps
Contributor Author

kvaps commented Jan 28, 2025

Hey @prometherion, thanks for the quick reply.

I just found that the Kubernetes api-server seems to have been stuck. From the logs it was trying to reach etcd, but didn't succeed for a long time.

Despite the fact that etcd was running, the api-server was not working, so it looks like a Kubernetes api-server issue.
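For cases like this, it can help to confirm etcd health independently of the api-server; a sketch, where the endpoint and PKI paths are assumptions and need adjusting to the actual deployment:

```
# Run from inside the etcd pod (endpoint and certificate paths are hypothetical)
etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/pki/ca.crt \
  --cert=/etc/etcd/pki/server.crt \
  --key=/etc/etcd/pki/server.key \
  endpoint health

# And check whether the api-server itself still answers, with per-check detail
kubectl get --raw='/readyz?verbose'
```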

@kvaps
Contributor Author

kvaps commented Jan 28, 2025

I just restarted the tenant Kubernetes control plane and everything started working.

@kvaps
Contributor Author

kvaps commented Jan 28, 2025

My bad, now I can see the same errors. I'm investigating it right now.

Kamaji was built from edge-24.9.2 using this Dockerfile:

https://github.com/aenix-io/cozystack/blob/e23286a336cf057f7c564c5938b61ae6e059a8ef/packages/system/kamaji/images/kamaji/Dockerfile#L4

@kvaps
Contributor Author

kvaps commented Jan 28, 2025

UPD: found it! The cluster was working with an old hostname.

There was a correct Ingress resource, but the super-admin.conf kubeconfig still contained the old hostname, and the api-server-certificate also contained the old hostname in its certificates.

I fixed that by removing the admin-kubeconfig and api-server-certificate secrets and the control-plane pods.
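Before deleting a serving certificate, it is worth confirming which hostnames it actually covers by inspecting its SANs with openssl. The snippet below is a self-contained illustration: it generates a throwaway certificate carrying a stale hostname in its SAN (a stand-in for the real api-server certificate) and then prints the SAN extension:

```shell
# Create a throwaway cert whose SAN carries a stale hostname (illustration only;
# on a real cluster you would extract the cert from the api-server-certificate secret)
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout /tmp/demo.key -out /tmp/demo.crt -days 1 \
  -subj "/CN=kube-apiserver" \
  -addext "subjectAltName=DNS:old-hostname.example.com"

# Print the SAN extension; a stale hostname here means the cert must be rotated
openssl x509 -in /tmp/demo.crt -noout -ext subjectAltName
```

The same `openssl x509` inspection works against the live certificate once it is base64-decoded out of its secret.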

It seems Kamaji should also check that kubeconfig files contain the correct hostname, as part of the #641 improvement.

@prometherion
Member

Yes, you're right, we fixed this in the latest releases.

If everything's fixed, may I ask you to close this?

@kvaps
Contributor Author

kvaps commented Jan 28, 2025

Sure, let's do that.
