[bug] pod crashed when creating init pod, new pod always fails because the init pod already exists #225

Open · styshoo opened this issue Dec 24, 2024 · 0 comments

Describe the bug:
The dynamic-localpv-provisioner pod crashed while it was creating an init pod (in this case, because it failed to renew its leader-election lease). A new provisioner pod then took over and continued creating the volume, but it always fails because the init pod already exists, so the PVC stays in Pending status indefinitely.

The dynamic-localpv-provisioner pods' status is shown below; one pod has restarted.

# kubectl get pod -A | grep openebs
kube-system   openebs-hostpath-localpv-provisioner-7f8d6c886-fm6qt       1/1     Running     1             12m
kube-system   openebs-hostpath-localpv-provisioner-7f8d6c886-cjhpk       1/1     Running     0             12m

The restarted pod's log is shown below: it failed to renew the lease and was then restarted.

I1218 02:42:43.386406       1 leaderelection.go:283] failed to renew lease kube-system/openebs.io-local: timed out waiting for the condition
F1218 02:42:43.386436       1 controller.go:889] leaderelection lost
E1218 02:42:43.386611       1 helper_hostpath.go:377] unable to delete the helper pod: client rate limiter Wait returned an error: context canceled
I1218 02:42:46.811347       1 provisioner_hostpath.go:92] Initialize volume pvc-a1f60f52-c3bd-45d6-a708-5a14be2520cc failed: Get "https://172.16.0.1:443/api/v1/namespaces/kube-system/pods/init-pvc-a1f60f52-c3bd-45d6-a708-5a14be2520cc": context canceled

The other dynamic-localpv-provisioner pod became the leader, but it failed to continue because the init pod it tries to create already exists:

W1218 02:43:03.897377       1 controller.go:937] Retrying syncing claim "a1f60f52-c3bd-45d6-a708-5a14be2520cc" because failures 0 < threshold 15
I1218 02:43:03.897476       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"kubegien-logging-system", Name:"elasticsearch-data-elasticsearch-data-2", UID:"a1f60f52-c3bd-45d6-a708-5a14be2520cc", APIVersion:"v1", ResourceVersion:"6470", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "local-path": pods "init-pvc-a1f60f52-c3bd-45d6-a708-5a14be2520cc" already exists
E1218 02:43:03.897782       1 controller.go:957] error syncing claim "a1f60f52-c3bd-45d6-a708-5a14be2520cc": failed to provision volume with StorageClass "local-path": pods "init-pvc-a1f60f52-c3bd-45d6-a708-5a14be2520cc" already exists
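
A possible manual workaround (untested; just a sketch based on the error above) is to delete the leftover init pod so that the new leader's retry can re-create it:

kubectl -n kube-system delete pod init-pvc-a1f60f52-c3bd-45d6-a708-5a14be2520cc

In theory the provisioner's retry loop should then succeed on the next attempt, but the underlying race is still there.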

Expected behaviour:
After the new provisioner pod starts and takes over leadership, it should be able to finish creating the volume without errors, for example by reusing or cleaning up the init pod left behind by the crashed leader.

Steps to reproduce the bug:

  1. Create a lot of OpenEBS HostPath PVCs (a rough reproduction sketch follows this list).
  2. Kill the dynamic-localpv-provisioner leader pod while it is creating init pods.
  3. The other dynamic-localpv-provisioner pod becomes the new leader.
  4. If some init pods already exist, the new leader dynamic-localpv-provisioner pod cannot finish creating the corresponding volumes.
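
A rough, hypothetical reproduction sketch for steps 1 and 2 (the StorageClass name local-path is taken from the logs above; the PVC names and count are made up, and Immediate volume binding is assumed; with WaitForFirstConsumer you would also need pods that consume the PVCs):

# Create many HostPath PVCs in a loop.
for i in $(seq 1 50); do
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: repro-pvc-$i
spec:
  storageClassName: local-path
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
EOF
done

# While the init pods are being created, force-delete the current leader pod.
kubectl -n kube-system delete pod <leader-provisioner-pod> --grace-period=0 --force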

The output of the following commands will help us better understand what's going on:

  • kubectl get pods -n <openebs_namespace> --show-labels
  • kubectl logs <upgrade_job_pod> -n <openebs_namespace>

Anything else we need to know?:

Environment details:

  • OpenEBS version (use kubectl get po -n openebs --show-labels): openebs.io/component-name=openebs-localpv-provisioner,openebs.io/version=4.1.1
  • Kubernetes version (use kubectl version): v1.28.13
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • kernel (e.g: uname -a):
  • others: