
Volume leaks when PVC is deleted while the associated PV is in the Terminating state #546

Closed
divyenpatel opened this issue Jan 11, 2021 · 19 comments · Fixed by #679 or kubernetes-sigs/sig-storage-lib-external-provisioner#117
Assignees
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@divyenpatel
Contributor

Steps

  1. Create a PVC and wait for it to get bound (a minimal sketch of pvc.yaml is shown after these steps).
# kubectl create -f pvc.yaml 
persistentvolumeclaim/example-vanilla-block-pvc created
# kubectl get pvc
NAME                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS               AGE
example-vanilla-block-pvc   Bound    pvc-6791fdd4-5fad-438e-a7fb-16410363e3da   5Gi        RWO            example-vanilla-block-sc   19s
# kubectl get pv pvc-6791fdd4-5fad-438e-a7fb-16410363e3da
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                               STORAGECLASS               REASON   AGE
pvc-6791fdd4-5fad-438e-a7fb-16410363e3da   5Gi        RWO            Delete           Bound    default/example-vanilla-block-pvc   example-vanilla-block-sc            23s
  2. Delete the PV.
# kubectl delete pv pvc-6791fdd4-5fad-438e-a7fb-16410363e3da
persistentvolume "pvc-6791fdd4-5fad-438e-a7fb-16410363e3da" deleted
^C

The PV cannot be deleted, so the kubectl delete pv command does not return to the prompt; I pressed Ctrl+C to exit.
I then checked the PV status: it had gone into the Terminating state.

# kubectl get pv pvc-6791fdd4-5fad-438e-a7fb-16410363e3da
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS        CLAIM                               STORAGECLASS               REASON   AGE
pvc-6791fdd4-5fad-438e-a7fb-16410363e3da   5Gi        RWO            Delete           Terminating   default/example-vanilla-block-pvc   example-vanilla-block-sc            2m23s
  3. While the PV is in the Terminating state, delete the PVC.
# kubectl delete pvc example-vanilla-block-pvc
persistentvolumeclaim "example-vanilla-block-pvc" deleted
  4. Both the PV and the PVC are deleted from the system.
# kubectl get pv pvc-6791fdd4-5fad-438e-a7fb-16410363e3da
Error from server (NotFound): persistentvolumes "pvc-6791fdd4-5fad-438e-a7fb-16410363e3da" not found
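
For reference, here is a minimal sketch of what the pvc.yaml used above might contain, reconstructed from the kubectl output (the actual manifest may differ; the StorageClass example-vanilla-block-sc is assumed to already exist):

```yaml
# Hypothetical reconstruction of pvc.yaml from the kubectl output above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-vanilla-block-pvc
spec:
  accessModes:
    - ReadWriteOnce              # the RWO access mode shown by `kubectl get pvc`
  resources:
    requests:
      storage: 5Gi               # the 5Gi capacity shown above
  storageClassName: example-vanilla-block-sc
```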

The controller never receives a call to delete the volume from the datastore, which results in a leaked volume on the datastore.

The above workflow was executed with the very latest external-provisioner, v2.1.0. The issue is also present in prior releases of the external-provisioner.

Filing this bug to discuss what fix we can make to prevent this orphaned volume on the system.

This issue was also discussed here - #195 (comment)

cc: @xing-yang @SandeepPissay

@xing-yang
Contributor

The controller is not getting calls to delete the volume from the datastore

@divyenpatel Which controller? Do you mean the CSI driver controller?

@pohly
Contributor

pohly commented Jan 12, 2021

My take on this: the system works as designed at the moment. Admins should only delete a PV if they know that the underlying volume in the storage system is gone.

The PV protection controller only prevents the PV object from being removed while a PVC still uses the PV. It has no knowledge of the underlying volume.
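
For context, this protection is implemented as a finalizer on the PV object: while a bound PVC exists, the kubernetes.io/pv-protection finalizer keeps the PV in Terminating instead of letting it be removed. A quick way to see it (output shape may vary by Kubernetes version):

```sh
# Show the finalizers on the PV; kubernetes.io/pv-protection is what holds
# the object in Terminating after `kubectl delete pv` is issued.
kubectl get pv pvc-6791fdd4-5fad-438e-a7fb-16410363e3da \
  -o jsonpath='{.metadata.finalizers}{"\n"}'
# ["kubernetes.io/pv-protection"]
```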

@divyenpatel
Contributor Author

The controller is not getting calls to delete the volume from the datastore

@divyenpatel Which controller? You meant the CSI Driver controller?

Yes

@xing-yang
Contributor

The reclaim policy on the PV is "Delete" here. If the volume is not deleted regardless of the reclaim policy, what is the difference between "Delete" and "Retain" in this case?
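
For readers following along, the policy in question is the persistentVolumeReclaimPolicy field on the PV spec, which dynamically provisioned PVs inherit from the StorageClass. A sketch of where it lives:

```yaml
# Where the reclaim policy lives. Dynamically provisioned PVs inherit it
# from the StorageClass's reclaimPolicy field, which defaults to Delete.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-6791fdd4-5fad-438e-a7fb-16410363e3da
spec:
  persistentVolumeReclaimPolicy: Delete  # "Retain" keeps the backing volume deliberately
  # remaining spec fields (capacity, accessModes, csi, ...) omitted
```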

@xing-yang
Contributor

Discussed in today's sig-storage meeting.

@wjun

wjun commented Feb 19, 2021

I have a customer who hit this issue in their production environment. My customer provides Kubernetes as a self-service to its own customers. In this service-provider model, whatever changes customers make to their clusters through valid Kubernetes CLIs/APIs, the service provider should not be left with potential billing issues (they count disk usage toward billing).

@xing-yang
Contributor

Discussed this with @jsafrane and @msau42. We could introduce an alpha feature in 1.22 and fix the code to always honor the deletion policy, while at the same time deprecating the existing behavior. After one year (4 releases), we can GA the feature and stop supporting the existing behavior.
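
Assuming a finalizer-based design along the lines of the linked fixes (#679 and kubernetes-sigs/sig-storage-lib-external-provisioner#117), the provisioner would put its own finalizer on PVs it provisions and release it only after the backing volume has actually been deleted, so deleting the PV object first could no longer strand the volume. Illustratively (the finalizer name below is an assumption based on those PRs, not something stated in this thread):

```yaml
# Sketch: with the fix, a dynamically provisioned PV carries a provisioner
# finalizer alongside pv-protection. It is removed only once CSI
# DeleteVolume has succeeded against the storage system.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-6791fdd4-5fad-438e-a7fb-16410363e3da
  finalizers:
    - kubernetes.io/pv-protection
    - external-provisioner.volume.kubernetes.io/finalizer  # assumed name, per the linked PRs
```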

@xing-yang xing-yang self-assigned this Mar 10, 2021
@xing-yang
Contributor

CC @deepakkinni

@deepakkinni
Member

/assign

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 12, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 11, 2021
@divyenpatel
Contributor Author

/remove-lifecycle stale

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:


/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@pohly
Contributor

pohly commented Sep 12, 2021

/reopen
/lifecycle frozen

@k8s-ci-robot
Contributor

@pohly: Reopened this issue.

In response to this:

/reopen
/lifecycle frozen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Sep 12, 2021
@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Sep 12, 2021
@pohly
Contributor

pohly commented Nov 11, 2021

/reopen

Not fixed in external-provisioner yet, only in sig-storage-lib-external-provisioner.

@k8s-ci-robot k8s-ci-robot reopened this Nov 11, 2021
@k8s-ci-robot
Contributor

@pohly: Reopened this issue.

In response to this:

/reopen

Not fixed in external-provisioner yet, only in sig-storage-lib-external-provisioner.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@igoratencompass

Interesting, I thought this was a safety feature so that you don't delete a PV by accident while it is still referenced by a PVC.
