Stray Clone Left in CephFS Filesystem When Deleting PVC During Cloning Process #4043
Labels
component/cephfs
Issues related to CephFS
Background
We are still struggling with Kasten.io exports: they use the old RW snapshot clone method, which times out on volumes of any significant size.
The trouble is not only that the backups fail, which forced us to put a workaround in place (that part is for Kasten to solve). Whenever the issue occurs, we are also left with a stray folder on CephFS. These folders eventually eat up our disk space, and they are difficult and a bit scary to clean up without a clear reference to where they came from.
We attribute this to the issue described below, and I'd love to hear whether anyone else has experienced it. If there's interest in working on this and you need me to reproduce it manually, I'm happy to share details.
Issue
When cloning a CephFS snapshot, if the PVC is deleted while the cloning process is still ongoing, a stray clone remains in the CephFS filesystem.
Affected Versions
Tested on ceph-csi 3.8
Steps to Reproduce
1. Initiate a clone from a snapshot of significant size in traditional RW mode.
2. Before the cloning process completes and the volume becomes available, delete the PVC.
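For anyone who wants to try the steps above, here is a minimal sketch of the kind of PVC that should trigger a full clone. It only builds the manifest as JSON. All names (`restore-test`, `big-snapshot`, `cephfs-sc`), the size, and the choice of `ReadWriteMany` as the "RW" access mode are assumptions for illustration, not values from our cluster.

```python
import json


def clone_pvc_manifest(name, snapshot_name, storage_class, size="100Gi"):
    """Build a PVC manifest that restores from a VolumeSnapshot.

    All argument values are placeholders; adjust them to your cluster.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            # Assumption: a writable access mode, so a full clone is made.
            "accessModes": ["ReadWriteMany"],
            "dataSource": {
                "apiGroup": "snapshot.storage.k8s.io",
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
            },
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical names, used only to show the shape of the manifest.
    manifest = clone_pvc_manifest("restore-test", "big-snapshot", "cephfs-sc")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped into `kubectl apply -f -`. Running `kubectl delete pvc restore-test` while the PVC is still `Pending` corresponds to the second step.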
Expected Behavior
The cloning process should either be interrupted and the clone should be removed from CephFS, or there should be a reference retained in Kubernetes.
Actual Behavior
The cloning process continues uninterrupted, resulting in a new folder appearing on CephFS. However, there is no reference to this folder in Kubernetes, either as a PV or as a PVC.
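One way to get a reference for such folders might be to compare the subvolumes CephFS knows about with the subvolumes that PVs point to. The sketch below is not part of ceph-csi. It assumes the subvolume listing is the JSON array of `{"name": ...}` objects printed by `ceph fs subvolume ls`, and that each CephFS PV carries a `subvolumeName` entry in `spec.csi.volumeAttributes`. The function name is hypothetical.

```python
import json


def find_unreferenced_subvolumes(subvolume_ls_output, pv_list_output):
    """Return subvolume names that no PV refers to, sorted by name.

    subvolume_ls_output: JSON text, a list of {"name": "<subvolume>"} objects.
    pv_list_output: JSON text in the shape of `kubectl get pv -o json`.
    """
    subvolumes = {entry["name"] for entry in json.loads(subvolume_ls_output)}

    referenced = set()
    for pv in json.loads(pv_list_output).get("items", []):
        csi = pv.get("spec", {}).get("csi") or {}
        # Assumption: ceph-csi stores the subvolume name in this attribute.
        name = (csi.get("volumeAttributes") or {}).get("subvolumeName")
        if name:
            referenced.add(name)

    return sorted(subvolumes - referenced)
```

The inputs could come from `ceph fs subvolume ls <fs name> --group_name csi --format json` (assuming the subvolume group is `csi`) and `kubectl get pv -o json`. The result is only a list of candidates: a subvolume that is still being provisioned has no PV yet either, so every entry should be checked by hand before anything is removed.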