vSphere 7.x no applicable volumesnapshotter found #8494

Open
r0k5t4r opened this issue Dec 6, 2024 Discussed in #8481 · 5 comments

@r0k5t4r

r0k5t4r commented Dec 6, 2024

Discussed in #8481

Originally posted by r0k5t4r December 4, 2024
Hi all,

I have the following set up:

vSphere 7
VMware ESXi 7.0 Update 3q
Rancher 2.9.2
K8s v1.28.10+rke2r1
Velero 1.15 (Helm)

Helm values:

backupStorageLocation:
- bucket: velero
  config:
    checksumAlgorithm: ""
    region: default
    s3ForcePathStyle: true
    s3Url: https://cephrgw.local
  name: default
  provider: aws
defaultVolumesToFsBackup: false
uploaderType: kopia
volumeSnapshotLocation:
- bucket: velero
  config:
    region: default
  name: default
  provider: aws
credentials:
  secretContents:
    cloud: |
      [default]
      aws_access_key_id = asdasdasdasdds
      aws_secret_access_key = asdasdasdasdasdsd
deployNodeAgent: true
features: EnableCSI
initContainers:
- image: velero/velero-plugin-for-aws:latest
  imagePullPolicy: Always
  name: velero-plugin-for-aws
  volumeMounts:
  - mountPath: /target
    name: plugins
snapshotsEnabled: true

Storage Class:

kubectl get sc
NAME                       PROVISIONER              RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
csisnaps-sc (default)      csi.vsphere.vmware.com   Delete          Immediate           true                   14h
vsphere-csi-sc (default)   csi.vsphere.vmware.com   Delete          Immediate           true                   15h

Restic/Kopia works fine. But shouldn't PV snapshots also work with this setup?

I followed the steps from this blog post and I can successfully create and restore volume snapshots:

https://cormachogan.com/2022/03/03/announcing-vsphere-csi-driver-v2-5-support-for-csi-snapshots/
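
For reference, the snapshot test from that post boils down to creating a VolumeSnapshot object; a minimal sketch, assuming the class and PVC names used below (the snapshot name is hypothetical):

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: csisnaps-test-snapshot   # hypothetical name for the test
spec:
  volumeSnapshotClassName: block-snapshotclass
  source:
    persistentVolumeClaimName: csisnaps-pvc-vsan-claim

If kubectl get volumesnapshot reports READYTOUSE as true, the CSI snapshot path itself is working.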

When I try to back up the pod and its PV, the backup finishes very quickly and I see this error in the logs:

time="2024-12-04T06:25:38Z" level=info msg="Summary for skipped PVs: [{\"name\":\"pvc-df83e1cd-673c-454d-9943-c48c6b65fbe3\",\"reasons\":[{\"approach\":\"podvolume\",\"reason\":\"opted out due to annotation in pod csisnaps-pod\"},{\"approach\":\"volumeSnapshot\",\"reason\":\"no applicable volumesnapshotter found\"}]}]" backup=velero/rancher-test-default logSource="pkg/backup/backup.go:542"

I have tried all sorts of things.

This is my snapshot class. I even added the label:

kubectl get volumesnapshotclass -o yaml
apiVersion: v1
items:
- apiVersion: snapshot.storage.k8s.io/v1
  deletionPolicy: Delete
  driver: csi.vsphere.vmware.com
  kind: VolumeSnapshotClass
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"snapshot.storage.k8s.io/v1","deletionPolicy":"Delete","driver":"csi.vsphere.vmware.com","kind":"VolumeSnapshotClass","metadata":{"annotations":{},"labels":{"velero.io/csi-volumesnapshot-class":"true"},"name":"block-snapshotclass"}}
    creationTimestamp: "2024-12-03T15:49:54Z"
    generation: 1
    labels:
      velero.io/csi-volumesnapshot-class: "true"
    name: block-snapshotclass
    resourceVersion: "20557"
    uid: 2ae336ab-3469-411d-ba05-fe1c81f9f718
kind: List
metadata:
  resourceVersion: ""

I also added the opt-in annotation to the PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csisnaps-pvc-vsan-claim
  annotations:
    backup.velero.io/backup-volumes: "true"
spec:
  storageClassName: vsphere-csi-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
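
For reference, Velero's file-system-backup opt-in is normally a pod annotation listing volume names rather than a boolean on the PVC; a sketch, assuming the pod csisnaps-pod mounts the claim as a volume named data (hypothetical volume name):

kubectl annotate pod csisnaps-pod backup.velero.io/backup-volumes=data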

Either I'm missing something, or volume snapshots are not supported in my setup.

Please help.

@Lyndon-Li
Contributor

CSI snapshots are replacing the native volume snapshotter, so for vSphere only CSI snapshots are supported.
Additionally, we recommend using CSI snapshots together with the data mover in a vSphere environment: there is a limit on the number of snapshots preserved locally for each volume (3 for vSphere 7.x), and the VM's performance will be affected if you keep local snapshots long-term.
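
A data mover backup can be requested per backup with the --snapshot-move-data flag; a minimal sketch (the backup name and namespace are hypothetical):

velero backup create dm-test --include-namespaces my-app --snapshot-move-data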

@blackpiglet
Contributor

blackpiglet commented Dec 9, 2024

Thanks for providing the detailed information, and sorry for the confusion and inconvenience.

First, if you are a vSphere customer, it's better to ask for help from VCF customer support.

Second, regarding your issue, we need to see the content of the PV, i.e. the PV's YAML, e.g. kubectl get pv "pvc-name" -o yaml.
It seems you only recently installed the CSI driver in the vSphere environment to enable the volume and snapshot functions.
It's possible some volumes were created before the CSI driver was enabled.
The CSI driver cannot handle such legacy volumes in the CSI way.
Velero therefore fell back to the default snapshot method, the Velero-native snapshot. That relies on a plugin provided by the cloud provider, which talks directly to the provider's snapshot API; for example, Velero has AWS, Azure, and GCP plugins, but the vSphere plugin doesn't work that way.
As a result, the native snapshot failed too.

If that is exactly what happened in your environment, I suggest using Restic/Kopia to back up the legacy volumes. The new CSI-compatible volumes can be backed up via the CSI driver.
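
For reference, file-system backup can also be forced for all volumes on a per-backup basis; a sketch (the backup name is hypothetical):

velero backup create legacy-fsb --default-volumes-to-fs-backup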

@r0k5t4r
Author

r0k5t4r commented Dec 16, 2024

First, if you are a vSphere customer, it's better to ask for help from VCF customer support.

Thanks for the quick response, and sorry for my late reply. I'm actually on leave at the moment, but I don't want to lose momentum. I didn't know that we could also troubleshoot this issue through VCF customer support. Thanks for the hint.

Back to the issue. Here is the YAML of the two PVs:

$ kubectl get pv | grep -i sonarqube
pvc-369cd6c2-95bb-43f0-b97c-12f847e44f28   5Gi         RWO            Delete           Bound    cicd-dev/sonarqube-sonarqube                        vsphere-csi-sc   <unset>                          20d
pvc-4867e606-2ae7-49ac-8f2d-402ae458a4b4   20Gi        RWO            Delete           Bound    cicd-dev/data-sonarqube-postgresql-0                vsphere-csi-sc   <unset>                          20d
$ kubectl get pv "pvc-4867e606-2ae7-49ac-8f2d-402ae458a4b4" -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: csi.vsphere.vmware.com
    volume.kubernetes.io/provisioner-deletion-secret-name: ""
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ""
  creationTimestamp: "2024-11-26T11:22:15Z"
  finalizers:
  - kubernetes.io/pv-protection
  - external-attacher/csi-vsphere-vmware-com
  name: pvc-4867e606-2ae7-49ac-8f2d-402ae458a4b4
  resourceVersion: "31579169"
  uid: b0a9fc4e-e0b3-4477-9235-5a30f97cd498
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 20Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: data-sonarqube-postgresql-0
    namespace: cicd-dev
    resourceVersion: "31579015"
    uid: 4867e606-2ae7-49ac-8f2d-402ae458a4b4
  csi:
    driver: csi.vsphere.vmware.com
    fsType: ext4
    volumeAttributes:
      storage.kubernetes.io/csiProvisionerIdentity: 1731312348652-9279-csi.vsphere.vmware.com
      type: vSphere CNS Block Volume
    volumeHandle: 2dd76a49-47ba-469f-b747-0dbfc452fdce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: vsphere-csi-sc
  volumeMode: Filesystem
status:
  lastPhaseTransitionTime: "2024-11-26T11:22:15Z"
  phase: Bound
$ kubectl get pv "pvc-369cd6c2-95bb-43f0-b97c-12f847e44f28" -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: csi.vsphere.vmware.com
    volume.kubernetes.io/provisioner-deletion-secret-name: ""
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ""
  creationTimestamp: "2024-11-26T11:22:15Z"
  finalizers:
  - kubernetes.io/pv-protection
  - external-attacher/csi-vsphere-vmware-com
  name: pvc-369cd6c2-95bb-43f0-b97c-12f847e44f28
  resourceVersion: "31579443"
  uid: ecc86e87-b6ed-4d68-965e-795fb964cdb6
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 5Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: sonarqube-sonarqube
    namespace: cicd-dev
    resourceVersion: "31579021"
    uid: 369cd6c2-95bb-43f0-b97c-12f847e44f28
  csi:
    driver: csi.vsphere.vmware.com
    fsType: ext4
    volumeAttributes:
      storage.kubernetes.io/csiProvisionerIdentity: 1731312348652-9279-csi.vsphere.vmware.com
      type: vSphere CNS Block Volume
    volumeHandle: 56f1c9d2-2828-4485-b850-8efee507c3d0
  persistentVolumeReclaimPolicy: Delete
  storageClassName: vsphere-csi-sc
  volumeMode: Filesystem
status:
  lastPhaseTransitionTime: "2024-11-26T11:22:15Z"
  phase: Bound

Currently I'm using Restic/Kopia again.

Actually, the volumes were created after I deployed the cluster with Rancher, but maybe it is because they were restored through Velero from a Restic backup?

Cheers,
Oliver

@blackpiglet
Contributor

File-system backup and restore should work in your scenario, and I also think CSI snapshot and data mover B/R should work for the PVs you posted, since they both have a CSI section. In the normal case, Velero shouldn't choose the Velero-native snapshot path.

If you want to debug why the reported error happened, please run the velero debug CLI to collect the debug bundle and upload it here so we can investigate further.
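
A sketch of that command, using the backup name from the log above:

velero debug --backup rancher-test-default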

@r0k5t4r
Author

r0k5t4r commented Dec 17, 2024

Thanks for the offer. I feel a bit uncomfortable uploading the debug bundle here, so I think I will rather raise a ticket with VCF customer support.

Cheers,
Oliver
