Restore of PV fails when CSI snapshot and Data-Move is in use #7975

edhunter665 · 2024-07-03T08:10:21Z

What steps did you take and what happened:
We use Velero to back up our Kubernetes clusters on vSphere with Tanzu. Persistent volumes are backed up with CSI snapshots, and the backups are moved to MinIO. The CSI driver is the vSphere CSI driver, csi.vsphere.vmware.com.
We use the StorageClasses provided by the Supervisor cluster. Backups work fine in every case.
When the StorageClass bindingMode is set to Immediate, we cannot restore a backup that was taken with Data Move. The restore fails because the PVs cannot be restored, with the error "claim Selector is not supported".
The same restore works with a StorageClass whose bindingMode is set to WaitForFirstConsumer.

When the backup is taken without Data Move, or as a file-based backup, the restore also works fine.

What did you expect to happen:

A successful restore when using Data Move and a StorageClass with bindingMode Immediate.

StorageClasses:

Error on PV creation:

Environment:

Velero version (use velero version): 1.14
Kubernetes version (use kubectl version): 1.27, 1.28

Lyndon-Li · 2024-07-03T10:39:25Z

> the PVs cannot be restored because of "claim Selector is not supported".

This is expected, normal behavior for a data mover restore, so it is not the cause of the restore failure.

Lyndon-Li · 2024-07-03T10:40:02Z

@edhunter665 To troubleshoot further, please share the Velero debug bundle generated by running velero debug.

Lyndon-Li · 2024-07-03T10:44:24Z

@edhunter665

> We are using the StorageClasses coming from Supervisor cluster

Are you using Velero data mover to back up the Supervisor cluster?

edhunter665 · 2024-07-03T12:15:58Z

> @edhunter665 In order to further troubleshoot, please share the velero bundle by running velero debug

bundle-2024-07-03-13-53-29_2.zip

edhunter665 · 2024-07-03T12:17:06Z

> @edhunter665
>
> > We are using the StorageClasses coming from Supervisor cluster
>
> Are you using Velero data mover to backup the Supervisor cluster?

No, the Velero data mover is used for backups of the Tanzu guest cluster.

Lyndon-Li · 2024-07-04T00:46:21Z

"message": "found a dataupload velero/restore01-sdcgl with expose error: Pod is unschedulable: 0/4 nodes are available: persistentvolumeclaim \"restore01-sdcgl\" not found. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling... mark it as cancel"

This should be the same issue as #7898. The fix will ship in 1.14.1; until it is available, you can use 1.13.x.
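For anyone checking their own debug bundle for the same symptom, here is a minimal sketch that looks for the message quoted above and pulls out the name of the PVC that was not found. The function name `missing_expose_pvc` and the regular expression are illustrative assumptions based only on that one log line; they are not part of Velero, and the wording of the message may differ between versions.

```python
import re
from typing import Optional

# Assumption: the pattern is derived from the single log message quoted in this
# thread. The optional backslashes cover JSON-escaped quotes (\"name\").
_EXPOSE_PVC_MISSING = re.compile(
    r'expose error: Pod is unschedulable: .*'
    r'persistentvolumeclaim \\?"(?P<pvc>[^"\\]+)\\?" not found'
)


def missing_expose_pvc(message: str) -> Optional[str]:
    """Return the PVC name if the message reports an expose error caused by
    a PVC that was not found, otherwise None. Hypothetical helper."""
    match = _EXPOSE_PVC_MISSING.search(message)
    return match.group("pvc") if match else None


if __name__ == "__main__":
    sample = (
        'found a dataupload velero/restore01-sdcgl with expose error: '
        'Pod is unschedulable: 0/4 nodes are available: '
        'persistentvolumeclaim \\"restore01-sdcgl\\" not found.'
    )
    print(missing_expose_pvc(sample))
```

A non-None result only means the log line has the same shape as the one in this issue; whether it is really the problem tracked in #7898 still needs to be confirmed against that issue.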

reasonerjt · 2024-07-08T06:12:58Z

Closing as a duplicate of #7898; it will be fixed in v1.14.1.

Lyndon-Li added Area/Cloud/vSphere area/datamover labels Jul 3, 2024

reasonerjt assigned Lyndon-Li Jul 5, 2024

reasonerjt closed this as completed Jul 8, 2024
