Restore of PV fails when CSI snapshot and Data-Move is in use #7975

Closed
edhunter665 opened this issue Jul 3, 2024 · 7 comments

Comments

@edhunter665

What steps did you take and what happened:
We are using Velero to back up our Kubernetes clusters on vSphere with Tanzu. Persistent volumes are backed up with CSI snapshots, and the snapshot data is then moved to MinIO. The CSI driver is the vSphere CSI driver, csi.vsphere.vmware.com.
We are using the StorageClasses coming from Supervisor cluster. Backups work in every case.
When the StorageClass has volumeBindingMode set to Immediate, we cannot restore from a backup that was taken with the data mover. The restore fails because the PVs cannot be restored because of "claim Selector is not supported".
The same restore works with a StorageClass whose volumeBindingMode is set to WaitForFirstConsumer.

When the backup is taken without the data mover, or as a file-based backup, the restore also works.
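
To check which StorageClasses fall into the failing case described above, a minimal Python sketch follows. It assumes the StorageClass manifests are already loaded as dictionaries (for example from exported JSON); the helper name and the example class names are hypothetical and not part of Velero or the vSphere CSI driver. Kubernetes treats an omitted volumeBindingMode as Immediate, so the sketch does the same.

```python
# Hypothetical helper for illustration; not part of Velero or the CSI driver.
def immediate_binding_classes(storage_classes):
    """Return the names of StorageClasses that bind volumes immediately."""
    names = []
    for sc in storage_classes:
        # Kubernetes defaults volumeBindingMode to Immediate when it is omitted.
        mode = sc.get("volumeBindingMode", "Immediate")
        if mode == "Immediate":
            names.append(sc["metadata"]["name"])
    return names


# Example manifests; the names are made up for this sketch.
example_classes = [
    {
        "metadata": {"name": "sc-immediate"},
        "provisioner": "csi.vsphere.vmware.com",
        "volumeBindingMode": "Immediate",
    },
    {
        "metadata": {"name": "sc-late-binding"},
        "provisioner": "csi.vsphere.vmware.com",
        "volumeBindingMode": "WaitForFirstConsumer",
    },
    {
        "metadata": {"name": "sc-default"},
        "provisioner": "csi.vsphere.vmware.com",
    },
]

print(immediate_binding_classes(example_classes))
```

Volumes provisioned from the classes this returns are the ones that hit the restore failure in our setup; classes using WaitForFirstConsumer restore fine.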

What did you expect to happen:

A successful restore when using the data mover and a StorageClass with volumeBindingMode set to Immediate.

StorageClasses:
image

Error on PV creation:
image

Environment:

  • Velero version (use velero version): 1.14
  • Kubernetes version (use kubectl version): 1.27, 1.28

@Lyndon-Li
Contributor

> the PVs cannot be restored because of "claim Selector is not supported".

This message is expected, normal behavior during a data mover restore, so it is not the cause of the restore failure.

@Lyndon-Li
Contributor

@edhunter665 In order to further troubleshoot, please share the velero bundle by running velero debug

@Lyndon-Li
Contributor

@edhunter665

> We are using the StorageClasses coming from Supervisor cluster

Are you using the Velero data mover to back up the Supervisor cluster?

@edhunter665
Author

> @edhunter665 In order to further troubleshoot, please share the velero bundle by running velero debug

bundle-2024-07-03-13-53-29_2.zip

@edhunter665
Author

> @edhunter665
>
> > We are using the StorageClasses coming from Supervisor cluster
>
> Are you using the Velero data mover to back up the Supervisor cluster?

No, the Velero data mover is used for backups of the Tanzu guest cluster.

@Lyndon-Li
Contributor

"message": "found a dataupload velero/restore01-sdcgl with expose error: Pod is unschedulable: 0/4 nodes are available: persistentvolumeclaim \"restore01-sdcgl\" not found. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling... mark it as cancel"

This should be the same issue as #7898. The fix will be in 1.14.1; until that release is available, you can use 1.13.x.
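
To check other messages from a debug bundle for the same symptom, here is a rough Python sketch. The pattern is an assumption derived only from the single message quoted above, and the function name is hypothetical; it is not a Velero API and may not match how other versions word the error.

```python
import re

# Assumed pattern, based only on the one message quoted in this thread.
# The optional backslashes allow for JSON-escaped quotes around the PVC name.
_PVC_NOT_FOUND = re.compile(r'persistentvolumeclaim \\?"([^"\\]+)\\?" not found')


def missing_pvc_from_expose_error(message):
    """Return the PVC name if the message looks like the expose failure, else None."""
    if "expose error" not in message or "unschedulable" not in message:
        return None
    match = _PVC_NOT_FOUND.search(message)
    return match.group(1) if match else None


sample = (
    "found a dataupload velero/restore01-sdcgl with expose error: "
    "Pod is unschedulable: 0/4 nodes are available: "
    'persistentvolumeclaim "restore01-sdcgl" not found.'
)
print(missing_pvc_from_expose_error(sample))
```

A match only suggests the same symptom; whether it is really the same bug as #7898 still has to be confirmed from the rest of the bundle.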

@reasonerjt
Contributor

Closing as a duplicate of #7898; it will be fixed in v1.14.1.
