Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add design for VolumeGroupSnapshot support #8778

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

shubham-pampattiwar
Copy link
Collaborator

Thank you for contributing to Velero!

Please add a summary of your change

Add design for VolumeGroupSnapshot support

This design extends Velero to support grouped snapshots using the Kubernetes VolumeGroupSnapshot (VGS) API. It allows users to group PVCs by specifying a label key, ensuring that all related PVCs are snapshotted together for data consistency. The design updates the existing backup/restore workflows to handle VGS objects, and updates/adds new backup and restore plugins as well.

Does your change fix a particular issue?

Design for #8447

Please indicate you've done the following:

Copy link

codecov bot commented Mar 12, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 59.55%. Comparing base (bea46e3) to head (e9f23a3).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8778      +/-   ##
==========================================
+ Coverage   59.54%   59.55%   +0.01%     
==========================================
  Files         370      370              
  Lines       40239    40239              
==========================================
+ Hits        23961    23966       +5     
+ Misses      14778    14773       -5     
  Partials     1500     1500              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

![VGS-Backup-Sequence](VGS-Backup-Sequence.png)


Restore workflow:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, this is not required by data mover.
Data mover's Generic Restore process is neutral to how the backup goes.

Let's clarify this.

@Lyndon-Li
Copy link
Contributor

@shubham-pampattiwar
I made some comments.
I think the major gap is the data mover part, others basically look good to me.

I will also think more about this part later.

- The VGS BIA plugin will add the related VS as additional items later.
- For datamover case:
- We need to bypass creation of VS but need to create DataUploads for the VS instance that got created due to VGS creation.
- Fetch the correct VS instances that got created via VGS and is related to the current PVC.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add: wait for appearance of snapshotHandle

- Create VS and follow the existing legacy workflow.


#### Add a new VolumeGroupSnapshot(VGS) BIA plugin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I further considered this, and I think we don't need to backup the VGS/VGSC/VGSClass objects at all, we only need to create VGS to make sure the snapshots are created at the same time, and after that, the will be out of use, especially, during restore.

During restore, we will recreate the VS/VSC, and the most essential information is the snapshot handle. We don't need VGS/VGSC/VGSClass objects.

Copy link
Contributor

@blackpiglet blackpiglet Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to set up a VolumeGroupSnapshot-enabled environment, but I haven't managed to do that in a GKE cluster yet. I will continue to do that later.

As a result, I'm still not sure how the VolumeGroupSnapshot works.
If the VolumeGroupSnapshot feature is enabled in a cluster, are the VolumeSnapshot-related resources still needed to work?
A concrete scenario is, say the user created a VolumeGroupSnapshot, and it referenced an existing VolumeGroupSnapshotClass, then the VolumeSnapshots should be created accordingly, do the VolumeSnapshots still need to reference a VolumeSnapshotClass, or the VolumeSnapshotClass parameter can be nil for the VolumeSnapshots?

This difference is related to how we should handle the VolumeGroupSnapshotClass, if the VolumeSnapshotClass is used for the VolumeSnapshot and VolumeSnapshotContent creation, then we don't need the VolumeGroupSnapshotClass, if not, then VolumeGroupSnapshotClass is still needed in restore.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blackpiglet for restore we can directly use the VG as a source. https://kubernetes.io/blog/2024/12/18/kubernetes-1-32-volume-group-snapshot-beta/#how-to-use-group-snapshot-for-restore-in-kubernetes
@Lyndon-Li So use VGS only for co-ordinating VS/VSC creation in the backup workflow, thats it ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So use VGS only for co-ordinating VS/VSC creation in the backup workflow, thats it ?

Yes, correct

Copy link
Contributor

@Lyndon-Li Lyndon-Li Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for restore we can directly use the VG as a source.

What is the benefit to do this in restore to Velero/Velero users?
As far as I can come out, it simplifies users' operation when there are many volumes in the VG. However, as a backup solution Velero has already automatically managed volumes->volume snapshots->volume snapshot restores. So this won't help Velero.

In the other way, if we get rid of VGS in restore, we will not need to restore VGS objects to users' env, which otherwise may make chaos and confuse users when an accident happens.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shubham-pampattiwar "for restore we can directly use the VG as a source." -- I think you meant VS right?

We don't have much choice here. The API doesn't allow for using the VGS for restore as I understand it -- VGS facilitates creating individual VolumeSnapshots, but when restoring, there's no choice but to individually create PVCs with a VolumeSnapshot source.

Copy link
Contributor

@blackpiglet blackpiglet Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shubham-pampattiwar
I remember the external-snapshotter requires VolumeSnapshot or the VolumeSnapshotContent's referencing VolumeSnapshotClass existing the cluster. That's why the CSI plugin syncs the VolumeSnapshotClass alongside the CSI backup.

@sseago
Copy link
Collaborator

sseago commented Mar 24, 2025

Oh, I see now. I think the "PVC in multiple groups" confusion comes from one of the examples trying to do that. But since we're using a single label to identify group membership, this isn't possible, since you can't have 2 labels on a resource with the same key. We probably just need to update the example to have each PVC having only one label or another.

Since the whole point of volume groups is "these volumes should always be snapshotted together", having a PVC in multiple groups doesn't really make sense anyway. If you need more PVCs together, you'd just put them all in the same group.

@shubham-pampattiwar
Copy link
Collaborator Author

@Lyndon-Li @sseago @blackpiglet Updated the PR to reflect the latest changes from the PR comments/discussions. PTAL, Thank you !

along with the label used for the VGS (we will be creating VGS with backup UUID and user specified VGS group label) and see if the PVC is part of any of the existing VGS for the particular backup operation we are currently in.
- If not then create the VGS object with the required labels (in this case backup UUID and the user specified VGS group label) and also add the PVC names (that are included in the VGS) as
annotations.
- Once VGS is created, wait for VGS object to be `readyToUse`, it implies that all the related VS/VSC objects are ready to use too.
Copy link
Contributor

@Lyndon-Li Lyndon-Li Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed previously, we don't wait the VS/VSC to get readyToUse in the plugin, so we cannot wait for VGS object to be readyToUse here.

Maybe we just wait for appearance of the VS/VSC, then we can deliver the processing to VS/VSC action.

The diagram also needs to be modified.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, updated the PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shubham-pampattiwar I see the statement of "Once VGS is created, wait for VGS object to be `readyToUse" is still there, could you double check?

- Add the DataUpload as an additional item.
- PVC has VGS label and VGS already exists:
- In this case we do not want to re-create VGS object.
- The required VS/DU for the PVC already exists as VGS instance exists for the PVC and backup UUID.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add: wait for appearance of snapshotHandle

We still need to wait for the VSC in the right status before moving forwad

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats already present in current DM flow of CSI plugin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That depends on how we are going to delete the VGS/VGSC, if we do it in the PVC BIA, this is a prerequisite for the next step --- delete the VGS/VGSC. So we must clarify it in the steps.
See comment #8778 (comment)

- Create VS and follow the existing legacy workflow.

#### Add a new VolumeGroupSnapshot(VGS) BIA plugin
- Cleanup the VGS object, here we need to be careful, we want to delete VGS and VGSC objects but keep the VS and VSC objects for further processing of our backup.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must wait for VS/VSC are provisioned before deleting the VGS/VGSC.
But I am not sure what does "provisioned" mean exactly, e.g., is it safe to delete VGS/VGSC immediately after we see the VS/VSC objects?

We may need a POC here.

flowchart TD
subgraph ItemBlock
P1[Pod1]
PVC1[PVC1 group: A]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one nuance I could not find in the doc.
Grouping can only be done at driver type.
Say PVC1,2,3 are all in Group A
But PVC 1,2 are driver D1 and 3 is driver D2

We can't group these 3 PVCs under one VGS.
As per my understanding each driver needs to impl the VGS API.
Or is VGS agnostic of driver and will call into different drivers for different snapshots.

This has further design implication IMO.

Let me know if this is already covered somewhere and I missed it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anshulahuja98 yes I think you are correct, you cannot mix and match PVCs for a particular VGS instance. It depends on the VGSClass specified in the VGS object spec.
Shall I just add a note regarding this caveat in the design ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might require a design enhancement instead of just a note.

you can take this 2 ways, 1 way is to assume customer grouped the volumes properly per each driver and if you encounter different drivers you fail the VGS and fallback to normal 1 snapshot at a time.

2nd way is during VGS creation create different VGS for each driver. For Group A, create 2 VGS.

Open to other approaches.

Sideline -
We should also add a "note" for scenarios where all drivers don't support VGS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think velero should create VGS per driver, and label it such that more than one VGS can be listed by velero for a PVC group for restore.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaovilai so if 4 PVCs had the same label, 2 for driver1 and 2 for driver2, we'd have two different VGS created? I wonder how that works on the VGS side -- if you have 2 VGS created with the same label selector. I'm guessing they'd each have a different VolumeGroupSnapshotClass? And does the VGS controller automatically filter on driver when creating the volume snapshots?

Signed-off-by: Shubham Pampattiwar <[email protected]>

add changelog file

Signed-off-by: Shubham Pampattiwar <[email protected]>

fix codespell checks

Signed-off-by: Shubham Pampattiwar <[email protected]>

address PR feedback: add itemblock:VGS digrams and extra notes for clarification

Signed-off-by: Shubham Pampattiwar <[email protected]>

update backup workflow

Signed-off-by: Shubham Pampattiwar <[email protected]>
Signed-off-by: Shubham Pampattiwar <[email protected]>
Signed-off-by: Shubham Pampattiwar <[email protected]>
Signed-off-by: Shubham Pampattiwar <[email protected]>
Signed-off-by: Shubham Pampattiwar <[email protected]>
@shubham-pampattiwar
Copy link
Collaborator Author

shubham-pampattiwar commented Mar 28, 2025

@Lyndon-Li @blackpiglet @sseago I updated the PR with following changes:

  • Add VGS object to itemToUpdate instead of additionalItems, so that VGS BIA executes in backup finalize phase
  • Add VGS BIA which takes care of all the VGS related objects cleanup
  • Add VGA RIA to skip VGS object restore
  • Modify the CleanupVolumeSnapshot func to skip VS deletion that are cerated via VGS object as the cleanup is delegated to VGS BIA for such objects.

Signed-off-by: Shubham Pampattiwar <[email protected]>
along with the label used for the VGS (we will be creating VGS with backup UUID and user specified VGS group label) and see if the PVC is part of any of the existing VGS for the particular backup operation we are currently in.
- If not then create the VGS object with the required labels (in this case backup UUID and the user specified VGS group label).
- Once VGS is created, we just wait for the related VS/VSC objects to exist and not wait for them to be `readyToUse`.
- Add the VGS object to the list of itemToUpdate, so that its BIA plugin runs in finalize backup phase. (we will run all the cleanup tasks in this plugin)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the previous agreements, the VGS/VGSC will be deleted in the PVC BIA since we will not use it after the VS/VSC is created?
Why does this change happen? cc @sseago

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with deleting the VGS in the PVC BIA is that this means we end up deleting it before the other PVC BIAs for the same volume group even run -- this means we end up deleting the other volume snapshots before they get backed up (for non-datamover) or before data movement occurs. I think this will end up breaking things. If we instead delete the VGS during finalize phase (and allow this to cascade deletion of related volume snapshots), then everything should just work

- Add the DataUpload as an additional item.
- PVC has VGS label and VGS already exists:
- In this case we do not want to re-create VGS object.
- The required VS/DU for the PVC already exists as VGS instance exists for the PVC and backup UUID.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That depends on how we are going to delete the VGS/VGSC, if we do it in the PVC BIA, this is a prerequisite for the next step --- delete the VGS/VGSC. So we must clarify it in the steps.
See comment #8778 (comment)

- We need to modify this function skip deletion of VS objects that were created via VGS objects.

#### Add a new VolumeGroupSnapshot(VGS) BIA plugin
- This plugin will run in backup finalize phase.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are introducing a special BIA plugin that runs only in finalizing phase. This makes the workflow complex.
This comes back to the first question --- can we delete the VGS/VGSC in PVC BIA at the first place? If we can do that, we will keep the followed logics as is without needing to introduce this complexity.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we delete it in the PVC BIA, then we delete it during the first PVC backup in the set of PVCs in the volume group. This would end up deleting the other volume snapshots before we processed them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if we delete the VGS during the first PVC backup, then when the second PVC in the volume group is backed up, the plugin will not find a VGS, so it will create another. If we have 3 PVCs in the VG, we'll end up creating 3 VGSs and also 3 VSs for each PVC.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a single VGS delete in finalize seems like the simplest way to handle this, since the only logic change required is that we don't delete the VS earlier if it belongs to a VGS, and then the single finalize VGS delete will delete VGS and related VS resources with a single call. This avoids the complexity of working through the interdependencies of the VGS, VG, and PVC resources -- specifically, we need to delete VGS before deleting VS, but we can't delete VGS before all PVCs in the VG are handled -- which means we can't delete it in the same BIA execution that creates the VGS. Doing it during finalize in a single step seems by far the simplest way to do it in terms of workflow logic.

@Lyndon-Li
Copy link
Contributor

Lyndon-Li commented Mar 28, 2025

@Lyndon-Li @blackpiglet @sseago I updated the PR with following changes:

  • Add VGS object to itemToUpdate instead of additionalItems, so that VGS BIA executes in backup finalize phase
  • Add VGS BIA which takes care of all the VGS related objects cleanup
  • Add VGA RIA to skip VGS object restore
  • Modify the CleanupVolumeSnapshot func to skip VS deletion that are cerated via VGS object as the cleanup is delegated to VGS BIA for such objects.

@shubham-pampattiwar
I am somehow lost with these new changes. I don't understand why we delegate the cleanup in the finalizing phase, instead of doing it in place in PVC BIA.
I think this will make the entire workflow much more complex, so please clarify what you've found since the last agreement.

@shubham-pampattiwar
Copy link
Collaborator Author

@Lyndon-Li If we delete VGS then it will delete the related VS instances too (And we need them). Thats why we need to postpone the VGS cleanup in finalize phase via VGS BIA.

@Lyndon-Li
Copy link
Contributor

@Lyndon-Li If we delete VGS then it will delete the related VS instances too (And we need them). Thats why we need to postpone the VGS cleanup in finalize phase via VGS BIA.

As far as I understand from the previous discussion, it is feasible that we set the VGSC as Retained and then delete VGS/VGSC without affecting the VS/VSC. Is there any problem for this way?

@shubham-pampattiwar
Copy link
Collaborator Author

@Lyndon-Li If we delete VGS then it will delete the related VS instances too (And we need them). Thats why we need to postpone the VGS cleanup in finalize phase via VGS BIA.

As far as I understand from the previous discussion, it is feasible that we set the VGSC as Retained and then delete VGS/VGSC without affecting the VS/VSC. Is there any problem for this way?

Yes deletionPolicy does not affect the VGS to VS relationship. If you delete VGS then the realted VS also get deleted. The VS instances that get created via VGS have owner refs on them, so deleting VGS, deletes the VS objects too irrespective of the VGSC deletionPolicy. The deletionPolicy only affects the VGSC to VSC relationship afaict.

Do you see any problems with the proposed changes in the latest design update ?

- Create VS and follow the existing legacy workflow.

#### Modify the CleanupVolumeSnapshot function to skip deletion of VGS related VS
- We use a common cleanupVolumeSnapshot function to cleanup VS objects in both datamover and non-datamover cases
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VS/VSC will be deleted during the data mover execution.
For non data mover, we will have two places to cleanup the VS, one in VGS BIA caused by deletion of VGS, the other is the existing code in VS BIA.

We should resolve these conflicts.

@Lyndon-Li
Copy link
Contributor

The VS instances that get created via VGS have owner refs on them

Then I think we should remove the ownerRef from the VS.

As we discussed previously, the best way is to let the VS/VSC dependent as soon as possible, so that all the followed workflows could be kept as is.
Otherwise, we will need to take care of VGS/VGSC in the whole workflow and solve conflicts as one of the examples in #8778 (comment). This will make the workflow much more complex.

@shubham-pampattiwar
Copy link
Collaborator Author

The VS instances that get created via VGS have owner refs on them

Then I think we should remove the ownerRef from the VS.

As we discussed previously, the best way is to let the VS/VSC dependent as soon as possible, so that all the followed workflows could be kept as is. Otherwise, we will need to take care of VGS/VGSC in the whole workflow and solve conflicts as one of the examples in #8778 (comment). This will make the workflow much more complex.

@Lyndon-Li ok lets say we remove ownerRefs from VS objects and delete VGS object, VS remains in this case, but the individual VS cleanup will get blocked/impacted because of the presence of VGS finalizers on the VS objects.

@sseago
Copy link
Collaborator

sseago commented Mar 28, 2025

@Lyndon-Li @blackpiglet @sseago I updated the PR with following changes:

  • Add VGS object to itemToUpdate instead of additionalItems, so that VGS BIA executes in backup finalize phase
  • Add VGS BIA which takes care of all the VGS related objects cleanup
  • Add VGA RIA to skip VGS object restore
  • Modify the CleanupVolumeSnapshot func to skip VS deletion that are cerated via VGS object as the cleanup is delegated to VGS BIA for such objects.

@shubham-pampattiwar I am somehow lost with these new changes. I don't understand why we delegate the cleanup in the finalizing phase, instead of doing it in place in PVC BIA. I think this will make the entire workflow much more complex, so please clarify what you've found since the last agreement.

@Lyndon-Li This is because we need VGS to stick around for all PVC BIA executions for the PVCs in the group. If you have 3 PVCs in the VG -- pvc1, pvc2, pvc3:

  • pvc1 BIA runs; sees group label, queries for VGS, it doesn't exist, so VGS is created.
  • pvc2 BIA runs; sees group label, queries for VGS -- it's found, so no need to create VGS
  • pvc3 BIA runs; sees group label, queries for VGS -- it's found, so no need to create VGS

If we delete VGS during pvc1 BIA, then when we get to pvc2 and pvc3 we'll end up creating a new VGS.

Trying to add logic to the PVC BIA to figure out which one is the last to be processed is going to be a lot more complex than simply deferring this until finalize.

@Lyndon-Li
Copy link
Contributor

@sseago

If we delete VGS during pvc1 BIA, then when we get to pvc2 and pvc3 we'll end up creating a new VGS.

Trying to add logic to the PVC BIA to figure out which one is the last to be processed is going to be a lot more complex than simply deferring this until finalize.

We must wait for the "provision" of all the VS/VSC in the specific VGS, before deleting the VGS/VGSC. See this comment #8778 (comment).
That is, if BIA of PVC1 creates the VGS, it needs to wait all the VS/VSC to be provisioned in place before entering the async operation; we don't need to have every BIA of PVC doing this complexity.
It is not symmetrical when creating the VGS among PVC BIAs, it doesn't have to be symmetrical when deleting the VGS/VGSC.

I have discussed with @blackpiglet, having a finalizing BIA and deleting the VGS/VGSC there will break many existing logics for both data mover and non data mover even in non group snapshot case. We should avoid this as possible as we can.

@sseago
Copy link
Collaborator

sseago commented Mar 31, 2025

@Lyndon-Li But if VGS and all VS have already been deleted before pbv2 BIA runs, it will find no VGS present and assume that one has not yet been created, so it will create one. How would we prevent this if we delete VGS and associated VS during the first PVC action?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants