Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore VolumeSnapshot, VolumeSnapshotcontent and dataupload associated with a PVC during a CSI Restore even if it is excluded from the restore due to label(s) selector #8796

Closed

Conversation

amastbau
Copy link

@amastbau amastbau commented Mar 16, 2025

Thank you for contributing to Velero!

Please add a summary of your change

Issue

During a CSI restore, if the VolumeSnapshot is missing for any reason, the PVC remains in a pending state with the error:

error getting snapshot your-snapshot-name from api server: volumesnapshots.snapshot.storage.k8s.io "your-snapshot-name" not found.

In Addition,
The CSI PVC Restore Action fails because restore size is required from the VolumeSnapshot resource to patch the PVC's request size if too small.

To handle cases , in which the volumesnapshot was initially excluded due to label(s) selector,
PVC Restore Action will now return the required VolumeSnapshot in additional items if not found. We annotate the PVC (if not already annotated) and return both the VolumeSnapshot and the PVC in additional items. This allows Velero to restore the VolumeSnapshot and retry the PVC action. If the VolumeSnapshot is missing and the PVC already has the annotation, the restore action will fail.

Note: this only handles cases where the VS is excluded sue to label(s) selector, other cases (like excluding by resource type) are not handled.

Does your change fix a particular issue?

Fixes #(issue)

Please indicate you've done the following:

… associated VolumeSnapshot resource is missing (because it did not match the label selector), PVC Action fails because restore size is required from the VolumeSnapshot resources.

To fix we now return the required VolumeSnapshot in additional items (only if not found). We annotate the PVC (if not already annotated) and return both the VolumeSnapshot and the PVC in additional items. This allows Velero to restore the VolumeSnapshot and retry the PVC action. If the VolumeSnapshot is missing and the PVC already has the annotation, the restore action will fail.
@amastbau amastbau changed the title When restoring a PVC with CSI snapshots using label selectors, if the… When restoring a PVC with CSI snapshots using label selectors Mar 16, 2025
@amastbau amastbau changed the title When restoring a PVC with CSI snapshots using label selectors Restore VolumeSnapshot associated with a PVC even if it was previously excluded due to label(s) selector Mar 16, 2025
@amastbau amastbau changed the title Restore VolumeSnapshot associated with a PVC even if it was previously excluded due to label(s) selector Restore VolumeSnapshot associated with a PVC during a CSI Restore even if it was previously excluded due to label(s) selector Mar 16, 2025
@reasonerjt
Copy link
Contributor

@amastbau Thanks for the contribution.

However, I'm not very convinced why this should be part of the upstream.

, in which the volumesnapshot was initially excluded due to label(s) selector,

Is this a case that always happens? Could you elaborate why?

@sseago
Copy link
Collaborator

sseago commented Mar 18, 2025

@reasonerjt This is to resolve an inconsistency in backup vs. restore behavior when using label selectors. On backup, when using a label selector to back up only certain workloads in a namespace, the created VolumeSnapshot will always be included in the backup because it's returned by the PVC BIA. On restore, however, if using a label selector to only restore certain workloads in a namespace, the VolumeSnapshot won't match the label, so it's excluded. Ordinarily, a straightforward additionalItem return would be all that's needed to pull the VS into the restore -- the problem is that it needs to have been restored before the PVC, so simply adding it to the PVC RIA doesn't help. This PR gets around that by adding the VS and the current PVC again.

That being said, I'm not 100% clear why this is needed for the CSI only case but we don't have a similar problem with DataUpload inclusion in the datamover case. @amastbau Do datamover restores with label selector work fine for you without a similar change?

@sseago
Copy link
Collaborator

sseago commented Mar 18, 2025

@reasonerjt @amastbau We discussed this during the community meeting. The current approach of returning the current PVC as its own additional item (after the needed VolumeSnapshot) doesn't really work. What we actually need is to ensure that the VolumeSnapshot is restored before this pvc_action is run. The answer there is another RIA (maybe pvc_pre_action.go) that is registered to run before this one. That RIA will only return the VS as an additional item (for CSI only case), resulting in the restore workflow restoring the VS (will be a no-op if it's already restored, but is necessary if restore uses a label selector). Then when the next matching RIA is run (the existing csi pvc_action), the VS will already exist, so this RIA will run as-written.

We need the same thing in this new to include the DataUpload for the PVC if we're using datamover for the volume.

@sseago sseago marked this pull request as draft March 18, 2025 16:20
@sseago
Copy link
Collaborator

sseago commented Mar 18, 2025

Converting PR to draft until the new fix is submitted.

@amastbau amastbau changed the title Restore VolumeSnapshot associated with a PVC during a CSI Restore even if it was previously excluded due to label(s) selector Restore VolumeSnapshot associated with a PVC during a CSI Restore even if it is excluded due to label(s) selector Mar 19, 2025
@amastbau amastbau changed the title Restore VolumeSnapshot associated with a PVC during a CSI Restore even if it is excluded due to label(s) selector Restore VolumeSnapshot, VolumeSnapshotcontent and dataupload associated with a PVC during a CSI Restore even if it is excluded from the restore due to label(s) selector Mar 24, 2025
@amastbau
Copy link
Author

amastbau commented Mar 24, 2025

so,
implemented as decided,
we also need to pull volumesnapshotcontent
the vs and vsc name are changed to to guid before restore.
will address those next
@sseago @shubham-pampattiwar

RegisterRestoreItemAction(
constant.PluginCSIPVCVSDURestoreRIA,
newPvcVSDVRestoreItemAction(f),
).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this a few lines down to just before the call to newPvcRestoreItemAction(f) at line 161 since they're related plugin actions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I wrongly assumed that registering that plugin using the Fluent(?) Interface would be enough to set the order.

@@ -21,5 +21,6 @@ const (
ControllerRestoreFinalizer = "restore-finalizer"

PluginCSIPVCRestoreRIA = "velero.io/csi-pvc-restorer"
PluginCSIPVCVSDURestoreRIA = "velero.io/csi-pvc-vs-dv-restorer"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alignment is off on the = -- also, we need this new plugin to be run before "velero.io/csi-pvc-restorer" so when the csi-pvc-restorer is run the VS, DU, etc. are already restored. Maybe change PluginCSIPVCRestoreRIA to "velero.io/csi-pvc-restorer2" and then set PluginCSIPVCVSDURestoreRIA to "velero.io/csi-pvc-restorer1"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git it ^ :) alphabetic order

// NewPvcVSDVRestoreItemAction returns a new instance of PVCVSDVRestoreItemAction.
func NewPvcVSDVRestoreItemAction(f client.Factory) plugincommon.HandlerInitializer {
return func(logger logrus.FieldLogger) (any, error) {
crClient, err := f.KubebuilderClient()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the plugin doesn't actually need crClient, so this can probably be avoided.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is from when was trying the satisfy V2 interface with do nothing functions at 1st, i thought this was the requirement even if not needed.its not a requirement if we are are just pushing backup the additional Items, correct?

AdditionalItems: []velero.ResourceIdentifier{},
}, nil
// Convert the updated VolumeSnapshot to an unstructured object.
vsMap, err := runtime.DefaultUnstructuredConverter.ToUnstructured(&vs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this new code is repeating the code from line 103-107. We already have vsMap. Is this a mere/rebase error?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naa. was copying and pasting coede chunks.thakns

@amastbau
Copy link
Author

@sseago @shubham-pampattiwar
i am going to add vsc to return items and test
will also apply all comments.thanks!

@Lyndon-Li
Copy link
Contributor

Lyndon-Li commented Mar 25, 2025

@sseago
DataUpload to PVC for data mover doesn't have this problem, because DataUpload is an internal object and is always restored (ignoring filters).

}

// Check for DataVolume annotation.
if dvName, ok := pvc.Annotations[velerov1api.DataUploadNameAnnotation]; ok && dvName != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to do this for data mover case, as per #8796 (comment).

@Lyndon-Li
Copy link
Contributor

@sseago @amastbau

Sorry for jumping in late. Share my thoughts of this PR and the discussion here:

  1. For BIA/RIA, supporting pre/post actions is a good idea, I remember we have touched it for multiple scenarios (or multiple scenarios have shown requirements on it). Therefore, I think we need to treat it systematically (we have multiple points to cover, e.g., keeping the order by the alphabetic order is not a good idea), instead of adding a limited implementation for a particular case. And so we need to discuss it in a design and well consider it. Simply speaking, to me, it is a good but a big deal.
  2. For the current issue --- the mismatch behavior of partially backing up and restoring resources involving additionalItems returned by BIA/RIA, first of all, I think it is not a very critical problem, and there are workarounds; secondly, there are multiple ways to solve it. Therefore, I don't think a quick implementation of solution 1 is the best choice.
  3. For the current issue, I also think we need to consider it generally --- is it correct that the additionalItem returned by RIA of all resources must be restored? The answer is YES, then we need to seek a universal solution, instead of doing it just for PVC.
  4. For a universal solution of the current problem, even if we have the pre-RIA, I don't think it is the correct way to write a pre-RIA for all resources. Instead, we need to let the restore process include the additionalItems automatically even if they are excluded manually or not included.
  5. For a universal solution, I don't have a concrete solution for now. As a rough idea, we can mark the additionalItems returned by BIA during backup, then for all these marked items, they CANNOT be skipped during restore. Take the VS-PVC case for an example, VS is not the restore additionalItem of PVC, instead, PVC is the restore additionalItem of VS. So everything existing is working correctly, we only lack a way to tell the restore that VS must be included.

cc @reasonerjt

@sseago
Copy link
Collaborator

sseago commented Mar 25, 2025

I don't think the pre/post distinction in 1) is really relevant to the problem being solved here. All RIAs are pre -- they modify the resource before restore. If multiple RIAs match, then velero explicitly runs them in sorted order by the registered name. The issue here isn't pre/post -- but we have one action that must run before another. On the OpenShift side, we also have scenarios where we have two actions that must be run in a certain order, since the results of action1 are used in action2.

That's exactly what we need here. So the change that does not require any infra-level changes is in this PR (since Velero sorts the actions by registered name, this is guaranteed to run in the desired order). It might be better long-term to avoid the sort-order dependence and provide an optional priority field to the RegisterBackupItemAction/RestoreItemAction funcs -- and then we can sort actions first on priority, secondarily on registered name. That would allow the new action in this PR that adds additional items to be registered with priority 1, and the action that depends on these additional items being already restored to be registered with priority 2.

But again, to be clear, we don't have explicit pre/post actions. We have two different RestoreItemActions that need to run in a specific order since the actions in RIA2 depend on RIA1 already having been run. Current velero code guarantees that actions are sorted by registered name -- but if we want, we can add a way to register a plugin with priority to do this in a slightly clearer way.

@Lyndon-Li
Copy link
Contributor

Here are some more points, let's discuss in the community meeting:

  1. The problem we are trying to solve in this PR is a common problem, any resource may have this problem, not only for PVC. So there should be a universal solution, we don't expect to write the second RIA for each resource with this problem.
  2. By adding another RIA, we are actually trying to express a repeated info that is already in the backup data --- the order. Actually, the order to restore resources is the reverse order that resources are backed up. After backup completes, the backup order is naturally embedded.
  3. Instead, what we really need to express is that one or more prerequisite resources of an included resource should not be excluded in anyway. This should be the direction for the solution.
  4. Ordered BIA/RIA is still an infrastructure level requirement, relying on the alphabet order of the actions is not good enough, unless we are fixing a very critical problem temporarily.

@sseago
Copy link
Collaborator

sseago commented Mar 25, 2025

@Lyndon-Li
"For a universal solution, I don't have a concrete solution for now. As a rough idea, we can mark the additionalItems returned by BIA during backup, then for all these marked items, they CANNOT be skipped during restore. Take the VS-PVC case for an example, VS is not the restore additionalItem of PVC, instead, PVC is the restore additionalItem of VS. So everything existing is working correctly, we only lack a way to tell the restore that VS must be included."

Yes, I would agree that this would be desirable. One possibility would be to label these items in the backup with something like "velero.io/must-restore=true" and then add (internally) add that to the "or" label selector when filtering by label on restore.

@sseago
Copy link
Collaborator

sseago commented Mar 25, 2025

@Lyndon-Li "Ordered BIA/RIA is still an infrastructure level requirement, relying on the alphabet order of the actions is not good enough, unless we are fixing a very critical problem temporarily."
Thinking about this (and again, maybe this is solved separately from the additional items requirement), right now we have RegisterBackupItemActionV2(name string, initializer common.HandlerInitializer) -- maybe the answer is to add a secondary registration method: RegisterPriorityBackupItemActionV2(name string, priority int, initializer common.HandlerInitializer) and then when sorting the item actions, sort first by priority, then by name within a priority. We'll need to decide whether the default using the regular register method results in running last or (maybe better) some middle priority so that custom actions can be registered to run first or last. For example, default priority is 100 -- so for a plugin that must run first, it could be registered with priority 1, if it needs to run last, then it could be registered with a high number (200, 1000, etc.)

@reasonerjt
Copy link
Contributor

is it correct that the additionalItem returned by RIA of all resources must be restored? The answer is YES, then we need to seek a universal solution, instead of doing it just for PVC.

We used to introduce an annotation to bypass the filtering in backup process:

mustInclude := u.GetAnnotations()[velerov1api.MustIncludeAdditionalItemAnnotation] == "true" || finalize

Maybe it's a low-hanging fruit for us to provide a "must-have" semantic to restore if we keep this annotation?

@sseago
Copy link
Collaborator

sseago commented Mar 25, 2025

So to summarize, I think there are two different general problems at work here, and one specific problem that is exposing this:

  1. On backup, additional items returned from BIAs are exempt from label filtering. On restore, we need the same label filtering exception (i.e. if I backup with label=foo, then restore with the same label=foo selector, everything from that backup should get restored). We either need RIAs to return these same additional items, or we need to mark additional items in the backup in a way to prevent them from being filtered on restore.
  2. When there are multiple BIAs/RIAs that act on a given resource, we want a clear way of indicating the order that they run in. Currently, the order is specified by sorting the name of the registered action. We'd rather have an explicit priority field -- optional, so actions that don't depend on order can be registered just like they are now, resulting in them being assigned some default priority.

And specifically: When a restore is done with a label selector, VolumeSnapshots created by Velero will not have the label, so they will be omitted from the restore. Solution 1) above would solve this, or solution 2) above would enable this by allowing us to register a high priority RIA which returns the VS as an additional item.

The challenge in 1) is knowing which additional items to return. For example, say I back up an entire namespace which includes Pod1/PVC1/VS1 where the pod is labeled with Label1, and Pod2/PVC2/VS2 where the pod is labeled with Label2. If I back up without label selectors, then restore with a selector that matches only Label1, currently I don't get VS1 -- and depending on what other RIAs run, I may not get PVC1 either. However, if we simply restore all additional item returns from backup, that will give us both PVC1/VS1 and PVC/VS2. How do we ensure that we only get PVC1/VS1 when the backup didn't specify label1? There may not be an easy way to do this generically, which may require us to add these explicitly as RIA additional items like we do for BIA. For Pod this is easy enough and won't require a new action, just add the PVCs to the existing pod action. For CSI PVC action, though, the VS must exist when the current CSI PVC action runs, so we'd need to use option 2) and register a plugin with high priority that returns the VS as an additional item, guaranteeing that this runs before the regular plugin.

@reasonerjt
Copy link
Contributor

As for ordered BIA/RIA, I wish to suggest we clarify if there's really a concrete scenario -- even there is, is it possible to refactor the code to merge some plugins so that the code is ordered in one Execute func?

We allow different plugins to handle the same resource and I don't think we can completely avoid conflicts across plugins

@sseago
Copy link
Collaborator

sseago commented Mar 25, 2025

@reasonerjt

is it correct that the additionalItem returned by RIA of all resources must be restored? The answer is YES, then we need to seek a universal solution, instead of doing it just for PVC.

We used to introduce an annotation to bypass the filtering in backup process:

mustInclude := u.GetAnnotations()[velerov1api.MustIncludeAdditionalItemAnnotation] == "true" || finalize

Maybe it's a low-hanging fruit for us to provide a "must-have" semantic to restore if we keep this annotation?

The problem is that it's not necessarily must-have if the restore is filtering on labels. For example, say I back up an entire namespace which includes Pod1/PVC1/VS1 where the pod is labeled with Label1, and Pod2/PVC2/VS2 where the pod is labeled with Label2. If I back up without label selectors, then restore with a selector that matches only Label1, currently I don't get VS1 -- and depending on what other RIAs run, I may not get PVC1 either. However, if all of these additional item returns added the must-restore annotation, that will give us both PVC1/VS1 and PVC/VS2. But we only want PVC1/VS1 -- since we're not restoring Pod2, we don't want that pod's PVC or VS.

We may have to actually add these additional item returns explicitly to RIAs.

@shubham-pampattiwar
Copy link
Collaborator

@sseago Would it solve the case if the VS object was to inherit the labels from the PVC object ? We control the VS creation, so that seems doable.

@sseago
Copy link
Collaborator

sseago commented Mar 25, 2025

As for ordered BIA/RIA, I wish to suggest we clarify if there's really a concrete scenario -- even there is, is it possible to refactor the code to merge some plugins so that the code is ordered in one Execute func?

We allow different plugins to handle the same resource and I don't think we can completely avoid conflicts across plugins

The problem is we can't always do it in one execute. The concrete example is the one we're trying to solve here. We need to return the VolumeSnapshot as an addiitonal item to restore, but the current CSI PVC action requires the VS to already exist. If we registered ria1 with an explicit priority (see suggestion above) to always run before ria2, then ria1 could return the VS to restore as an additional item, and when ria2 is called later, the VS will have been restored, so ria2's Execute func can access it. There is no conflict as long as we can predict the order that velero runs the actions in. Currently the code guarantees order via string sort (which as pointed out above, isn't really a great API for it), but if we added an explicit priority field to the Register funcs, then we would have a better API for ensuring order, allowing any scenario where we need multiple BIAs or RIAs to run in a particular order to work fine with no conflicts. Since the modified item metadata from the first action is passed on to the second action, we don't really need to worry about conflicts as long as plugin authors set priority properly in cases where the order matters -- in most cases it won't matter, such as the case today with the CSI PVC action and the non-CSI PVC action. They do different things, they don't conflict regardless of the order they run in.

@sseago
Copy link
Collaborator

sseago commented Mar 25, 2025

@sseago Would it solve the case if the VS object was to inherit the labels from the PVC object ? We control the VS creation, so that seems doable.

@shubham-pampattiwar
The problem is that the PVC won't be labeled in the general case. As a user, I label my Deployment (and pod template spec), PVC gets pulled in via additional items, which then pulls in PV and VS, etc. Since we're not creating the PVC, though, the Pod BIA can't add labels to the PVC.

@Lyndon-Li
Copy link
Contributor

@sseago
I will wait for you to open the issue. Right now, let me add the below proposal for the problem here and if the issue is created, I can paste it there.

Proposal
I think the contradiction is that we need to make sure that some resources (e.g., vs, dataUpload) bypass all filters applied on themselves, but refer to the the filters on another "pillar" resource (e.g., pvc). But the "pillar" resource is behind them in the restore order, as well as the restore collection order.

As a result, we have to skip the resources at the first place and later we want to pull them back.
So it is not really about the misbehavior of resource restore order, resource restore order can always be guaranteed.
Instead, the problem is on the filter mechanisms to work with the restore order or restore collection order.

And if visit back is inevitable, we would rather do it during restore collection, because it is much easier.
So the solution is to add a kind of Delay-Decision mechanism during the restore collection, so that the restore workflow could keep as is:

  1. Modify the restoreableItem struct and add a new boolean field imaginary
  2. Modify the getSelectedRestoreableItems function
    • add a new map called delayDecision with a slice value containing the pointers to restoreableItem and key as the "pillar" object name
    • for each resource in resourceMustHave:
      • keep the existing code to bypass the filters for it, construct a restoreableItem for it and set imaginary as true
      • add the restoreableItem into delayDecision indexed by its "pillar" object name
      • the restoreableItem will also be added to restorable.selectedItemsByNamespace by the end
    • when the "pillar" is visited and decided to be included later
      • search delayDecision
      • if found, set all restoreableItem.imaginary as false
  3. Make a small change to processSelectedResource function
    • for each selectedItem (type of restoreableItem), check its imaginary field
    • if it is true, skip processing it

The only trade-off for this solution is that we don't have a generic way to make an abstract of telling the "pillar" object from any given object, so right now, we assume that there are two concrete ones vs and dataUpload.
There is no limitation from users perspective.

cc @reasonerjt @blackpiglet

@sseago
Copy link
Collaborator

sseago commented Mar 26, 2025

@Lyndon-Li I think that's mostly correct -- note that PVC itself is another resource with this issue. i.e. consider the use case where the pod has the label. so when processing VS, we need to go from VS to PVC to Pod to find the label, then when processing PVC, we need to go from PVC to Pod to find the label.

@sseago
Copy link
Collaborator

sseago commented Mar 26, 2025

@Lyndon-Li #8816

@sseago
Copy link
Collaborator

sseago commented Mar 26, 2025

@Lyndon-Li I'm closing this PR now, since we'll handle the short-term fix downstream, and the long-term fix is tracked in the new issue created today.

@sseago sseago closed this Mar 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants