Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue #6807: Retry failed create when using generateName #6830

Merged
merged 1 commit into from
Oct 11, 2023

Conversation

sseago
Copy link
Collaborator

@sseago sseago commented Sep 15, 2023

When creating resources with generateName, apimachinery does not guarantee uniqueness when it appends the random suffix to the generateName stub, so if it fails with already exists error, we need to retry.

Thank you for contributing to Velero!

Please add a summary of your change

Does your change fix a particular issue?

Fixes #6807

Please indicate you've done the following:

  • [x ] Accepted the DCO. Commits without the DCO will delay acceptance.
  • [x ] Created a changelog file or added /kind changelog-not-required as a comment on this pull request.
  • Updated the corresponding documentation in site/content/docs/main.

@codecov
Copy link

codecov bot commented Sep 15, 2023

Codecov Report

Merging #6830 (09be1f7) into main (541425b) will decrease coverage by 0.06%.
The diff coverage is 39.28%.

@@            Coverage Diff             @@
##             main    #6830      +/-   ##
==========================================
- Coverage   60.67%   60.61%   -0.06%     
==========================================
  Files         249      250       +1     
  Lines       26476    26498      +22     
==========================================
- Hits        16064    16063       -1     
- Misses       9268     9290      +22     
- Partials     1144     1145       +1     
Files Coverage Δ
pkg/podvolume/backupper.go 83.26% <100.00%> (+0.25%) ⬆️
pkg/podvolume/restorer.go 90.15% <100.00%> (+0.15%) ⬆️
pkg/repository/ensurer.go 100.00% <100.00%> (ø)
pkg/restore/dataupload_retrieve_action.go 75.75% <100.00%> (ø)
pkg/cmd/cli/backup/delete.go 70.51% <0.00%> (ø)
pkg/controller/gc_controller.go 56.07% <0.00%> (ø)
pkg/client/retry.go 0.00% <0.00%> (ø)

... and 1 file with indirect coverage changes

@@ -40,7 +42,9 @@ type DefaultServerStatusGetter struct {
func (g *DefaultServerStatusGetter) GetServerStatus(kbClient kbclient.Client) (*velerov1api.ServerStatusRequest, error) {
created := builder.ForServerStatusRequest(g.Namespace, "", "0").ObjectMeta(builder.WithGenerateName("velero-cli-")).Result()

if err := kbClient.Create(context.Background(), created, &kbclient.CreateOptions{}); err != nil {
if err := retry.OnError(retry.DefaultRetry, apierrors.IsAlreadyExists, func() error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make sense to check for generateName metadata before attempting to retry.

If object does not use generateName, retrying will just be redundant.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm only adding this code in places where generateName is always set. If this were in a reusable func, then it would make sense to check that, but since the reusable func would end up just being a thin wrapper around retry.OnError, I'm not sure it's warranted here. Also, most of these calls involve interface objects so we'd have to attempt to cast to ObjectMeta, then look for GenerateName. I'm leaning towards thinking we don't need this since every time we call OnError it's right after explicitly setting generateName.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon further reflection, I think this is a good idea, actually -- I'll move the retry logic into a util func which either does retries or a simple one-time call, depending on GenerateName/Name values.

@kaovilai
Copy link
Member

I think it would be reasonable for velero to by default cleanup completed status CRs since we generate a lot of them.

@kaovilai
Copy link
Member

like here #5683

@sseago
Copy link
Collaborator Author

sseago commented Sep 15, 2023

I think status cleanup would be a separate thing. I'd like to limit this PR to fixing the GenerateName bug.

@@ -22,9 +22,11 @@ import (

"github.com/pkg/errors"
"github.com/spf13/cobra"
apierrors "k8s.io/apimachinery/pkg/api/errors"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use camelCase: apierrors -> apiErrors

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mateusoliveira43 For package names (and package name aliases) the general convention is all lowercase

@sseago sseago force-pushed the retry-generateName branch 2 times, most recently from d3b90e9 to 3297d13 Compare September 19, 2023 13:51
@kaovilai
Copy link
Member

Related upstream k8s issue: kubernetes/kubernetes#115489

@sseago sseago marked this pull request as ready for review September 19, 2023 23:41
@@ -0,0 +1 @@
Retry failed create when using generateName
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Retry failed create when using generateName
Retry create calls during restore when using generateName to resolve `the server was not able to generate a unique name for the object` errors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaovilai This doesn't relate to restore. This is when Velero creates CRs with GenerateName. Anything we're restoring already has Name set, so GenerateName isn't involved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this file for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaovilai auto-created backup file that accidentally got added to the commit. Will remove.


func CreateRetryGenerateName(client kbclient.Client, ctx context.Context, obj kbclient.Object) error {
return CreateRetryGenerateNameWithFunc(obj, func() error {
return client.Create(ctx, obj, &kbclient.CreateOptions{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should CreateOptions be parameterized?

FieldManager could become handy in the future. Also, this maybe what we should be using to restore managedFields during create call.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaovilai There weren't any cases in the current code that used it, so it seemed simpler to avoid it. If we need to add CreateOptions in the future, we can add it. I can add it now (and just pass in the empty CreateOptions in all cases), but my thinking was that this was easier to read/maintain without it until we needed it. I don't have a strong opinion there, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine as is, just read on the options and saw ability to add managedfields which you might be interested in

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaovilai Yes, at some point we should probably open a separate issue to investigate better ways to restore managed fields. If we can avoid the extra Patch, that would be great. In any case, none of this code is used for resource restore, since we'll never restore anything that doesn't already have a name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool.

When creating resources with generateName, apimachinery
does not guarantee uniqueness when it appends the random
suffix to the generateName stub, so if it fails with
already exists error, we need to retry.

Signed-off-by: Scott Seago <[email protected]>
@shubham-pampattiwar shubham-pampattiwar merged commit 74ed994 into vmware-tanzu:main Oct 11, 2023
22 of 24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants