Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add retry for oauth metadata endpoint #4259

Closed
wants to merge 11 commits into from

Conversation

squiishyy
Copy link
Contributor

@squiishyy squiishyy commented Oct 18, 2023

Tracking issue

https://unionai.atlassian.net/browse/CLOUD-175

Describe your changes

Added functionality to retry getting oauth metadata information upon 5xx error codes.
Added config values to be able to specify number of retry attempts and retry delay duration.
Added tests as well!

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Screenshots

Note to reviewers

@squiishyy squiishyy changed the title Add retry for oauth metadata Add retry for oauth metadata 2 Oct 18, 2023
@squiishyy squiishyy changed the title Add retry for oauth metadata 2 Add retry for oauth metadata endpoint Oct 18, 2023
@squiishyy squiishyy marked this pull request as ready for review October 18, 2023 18:37
@@ -71,10 +79,15 @@ func (s OAuth2MetadataProvider) GetOAuth2Metadata(ctx context.Context, r *servic
}
httpClient.Transport = transport
}

response, err := httpClient.Get(externalMetadataURL.String())
logger.Printf(ctx, "retryAttempts: %v retryDuration: %v", s.cfg.AppAuth.ExternalAuthServer.RetryAttempts, s.cfg.AppAuth.ExternalAuthServer.RetryDelayMilliseconds)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here for testing

@codecov
Copy link

codecov bot commented Oct 18, 2023

Codecov Report

Attention: 13 lines in your changes are missing coverage. Please review.

Comparison is base (96e3497) 59.05% compared to head (03fc38a) 59.24%.
Report is 2 commits behind head on master.

❗ Current head 03fc38a differs from pull request most recent head 3838ead. Consider uploading reports for the commit 3838ead to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4259      +/-   ##
==========================================
+ Coverage   59.05%   59.24%   +0.18%     
==========================================
  Files         621      621              
  Lines       53106    53147      +41     
==========================================
+ Hits        31360    31485     +125     
+ Misses      19246    19151      -95     
- Partials     2500     2511      +11     
Flag Coverage Δ
unittests 59.24% <70.45%> (+0.18%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
flyteadmin/auth/config/config.go 71.42% <ø> (ø)
flyteadmin/auth/config/config_flags.go 63.15% <100.00%> (+1.33%) ⬆️
flyteadmin/pkg/async/shared.go 87.87% <73.33%> (-12.13%) ⬇️
flyteadmin/auth/authzserver/metadata_provider.go 71.13% <66.66%> (-0.10%) ⬇️

... and 7 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

flyteadmin/auth/authzserver/metadata_provider.go Outdated Show resolved Hide resolved
Comment on lines 85 to 88
if response != nil {
logger.Errorf(ctx, "Failed to get oauth metadata. Error code: %v. Err: %v", response.StatusCode, err)
return nil, flyteErrors.NewFlyteAdminError(codes.Code(response.StatusCode), "Failed to get oauth metadata.")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we expected to see a non nil response on err, I would consider not considering this case, we have special handling around responses in error cases

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not expected! Just wanted to add an edge case here, happy to remove it though! I did not know we had special handling around responses in error cases :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this particular case, you are getting the statusCode from the response so we can keep it.
Also on L90 , 500 is not a valid grpc error code, I would use code.Internal there

@@ -10,6 +11,24 @@ import (
// RetryDelay indicates how long to wait between restarting a subscriber connection in the case of network failures.
var RetryDelay = 30 * time.Second

func RetryOnSpecificErrorCodes(totalAttempts int, delay time.Duration, f func() (*http.Response, error), IsErrorCodeRetryable func(*http.Response) bool) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead reuse this one which comes from K8s utils. We are using this in flyteadmin already https://github.com/flyteorg/flyte/blob/master/flyteadmin/scheduler/executor/executor_impl.go#L100

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely can, is there any specific reason as to why you're wanting to use this one? We use the other retry functions in shared.go elsewhere in FlyteAdmin as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. For consitency across our repos , would be good to use one if they provide the same functionality.
In cloud repo too we use the k8s retry util in grafana dataplane with usage of RetryOnConflict and hence was seeing if we could use the same

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah gotcha! makes sense, happy to migrate over to that :) will work on that now

chaohengstudent and others added 11 commits October 18, 2023 16:51
…ttprequestwithretry, need to work on config mappingn still, tests in

Signed-off-by: squiishyy <[email protected]>
Signed-off-by: squiishyy <[email protected]>
Signed-off-by: squiishyy <[email protected]>
Signed-off-by: squiishyy <[email protected]>
Signed-off-by: squiishyy <[email protected]>
Signed-off-by: squiishyy <[email protected]>
Signed-off-by: squiishyy <[email protected]>
Signed-off-by: squiishyy <[email protected]>
@squiishyy
Copy link
Contributor Author

Closed in favor of: #4262

@squiishyy squiishyy closed this Oct 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants