Add retry for oauth metadata endpoint #4259
Conversation
@@ -71,10 +79,15 @@ func (s OAuth2MetadataProvider) GetOAuth2Metadata(ctx context.Context, r *servic
		}
		httpClient.Transport = transport
	}

	response, err := httpClient.Get(externalMetadataURL.String())
	logger.Printf(ctx, "retryAttempts: %v retryDuration: %v", s.cfg.AppAuth.ExternalAuthServer.RetryAttempts, s.cfg.AppAuth.ExternalAuthServer.RetryDelayMilliseconds)
Here for testing
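As context for the two values being logged here, a minimal self-contained sketch of the config knobs and the millisecond-to-time.Duration conversion. Only the two field names (RetryAttempts, RetryDelayMilliseconds) come from the diff; the struct name and sample values are made up.

```go
package main

import (
	"fmt"
	"time"
)

// externalAuthServerRetryConfig mirrors the two knobs logged in the diff above;
// the real struct lives in flyteadmin's auth config and may be shaped differently.
type externalAuthServerRetryConfig struct {
	RetryAttempts          int
	RetryDelayMilliseconds int
}

func main() {
	cfg := externalAuthServerRetryConfig{RetryAttempts: 5, RetryDelayMilliseconds: 250}
	// The delay is stored as plain milliseconds, so convert it once before
	// handing it to whatever retry helper ends up being used.
	delay := time.Duration(cfg.RetryDelayMilliseconds) * time.Millisecond
	fmt.Printf("retryAttempts: %v retryDuration: %v\n", cfg.RetryAttempts, delay)
}
```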
Codecov Report

Additional details and impacted files:

@@            Coverage Diff             @@
##           master    #4259      +/-   ##
==========================================
+ Coverage   59.05%   59.24%   +0.18%
==========================================
  Files         621      621
  Lines       53106    53147      +41
==========================================
+ Hits        31360    31485     +125
+ Misses      19246    19151      -95
- Partials     2500     2511      +11

Flags with carried forward coverage won't be shown.
☔ View full report in Codecov by Sentry.
	if response != nil {
		logger.Errorf(ctx, "Failed to get oauth metadata. Error code: %v. Err: %v", response.StatusCode, err)
		return nil, flyteErrors.NewFlyteAdminError(codes.Code(response.StatusCode), "Failed to get oauth metadata.")
	}
Are we expected to see a non-nil response on err? I would consider not handling this case; we have special handling around responses in error cases.
Not expected! Just wanted to add an edge case here, happy to remove it though! I did not know we had special handling around responses in error cases :)
In this particular case you are getting the statusCode from the response, so we can keep it.
Also, on L90, 500 is not a valid gRPC error code; I would use codes.Internal there.
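To make the codes.Internal suggestion concrete, here is a small self-contained sketch of the error mapping. It uses google.golang.org/grpc/status as a stand-in for flyteErrors.NewFlyteAdminError, and the helper name is made up; this is an illustration, not the PR's actual code.

```go
package main

import (
	"fmt"
	"net/http"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// oauthMetadataError maps an HTTP-level failure onto a gRPC error. gRPC codes
// only go up to 16 (Unauthenticated), so casting an HTTP status such as 500
// into codes.Code yields an undefined code; a fixed codes.Internal is reported
// instead, with the HTTP status preserved in the message for debugging.
func oauthMetadataError(response *http.Response, err error) error {
	if err != nil {
		return status.Errorf(codes.Internal, "failed to get oauth metadata: %v", err)
	}
	if response.StatusCode != http.StatusOK {
		return status.Errorf(codes.Internal, "failed to get oauth metadata: HTTP %d", response.StatusCode)
	}
	return nil
}

func main() {
	fmt.Println(oauthMetadataError(&http.Response{StatusCode: http.StatusBadGateway}, nil))
}
```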
@@ -10,6 +11,24 @@ import (
// RetryDelay indicates how long to wait between restarting a subscriber connection in the case of network failures.
var RetryDelay = 30 * time.Second

func RetryOnSpecificErrorCodes(totalAttempts int, delay time.Duration, f func() (*http.Response, error), IsErrorCodeRetryable func(*http.Response) bool) error {
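For readers following along, one possible body for this signature is sketched below. It illustrates the intended semantics (retry while the status code is retryable, bail out on transport errors) and is not necessarily what the PR implements; the package name is a guess.

```go
package async // package name is a guess; the diff only shows the file's contents

import (
	"net/http"
	"time"
)

// RetryOnSpecificErrorCodes calls f up to totalAttempts times, stopping early
// on a transport error or on a response whose status IsErrorCodeRetryable
// rejects. The caller is expected to capture the *http.Response via the
// closure it passes as f, which is why only an error is returned.
func RetryOnSpecificErrorCodes(totalAttempts int, delay time.Duration, f func() (*http.Response, error), IsErrorCodeRetryable func(*http.Response) bool) error {
	for attempt := 0; attempt < totalAttempts; attempt++ {
		resp, err := f()
		if err != nil {
			// Transport-level failure: surface it to the caller immediately.
			return err
		}
		if !IsErrorCodeRetryable(resp) {
			// Success or a non-retryable status code: stop retrying.
			return nil
		}
		if attempt < totalAttempts-1 {
			// Discard the retryable response and wait before the next attempt.
			resp.Body.Close()
			time.Sleep(delay)
		}
	}
	// Attempts exhausted: the last (retryable) response is left for the caller
	// to inspect via its closure.
	return nil
}
```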
Can we instead reuse this one, which comes from the K8s utils? We are already using it in flyteadmin: https://github.com/flyteorg/flyte/blob/master/flyteadmin/scheduler/executor/executor_impl.go#L100
Definitely can. Is there any specific reason why you want to use this one? We use the other retry functions in shared.go elsewhere in FlyteAdmin as well.
Oh, for consistency across our repos; it would be good to use one if they provide the same functionality.
In the cloud repo too we use the k8s retry util in the grafana dataplane (with RetryOnConflict), hence I was checking whether we could use the same here.
Ah gotcha! makes sense, happy to migrate over to that :) will work on that now
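For reference, a sketch of what the k8s-utils-based version could look like, using retry.OnError from k8s.io/client-go/util/retry (the same family as the RetryOnConflict usage mentioned above). The function and sentinel-error names are made up, and the backoff is configured as a constant delay to mirror the attempts/delay config in this PR.

```go
package oauthretry // hypothetical package, for illustration only

import (
	"errors"
	"fmt"
	"net/http"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/util/retry"
)

// errRetryable marks failures (transport errors, 5xx responses) that are worth
// another attempt; retry.OnError uses it to decide whether to keep going.
var errRetryable = errors.New("retryable oauth metadata failure")

func fetchMetadata(client *http.Client, url string, attempts int, delay time.Duration) (*http.Response, error) {
	var resp *http.Response
	backoff := wait.Backoff{
		Duration: delay,    // wait between attempts
		Factor:   1.0,      // constant delay, no exponential growth
		Steps:    attempts, // total number of attempts
	}
	err := retry.OnError(backoff,
		func(err error) bool { return errors.Is(err, errRetryable) },
		func() error {
			var getErr error
			resp, getErr = client.Get(url)
			if getErr != nil {
				return fmt.Errorf("%w: %v", errRetryable, getErr)
			}
			if resp.StatusCode >= 500 {
				resp.Body.Close()
				return fmt.Errorf("%w: status %d", errRetryable, resp.StatusCode)
			}
			return nil
		})
	return resp, err
}
```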
Closed in favor of: #4262
Tracking issue
https://unionai.atlassian.net/browse/CLOUD-175
Describe your changes
Added functionality to retry fetching OAuth metadata when the server returns a 5xx error code.
Added config values to specify the number of retry attempts and the retry delay duration.
Added tests as well! (A sketch of what such a test can look like is included below.)
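As an illustration of the kind of test described above, here is a sketch built around net/http/httptest: a server that fails twice with 503 and then succeeds, with the retry helper expected to land on the successful response. It assumes the RetryOnSpecificErrorCodes sketch from earlier in this thread (or an equivalent) is available in the same package; the test itself is not from the PR.

```go
package async

import (
	"net/http"
	"net/http/httptest"
	"testing"
	"time"
)

func TestRetryOnSpecificErrorCodes(t *testing.T) {
	calls := 0
	// Test server: two 503 responses, then a 200.
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		calls++
		if calls <= 2 {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer server.Close()

	var resp *http.Response
	err := RetryOnSpecificErrorCodes(5, 10*time.Millisecond,
		func() (*http.Response, error) {
			var getErr error
			resp, getErr = http.Get(server.URL)
			return resp, getErr
		},
		func(r *http.Response) bool { return r.StatusCode >= 500 }, // retry on 5xx
	)
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if resp == nil || resp.StatusCode != http.StatusOK {
		t.Fatalf("expected a 200 response after retries, got %+v", resp)
	}
	if calls != 3 {
		t.Fatalf("expected 3 calls to the server, got %d", calls)
	}
	resp.Body.Close()
}
```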
Check all the applicable boxes
Screenshots
Note to reviewers