Patch: deberta success rate throttle #231
Conversation
@@ -651,6 +652,9 @@ async def tlm_prompt(
        await handle_tlm_client_error_from_resp(res, batch_index)
        await handle_tlm_api_error_from_resp(res, batch_index)

+       if not res_json.get("nli_deberta_success", True):
+           raise TlmPartialSuccess("Partial failure on deberta call -- slowdown request rate.")
Wouldn't this raise result in the return not actually returning the results to the user, or trigger retries? Not entirely certain of the logic behind how the errors are handled here.
No, not currently -- the context manager in the rate handler will swallow this type of exception, so execution continues and the result is still returned to users.
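For readers unfamiliar with this pattern, the behavior described above can be sketched with Python's context-manager protocol: returning `True` from `__exit__` suppresses the exception, so the caller proceeds as if nothing was raised. The `RateHandler` class and `slowdown_requested` flag below are hypothetical names for illustration, not the repo's actual implementation; only `TlmPartialSuccess` appears in the diff.

```python
# Hypothetical sketch (not the repo's actual code): a rate-handler context
# manager that swallows TlmPartialSuccess so execution continues and the
# partial result is still returned, while recording a slowdown signal.

class TlmPartialSuccess(Exception):
    """Raised when a request succeeded overall but a sub-call (e.g. deberta) failed."""

class RateHandler:
    def __init__(self) -> None:
        self.slowdown_requested = False

    def __enter__(self) -> "RateHandler":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Returning True from __exit__ suppresses the in-flight exception,
        # so code after the `with` block keeps running.
        if exc_type is not None and issubclass(exc_type, TlmPartialSuccess):
            self.slowdown_requested = True
            return True
        return False

handler = RateHandler()
with handler:
    raise TlmPartialSuccess("Partial failure on deberta call -- slowdown request rate.")
# Execution reaches this point; the exception was swallowed.
print(handler.slowdown_requested)  # True
```

This is why the `raise` in the diff does not abort the request: the exception only reaches the rate handler's `__exit__`, which treats it as a throttling hint rather than a hard failure.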
lgtm!
See thread in backend PR