RHELAI-2756 Adding Adaptive Throttling #476
Conversation
I pulled this locally and attempted to verify this fix. What I did was run a Mock OpenAI server locally (using https://github.com/polly3d/mockai with a response delay set to 30 seconds) and then edited my generate_data.py to set Running
So, I'm hitting the new logging warning about retryable errors, and that should be adjusting the throttler down. But it doesn't appear to actually be retrying that failed request that timed out?
Hmm. Good question @bbrowning. From my understanding, the logs that you got indicate that this was activated:
From our collective understanding, OpenAIError is only thrown back to us after the client's retries have already been attempted (correct me if I am wrong). In that case, I think the logging message could be clearer, but the behavior is as expected(?): the request sent through the OpenAI client was retried a couple of times, with backoff, still failed, and gracefully died without succeeding, raising its OpenAIError (I am assuming), which we caught and reported as such. My understanding of this behavior came from the Perf. team, who said:
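To make the retry behavior being described concrete, here is a minimal sketch assuming the openai-python v1 client, which retries retryable failures internally with backoff before raising; the base_url, model name, and retry/timeout values below are placeholders, not what generate_data.py actually uses.

```python
from openai import OpenAI, OpenAIError

# Placeholder endpoint and model; max_retries and timeout are standard
# openai-python v1 client options.
client = OpenAI(
    base_url="http://localhost:8080/v1",  # hypothetical local server
    api_key="EMPTY",
    max_retries=2,   # client retries retryable failures internally, with backoff
    timeout=30.0,    # seconds allowed per attempt
)

try:
    resp = client.chat.completions.create(
        model="placeholder-model",
        messages=[{"role": "user", "content": "hello"}],
    )
except OpenAIError as err:
    # Reaching this handler means the client has already exhausted its
    # internal retries for this request; catching it here does not retry again.
    print(f"request failed after client-side retries: {err}")
```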
Did your generation 'fail' entirely at the end, or were there these warnings but the generation completed successfully at the end?
My generation did fail entirely at the end, because it had zero generated output: the requests that timed out were ultimately "lost", in that after we received a timeout those requests were dropped and we attempted no further data generation from that input sample.
Do we want to leave this PR open for now? Or do we want to close it with the expectation that #484 fixes the underlying issue we observed?
Why Adaptive Throttling?
Fixes #424
The Problem:
If too many requests are sent concurrently to the server, it might time out, return retryable errors, or reject requests outright, and the affected samples end up with no generated output.
The Solution:
Adaptively throttle the number of concurrent requests sent to the server: scale concurrency down when retryable errors (such as timeouts) are observed, and scale it back up as requests start succeeding again.
Key Components in the Code
The AdaptiveThrottler Class
This class encapsulates the logic for adjusting concurrency:
Parameters:
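The exact parameter names and defaults live in the PR itself and are not reproduced here. As a rough illustration only, an adaptive throttler along these lines might look like the following sketch; all names, defaults, and the AIMD-style policy are illustrative, not the PR's actual implementation.

```python
import threading


class AdaptiveThrottler:
    """Illustrative adaptive concurrency limiter (AIMD-style), not the PR's exact code."""

    def __init__(self, max_concurrency=10, min_concurrency=1,
                 backoff_factor=0.5, recovery_step=1, successes_per_step=20):
        self._limit = max_concurrency          # current allowed concurrency
        self._max = max_concurrency
        self._min = min_concurrency
        self._backoff_factor = backoff_factor  # multiplicative decrease on error
        self._recovery_step = recovery_step    # additive increase on recovery
        self._successes_per_step = successes_per_step
        self._success_streak = 0
        self._lock = threading.Lock()

    @property
    def current_limit(self):
        return self._limit

    def record_success(self):
        # After a run of successful requests, cautiously raise the limit.
        with self._lock:
            self._success_streak += 1
            if self._success_streak >= self._successes_per_step:
                self._success_streak = 0
                self._limit = min(self._max, self._limit + self._recovery_step)

    def record_error(self):
        # On a retryable error (e.g. a timeout), cut concurrency multiplicatively.
        with self._lock:
            self._success_streak = 0
            self._limit = max(self._min, int(self._limit * self._backoff_factor))
```

In this kind of design, a caller would acquire up to current_limit concurrent slots, call record_error() whenever a retryable error is observed, and record_success() on completed requests, so concurrency drops quickly under pressure and recovers gradually.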