[Feature]: Support retry policies when calling completion() / text_completion() without requiring Router #6623
Comments
Hi @krrishdholakia, can you advise regarding how to support configuring retry policies without requiring Router? cc @okhat
@ishaan-jaff do we still need openai/azure client init on router? iirc you implemented some sort of client caching logic on the .completion call already, right?

i wonder how hard it would be to just move the async_function_with_retries outside the router, and use that inside the wrapper_async / wrapper functions
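A minimal sketch of that idea, assuming a standalone helper outside the Router (names and signature are illustrative, not LiteLLM's actual implementation):

```python
import asyncio

# Sketch: a Router-independent retry helper that the completion() wrapper
# (wrapper_async / wrapper) could call directly.
async def async_function_with_retries(func, *args, num_retries=3, **kwargs):
    for attempt in range(num_retries + 1):
        try:
            return await func(*args, **kwargs)
        except Exception:
            if attempt == num_retries:
                raise
            await asyncio.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
```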
Nope, we don't; it's probably better to have this at the completion level
Hi @krrishdholakia @ishaan-jaff, thank you so much for the ideation here! What's best regarding next steps for getting this implemented?
@krrishdholakia @ishaan-jaff I gave this a try based on your suggestion. Let me know how it looks - #6916. Happy to split this up for reviewability / adjust course as needed.
The Feature
Support retry policies when calling completion() / text_completion() without requiring Router. Example:
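A hypothetical sketch of the proposed interface (the `retry_policy` argument to `completion()` does not exist yet; `RetryPolicy` is the type LiteLLM already uses for Router-level retry policies):

```python
from litellm import completion
from litellm.router import RetryPolicy

# Retry transient failures, fail fast on user errors.
retry_policy = RetryPolicy(
    TimeoutErrorRetries=3,
    RateLimitErrorRetries=3,
    BadRequestErrorRetries=0,      # malformed request: fail fast
    AuthenticationErrorRetries=0,  # bad API key: fail fast
)

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    retry_policy=retry_policy,  # proposed argument
)
```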
Motivation, pitch
The DSPy library (https://github.com/stanfordnlp/dspy) depends on LiteLLM for issuing LLM calls. When these calls fail due to transient network errors or rate limiting, we want to retry with exponential backoff. However, when these calls fail due to user error (e.g. bad API keys, malformed requests), we want to fail fast.
DSPy users configure LLM keys and parameters using constructor arguments to the `dspy.LM` class (and optionally by setting environment variables like `OPENAI_API_KEY`), for example:
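(An illustrative sketch; the exact `dspy.LM` parameters shown here are assumptions.)

```python
import dspy

lm = dspy.LM(
    "openai/gpt-4o-mini",  # model identifier forwarded to LiteLLM
    api_key="sk-...",      # or set the OPENAI_API_KEY environment variable
    temperature=0.7,
    max_tokens=1000,
)
dspy.configure(lm=lm)
```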
DSPy currently wraps `litellm.completion()` and `litellm.text_completion()` to implement this interface; see https://github.com/stanfordnlp/dspy/blob/8bc3439052eb80ba4e5ba340c348a6e3b2c94d7c/dspy/clients/lm.py#L78-L87 and https://github.com/stanfordnlp/dspy/blob/8bc3439052eb80ba4e5ba340c348a6e3b2c94d7c/dspy/clients/lm.py#L166-L216. Currently, these interfaces don't support specifying a retry policy.

We've attempted to work around this by constructing a `Router` internally, but `Router` construction requires us to fetch the API key and base and pass them to a `model_list` (due to OpenAI / Azure OpenAI client initialization; see `litellm/litellm/router.py` lines 3999 to 4001 at commit 45ff74a).
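For illustration, the workaround looks roughly like this (a sketch based on LiteLLM's Router docs; the model and parameter values are assumptions):

```python
import os

from litellm import Router
from litellm.router import RetryPolicy

# Wrapping a single model in a Router just to get retry policies forces us to
# plumb the API key/base into model_list ourselves.
router = Router(
    model_list=[
        {
            "model_name": "gpt-4o-mini",
            "litellm_params": {
                "model": "openai/gpt-4o-mini",
                "api_key": os.environ["OPENAI_API_KEY"],
            },
        }
    ],
    retry_policy=RetryPolicy(RateLimitErrorRetries=3, BadRequestErrorRetries=0),
)

response = router.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
```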