
[Feature]: support tracking remaining tpm/rpm for gemini models #6914

Open
krrishdholakia opened this issue Nov 26, 2024 · 2 comments · May be fixed by #6913
Labels
enhancement New feature or request mlops user request

Comments

krrishdholakia commented Nov 26, 2024

The Feature

Support tracking remaining tpm/rpm for gemini models

  • allows prometheus metrics to capture this information

User config.yaml

model_list:
  - model_name: gemini/*
    litellm_params:
      model: gemini/*

Motivation, pitch

Allows users to see how often they're hitting their Gemini rate limits.


@krrishdholakia (Contributor, Author)

  • Add the router cache to the litellm logging object.
  • In litellm logging, on each successful user request, update the tpm/rpm cache key containing the model id + model name.
  • On router success, get_model_info on the router should handle wildcard routes: merge any litellm model cost map info into the user-set config information, and return a model info object containing the Gemini model-specific tpm/rpm limits.
  • get_tokens_and_requests(model_id: str, model: str, router: Router) -> returns the current tpm/rpm values for a given model_id + model name.
  • get_remaining_tokens_and_requests(model_id: str, model: str, router: Router) -> returns the difference between the configured limit and the current value.
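The two helpers described above could be sketched roughly as below. Note this is a hypothetical illustration, not LiteLLM's actual implementation: the cache key format (`"tpm:<model_id>:<model>"`), the `FakeRouter` stand-in, and the shape of the model-info dict are all assumptions for the sake of a runnable example.

```python
# Hypothetical sketch of get_tokens_and_requests / get_remaining_tokens_and_requests.
# The cache layout, key format, and Router interface are assumptions, not LiteLLM's real API.
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class FakeRouter:
    """Minimal stand-in for the router: a usage cache plus per-model limits."""
    # usage counters keyed by "<usage_type>:<model_id>:<model_name>"
    cache: Dict[str, int] = field(default_factory=dict)
    # model-specific tpm/rpm limits, e.g. sourced from the model cost map
    limits: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def get_model_info(self, model: str) -> Dict[str, int]:
        return self.limits.get(model, {"tpm": 0, "rpm": 0})


def get_tokens_and_requests(model_id: str, model: str, router: FakeRouter) -> Tuple[int, int]:
    """Return the current (tpm, rpm) usage for a given model_id + model name."""
    tpm = router.cache.get(f"tpm:{model_id}:{model}", 0)
    rpm = router.cache.get(f"rpm:{model_id}:{model}", 0)
    return tpm, rpm


def get_remaining_tokens_and_requests(model_id: str, model: str, router: FakeRouter) -> Tuple[int, int]:
    """Return the difference between the configured limits and current usage."""
    info = router.get_model_info(model)
    tpm_used, rpm_used = get_tokens_and_requests(model_id, model, router)
    return info["tpm"] - tpm_used, info["rpm"] - rpm_used
```

With a router whose cache shows 400 tokens and 3 requests used against limits of 1000 tpm / 10 rpm, `get_remaining_tokens_and_requests` would return `(600, 7)`; those remaining values are what the Prometheus metrics would capture.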

krrishdholakia commented Nov 26, 2024

Todo:

  • get_model_info on the router should handle wildcard routes
  • add tpm/rpm limits for all Gemini models to the model cost map
  • update the router's deployment_callback_on_success and deployment_callback_on_failure to track tpm/rpm usage
  • implement get_tokens_and_requests (router.get_model_group_usage() already exists)
  • implement get_remaining_tokens_and_requests
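The first todo item, wildcard route handling in get_model_info, could look something like the sketch below. The pattern-matching approach (fnmatch against `gemini/*`-style routes) and the `MODEL_COST_MAP` slice are assumptions for illustration; the names are hypothetical, not LiteLLM's real API.

```python
# Hypothetical sketch: resolving a concrete model against wildcard deployments,
# then merging in model-cost-map tpm/rpm limits. Names and limits are illustrative.
from fnmatch import fnmatch

# assumed slice of the model cost map carrying Gemini tpm/rpm limits
MODEL_COST_MAP = {
    "gemini/gemini-1.5-pro": {"tpm": 4_000_000, "rpm": 1000},
    "gemini/gemini-1.5-flash": {"tpm": 4_000_000, "rpm": 2000},
}


def get_model_info(requested_model: str, deployment_patterns: list) -> dict:
    """Match a concrete model name against wildcard deployment routes,
    then attach any model-cost-map limits for that model."""
    for pattern in deployment_patterns:
        if fnmatch(requested_model, pattern):
            info = {"model": requested_model, "matched_route": pattern}
            info.update(MODEL_COST_MAP.get(requested_model, {}))
            return info
    raise ValueError(f"no deployment matches {requested_model}")
```

For example, `get_model_info("gemini/gemini-1.5-pro", ["gemini/*"])` matches the `gemini/*` route and returns an info dict carrying that model's tpm/rpm limits, even though the user's config only listed the wildcard.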

krrishdholakia added a commit that referenced this issue Nov 26, 2024
…ll gemini models

Allows for ratelimit tracking for gemini models even with wildcard routing enabled

Addresses #6914