
[Feature]: support tracking remaining tpm/rpm for gemini models #6914

Open
krrishdholakia opened this issue Nov 26, 2024 · 2 comments · May be fixed by #6913
Labels
enhancement New feature or request mlops user request

Comments

krrishdholakia commented Nov 26, 2024

The Feature

Support tracking remaining tpm/rpm for gemini models

  • allows prometheus metrics to capture this information

User config.yaml

model_list:
  - model_name: gemini/*
    litellm_params:
      model: gemini/*

Motivation, pitch

Allows users to see how often they're hitting their Gemini rate limits.


@krrishdholakia (Contributor, Author)

  • Add the router cache to the litellm logging object.
  • In litellm logging, on each successful user request, update the tpm/rpm cache key containing the model id + model name.
  • On router success, get_model_info on the router should handle wildcard routes: merge any litellm model cost map info into the user-set config information, and return a model info object containing the Gemini model-specific tpm/rpm limits.
  • get_tokens_and_requests(model_id: str, model: str, router: Router) -> returns the current tpm/rpm values for a given model_id + model name.
  • get_remaining_tokens_and_requests(model_id: str, model: str, router: Router) -> returns the difference between the configured limit and the current value.
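The two helpers described above could be sketched roughly as below. Note this is a hypothetical illustration, not LiteLLM's actual implementation: the cache key format (`"tpm:<model_id>:<model>"`), the `FakeRouter` stand-in, and the shape of the model-info dict are all assumptions for the sake of a runnable example.

```python
# Hypothetical sketch of get_tokens_and_requests / get_remaining_tokens_and_requests.
# The cache layout, key format, and Router interface are assumptions, not LiteLLM's real API.
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class FakeRouter:
    """Minimal stand-in for the router: a usage cache plus per-model limits."""
    # usage counters keyed by "<usage_type>:<model_id>:<model_name>"
    cache: Dict[str, int] = field(default_factory=dict)
    # model-specific tpm/rpm limits, e.g. sourced from the model cost map
    limits: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def get_model_info(self, model: str) -> Dict[str, int]:
        return self.limits.get(model, {"tpm": 0, "rpm": 0})


def get_tokens_and_requests(model_id: str, model: str, router: FakeRouter) -> Tuple[int, int]:
    """Return the current (tpm, rpm) usage for a given model_id + model name."""
    tpm = router.cache.get(f"tpm:{model_id}:{model}", 0)
    rpm = router.cache.get(f"rpm:{model_id}:{model}", 0)
    return tpm, rpm


def get_remaining_tokens_and_requests(model_id: str, model: str, router: FakeRouter) -> Tuple[int, int]:
    """Return the difference between the configured limits and current usage."""
    info = router.get_model_info(model)
    tpm_used, rpm_used = get_tokens_and_requests(model_id, model, router)
    return info["tpm"] - tpm_used, info["rpm"] - rpm_used
```

With a router whose cache shows 400 tokens and 3 requests used against limits of 1000 tpm / 10 rpm, `get_remaining_tokens_and_requests` would return `(600, 7)`; those remaining values are what the Prometheus metrics would capture.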

krrishdholakia commented Nov 26, 2024

Todo:

  • get_model_info on the router should handle wildcard routes
  • add tpm/rpm limits for all Gemini models to the model cost map
  • update the router's deployment_callback_on_success and deployment_callback_on_failure to track tpm/rpm usage
  • implement get_tokens_and_requests (router.get_model_group_usage() already exists)
  • implement get_remaining_tokens_and_requests
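The first todo item, wildcard route handling in get_model_info, could look something like the sketch below. The pattern-matching approach (fnmatch against `gemini/*`-style routes) and the `MODEL_COST_MAP` slice are assumptions for illustration; the names are hypothetical, not LiteLLM's real API.

```python
# Hypothetical sketch: resolving a concrete model against wildcard deployments,
# then merging in model-cost-map tpm/rpm limits. Names and limits are illustrative.
from fnmatch import fnmatch

# assumed slice of the model cost map carrying Gemini tpm/rpm limits
MODEL_COST_MAP = {
    "gemini/gemini-1.5-pro": {"tpm": 4_000_000, "rpm": 1000},
    "gemini/gemini-1.5-flash": {"tpm": 4_000_000, "rpm": 2000},
}


def get_model_info(requested_model: str, deployment_patterns: list) -> dict:
    """Match a concrete model name against wildcard deployment routes,
    then attach any model-cost-map limits for that model."""
    for pattern in deployment_patterns:
        if fnmatch(requested_model, pattern):
            info = {"model": requested_model, "matched_route": pattern}
            info.update(MODEL_COST_MAP.get(requested_model, {}))
            return info
    raise ValueError(f"no deployment matches {requested_model}")
```

For example, `get_model_info("gemini/gemini-1.5-pro", ["gemini/*"])` matches the `gemini/*` route and returns an info dict carrying that model's tpm/rpm limits, even though the user's config only listed the wildcard.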

krrishdholakia added a commit that referenced this issue Nov 26, 2024
…ll gemini models

Allows for ratelimit tracking for gemini models even with wildcard routing enabled

Addresses #6914