
Feature: Configure RPM for specific models #764

Closed
usmanovbf opened this issue Jul 1, 2024 · 4 comments
Labels
question Further information is requested

Comments

@usmanovbf

Issue

Hi! First of all, thank you for such a unique tool.
I wonder, is it possible to set a requests-per-minute (RPM) limit? For example, the free tier of Gemini allows 2 RPM (https://ai.google.dev/pricing), so I am currently getting this error:

Unexpected error: litellm.RateLimitError: litellm.RateLimitError: VertexAIException - b'[{\n  "error": {\n    "code": 429,\n    "message": "Resource
has been exhausted (e.g. check quota).",\n    "status": "RESOURCE_EXHAUSTED"\n  }\n}\n]'

Since aider uses litellm, it would be great to pass litellm settings from a YAML file and use the rpm attribute, for example like this: https://litellm.vercel.app/docs/proxy/reliability#step-1---set-deployments-on-config
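For illustration, a rough sketch of what that rpm limit could look like via litellm's Python Router, based on the linked docs (aider uses the litellm Python library rather than the proxy, so this is a hypothetical configuration, and the "gemini-pro" alias is a placeholder):

from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gemini-pro",  # placeholder alias
            "litellm_params": {
                "model": "gemini/gemini-1.5-pro-latest",
                "rpm": 2,  # free-tier Gemini quota
            },
        }
    ]
)

response = router.completion(
    model="gemini-pro",
    messages=[{"role": "user", "content": "hello"}],
)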

Please give this some attention, since it would enable more fine-grained tuning.

Thank you!

Version and model info

Aider v0.40.6
Model: gemini/gemini-1.5-pro-latest with diff-fenced edit format
Git repo: .git with 6 files
Repo-map: using 1024 tokens

@paul-gauthier
Owner

Thanks for trying aider and filing this issue.

Aider should have retried that error a bunch of times before finally giving up?

Aider doesn't use the litellm proxy, just the python library. And I don't know what the proxy would do if the client exceeds the rate limit? Probably just return a rate limit error just like google is?

@paul-gauthier paul-gauthier added the question Further information is requested label Jul 1, 2024
@usmanovbf
Author

usmanovbf commented Jul 2, 2024

Aider should have retried that error a bunch of times before finally giving up?

Unfortunately, it gets stuck with the error

Unexpected error: litellm.RateLimitError: litellm.RateLimitError: VertexAIException - b'{\n  "error": {\n    "code": 429,\n    "message": "Resource has been exhausted (e.g. check quota).",\n    "status":
"RESOURCE_EXHAUSTED"\n  }\n}\n'

Or sometimes with an error like

Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.10/site-packages/litellm/llms/vertex_httpx.py", line 1122, in completion
    response.raise_for_status()
  File "/opt/homebrew/lib/python3.10/site-packages/httpx/_models.py", line 761, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-latest:generateContent?key=*MASKED*'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.10/site-packages/litellm/main.py", line 1942, in completion
    response = vertex_chat_completion.completion(  # type: ignore
  File "/opt/homebrew/lib/python3.10/site-packages/litellm/llms/vertex_httpx.py", line 1125, in completion
    raise VertexAIError(status_code=error_code, message=response.text)
litellm.llms.vertex_httpx.VertexAIError: {
  "error": {
    "code": 500,
    "message": "An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting",
    "status": "INTERNAL"
  }
}


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/bin/aider", line 8, in <module>
    sys.exit(main())
  File "/opt/homebrew/lib/python3.10/site-packages/aider/main.py", line 539, in main
    coder.run()
  File "/opt/homebrew/lib/python3.10/site-packages/aider/coders/base_coder.py", line 612, in run
    list(self.send_new_user_message(new_user_message))
  File "/opt/homebrew/lib/python3.10/site-packages/aider/coders/base_coder.py", line 917, in send_new_user_message
    saved_message = self.auto_commit(edited)
  File "/opt/homebrew/lib/python3.10/site-packages/aider/coders/base_coder.py", line 1440, in auto_commit
    res = self.repo.commit(fnames=edited, context=context, aider_edits=True)
  File "/opt/homebrew/lib/python3.10/site-packages/aider/repo.py", line 87, in commit
    commit_message = self.get_commit_message(diffs, context)
  File "/opt/homebrew/lib/python3.10/site-packages/aider/repo.py", line 163, in get_commit_message
    commit_message = simple_send_with_retries(model.name, messages)
  File "/opt/homebrew/lib/python3.10/site-packages/aider/sendchat.py", line 81, in simple_send_with_retries
    _hash, response = send_with_retries(
  File "/opt/homebrew/lib/python3.10/site-packages/backoff/_sync.py", line 105, in retry
    ret = target(*args, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/aider/sendchat.py", line 71, in send_with_retries
    res = litellm.completion(**kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/litellm/utils.py", line 959, in wrapper
    raise e
  File "/opt/homebrew/lib/python3.10/site-packages/litellm/utils.py", line 843, in wrapper
    result = original_function(*args, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/litellm/main.py", line 2607, in completion
    raise exception_type(
  File "/opt/homebrew/lib/python3.10/site-packages/litellm/utils.py", line 7586, in exception_type
    raise e
  File "/opt/homebrew/lib/python3.10/site-packages/litellm/utils.py", line 6665, in exception_type
    raise litellm.InternalServerError(
litellm.exceptions.InternalServerError: litellm.InternalServerError: VertexAIException InternalServerError - {
  "error": {
    "code": 500,
    "message": "An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting",
    "status": "INTERNAL"
  }
}

and I need to press Ctrl+C twice and re-run aider.

Perhaps moving to the litellm proxy, or just adding a retry mechanism that waits a certain amount of time, would help.

And I don't know what the proxy would do if the client exceeds the rate limit?

It should just wait for some time before making the request again, based on the RPM quota for the specific model, or use exponential backoff.

Probably just return a rate limit error just like google is?

It forces me to re-run aider manually again and again, which is not workable. It would be better to have auto-retry based on the allowed RPM for the model, or just on exponential backoff; see the sketch below.
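To illustrate the RPM-based waiting idea, here is a minimal client-side sketch (a hypothetical helper for illustration, not part of aider or litellm) that spaces requests to respect a quota:

import time

class RpmLimiter:
    """Hypothetical helper: space out requests to respect an RPM quota."""

    def __init__(self, rpm):
        self.min_interval = 60.0 / rpm  # rpm=2 -> at most one request per 30s
        self.last_request = 0.0

    def wait(self):
        # Sleep just long enough that requests never exceed the quota.
        elapsed = time.monotonic() - self.last_request
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_request = time.monotonic()

Calling limiter.wait() before each litellm.completion() call would keep a 2 RPM model under its quota without any server-side coordination.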

@paul-gauthier
Owner

Aider does retry litellm.RateLimitError. If all the retries fail, only then does it report the error to the user.
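The traceback above shows the mechanism: aider/sendchat.py wraps the litellm call with the backoff library's retry decorator. A minimal sketch of that pattern, with illustrative parameters rather than aider's actual settings:

import backoff
import litellm

@backoff.on_exception(
    backoff.expo,            # exponential backoff: waits grow 1s, 2s, 4s, ...
    litellm.RateLimitError,  # retry only on rate-limit errors
    max_time=60,             # illustrative cap, not aider's actual setting
)
def send_with_retries(**kwargs):
    return litellm.completion(**kwargs)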

@paul-gauthier
Owner

I'm going to close this issue for now, but feel free to add a comment here and I will re-open or file a new issue any time.
