More explicit error message for error 429 QUOTA_EXCEEDED/TOO_MANY_REQUESTS #372

Closed
gregory1g opened this issue Jan 9, 2025 · 10 comments · Fixed by #390
Labels
correction correction in documentation Spring25

Comments

@gregory1g

Problem description
https://github.com/camaraproject/Commonalities/blob/main/documentation/API-design-guidelines.md#61-standardized-use-of-camara-error-responses introduces two options for error 429 - QUOTA_EXCEEDED and TOO_MANY_REQUESTS:

429 QUOTA_EXCEEDED Either out of resource quota or reaching rate limiting.
429 TOO_MANY_REQUESTS Either out of resource quota or reaching rate limiting.

The problem is that the two different error codes use an identical error message. As a result, the human-readable error message does not explicitly identify the issue, even though the server "knows" which of the two reasons caused the error.

Expected behavior
Explicit error messages, for example:

429 QUOTA_EXCEEDED Out of resource quota.
429 TOO_MANY_REQUESTS Rate limits are reached.
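
As a minimal sketch of what the two explicit responses could look like as response bodies, assuming the `status` / `code` / `message` error shape used in CAMARA error responses (the helper name `error_body` is hypothetical and only for illustration):

```python
# Sketch only: builds the two error bodies proposed above.
# The field names status/code/message are assumed from the CAMARA error
# response convention; error_body is a hypothetical helper.
def error_body(status: int, code: str, message: str) -> dict:
    """Return an error payload with the three assumed fields."""
    return {"status": status, "code": code, "message": message}


QUOTA_EXCEEDED = error_body(429, "QUOTA_EXCEEDED", "Out of resource quota.")
TOO_MANY_REQUESTS = error_body(429, "TOO_MANY_REQUESTS", "Rate limits are reached.")
```

With distinct messages, a developer reading the response can tell the two cases apart without looking up the `code` value.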
@Kevsy
Collaborator

Kevsy commented Jan 16, 2025

Although the status code name 'Too Many Requests' seems to cover both scenarios, RFC 6585 defines 429 Too Many Requests exclusively for rate limiting:

The 429 status code indicates that the user has sent too many
requests in a given amount of time ("rate limiting").

This implies that the API Consumer can retry once the given amount of time has expired, and the RFC supports a Retry-After response header to indicate this.
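
For illustration, a client could turn a `Retry-After` value into a wait time roughly as follows. This is a sketch, not part of any CAMARA specification: the function name `retry_after_seconds` is hypothetical, and it assumes the header carries either a whole number of seconds or an HTTP-date, the two forms the header allows.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: str, now: Optional[datetime] = None) -> float:
    """Return how many seconds to wait for a given Retry-After header value."""
    value = header_value.strip()
    if value.isdigit():
        # Delay form, e.g. "Retry-After: 120"
        return float(value)
    # Date form, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT"
    retry_at = parsedate_to_datetime(value)
    if retry_at.tzinfo is None:
        # Treat a date without zone information as UTC (assumption).
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the client may retry immediately.
    return max(0.0, (retry_at - now).total_seconds())
```

The optional `now` parameter exists only to make the date form testable with a fixed clock.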

Quota Exceeded/Out of resource quota is a distinct scenario, in which a "requests-per-time" metric is not a factor. In this case the API Consumer has a quota (e.g. a bundle of 1000 requests), they have used it all up, and a process will need to be followed to 'top-up' that quota. I believe that can be covered by 403 Forbidden as defined in RFC 9110:

403 Forbidden
The 403 (Forbidden) status code indicates that the server understood the request but refuses to fulfill it. A server that wishes to make public why the request has been forbidden can describe that reason in the response content (if any).

@rartych
Collaborator

rartych commented Jan 20, 2025

@Kevsy Thank you

According to my search, the most commonly used HTTP status code for quota exceeded is 429 Too Many Requests.
In some specific cases, particularly those involving storage or resource quotas, the 403 Forbidden status code might also be used.

So we have 2 options:

Option 1:
429 QUOTA_EXCEEDED Out of resource quota
429 TOO_MANY_REQUESTS Rate limits are reached

Option 2:
403 QUOTA_EXCEEDED Out of resource quota
429 TOO_MANY_REQUESTS Rate limits are reached

@sachinvodafone
Collaborator

If our specification prioritizes strict RFC compliance, then adopting 403 for quota exhaustion and reserving 429 solely for rate limiting would be a good choice. However, I believe many cloud companies currently use 429 for both scenarios, as described here

@Kevsy
Collaborator

Kevsy commented Jan 21, 2025

If our specification prioritizes strict RFC compliance, then adopting 403 for quota exhaustion and reserving 429 solely for rate limiting would be a good choice. However, I believe many cloud companies currently use 429 for both scenarios, as described here

Thanks @sachin - the AWS case above is two facets of rate limiting only (steady-state and burst), but we have a third case to account for: quota exhaustion, e.g. an API Consumer bought a bundle of 10000 requests, so the 10001st request will not be fulfilled for business reasons (rather than network/compute constraints).

Hence, to account for those discrete cases, the suggestion is to use the following codes (as above):
Exceeded requests per second = 429 Too Many Requests. Any detail can be provided in the message, e.g. whether steady state or burst was exceeded.
Quota exceeded = 403 Forbidden
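
A rough sketch of how a server might separate the two cases under this proposal. All names here (`Usage`, its counters, `classify`) are hypothetical, and checking the quota before the rate limit is an assumption of this example rather than something the proposal prescribes:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Usage:
    """Hypothetical per-consumer counters kept by the server."""
    requests_this_second: int
    rate_limit_per_second: int
    requests_used_total: int
    bundle_quota: int


def classify(usage: Usage) -> Optional[Tuple[int, str]]:
    """Return (HTTP status, error code) if the request must be refused, else None."""
    if usage.requests_used_total >= usage.bundle_quota:
        # Bundle used up: a business limit, and waiting will not help.
        return 403, "QUOTA_EXCEEDED"
    if usage.requests_this_second >= usage.rate_limit_per_second:
        # Too many requests in the current window: retry later.
        return 429, "TOO_MANY_REQUESTS"
    return None
```

For example, a consumer who has already used all 10000 requests of a 10000-request bundle gets `(403, "QUOTA_EXCEEDED")` on the next call, regardless of how slowly they are sending requests.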

@sachinvodafone
Collaborator

Hi @Kevsy, In the given documentation, the term 'Quota' is mentioned in the initial few lines, which led me to interpret that it is implicitly referencing 'quota' in the context of 429. Additionally, we have a few more references related to GCP that support this interpretation.

It states that while 403 could be used, the recommended approach is to use 429.

https://cloud.google.com/armor/docs/rate-limiting-overview#:~:text=exceed_action%3A%20When%20a,Requests)%20response%20code.

Another reference, which says 429 is applicable to the 'Pay as you go' model, similar to 'quota exhaustion':
https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429

In the following example, the issue being discussed relates to quota exhaustion:
https://www.googlecloudcommunity.com/gc/AI-ML/Title-Error-429-Exceeding-Quota-Limit-When-Creating-a-Model/m-p/647191

However, as mentioned earlier, it is up to us to decide whether to use 403 or 429, as both are applicable in this context.

@PedroDiez
Collaborator

Hi all,

We are OK with the main point, which is to separate the scope of QUOTA_EXCEEDED and TOO_MANY_REQUESTS so that the descriptions of the error codes are adapted as mentioned above.

Regarding the "status" value for QUOTA_EXCEEDED (403 or 429), both work for us.

From an RFC point of view, 403 seems more accurate, as per Kevin's comment. And as per Sachin's comments, it seems some providers deal with this scenario using 429, and some API definitions already have 429.

What we think is relevant is to take the decision within Spring25 and move forward with it, so that we do not change our view later and generate new breaking changes in the future.

@Kevsy
Collaborator

Kevsy commented Jan 21, 2025

Hi @sachinvodafone,

Hi @Kevsy, In the given documentation, the term 'Quota' is mentioned in the initial few lines, which led me to interpret that it is implicitly referencing 'quota' in the context of 429. Additionally, we have a few more references related to GCP that support this interpretation.

It states that while 403 could be used, the recommended approach is to use 429.

https://cloud.google.com/armor/docs/rate-limiting-overview#:~:text=exceed_action%3A%20When%20a,Requests)%20response%20code.

^ That's talking specifically about rate-limiting, not an absolute limit : it states "When a request exceeds the rate_limit_threshold_count"

Another reference, which says 429 is applicable to the 'Pay as you go' model, similar to 'quota exhaustion': https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429

^ That states "If the number of your requests exceeds the capacity allocated to process requests,", in other words, the server cannot fulfil the request because it doesn't have the resources. That error could feasibly occur within the quota limit, and should probably have been a 5xx error. Whereas I mean the distinct error condition of attempting to make another request when you have already used up your bundle quota of 10000 requests.

In the following example, the issue being discussed relates to quota exhaustion: https://www.googlecloudcommunity.com/gc/AI-ML/Title-Error-429-Exceeding-Quota-Limit-When-Creating-a-Model/m-p/647191

^ Given the very clear RFC definition of 429 as being exclusively for rate limiting, I have no idea why it would be returned in that scenario.

However, as mentioned earlier, it is up to us to decide whether to use 403 or 429, as both are applicable in this context.

429 is defined as exclusively applicable for rate-limiting, but we need something else for the scenario when an API Consumer simply uses up their bundle quota - and I think 403 (with appropriate message) is the best fit.

@rartych
Collaborator

rartych commented Jan 22, 2025

It looks like we can find examples of both error codes (403 and 429) being used for the Quota Exceeded case.
One can say that 403 deals with permissions, not business logic; another can say that 429 is for rate limiting only. So neither code is a perfect fit.

Therefore I propose a pragmatic approach: since 429 QUOTA_EXCEEDED is used in multiple CAMARA API definitions, let's keep it and improve the message example and description.
See: Error 429 analysis
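
To illustrate what keeping 429 for both cases means for an API Consumer, here is a minimal client-side sketch. It assumes the response body carries a `code` field with the values discussed in this thread; the function name `should_retry` is hypothetical, and treating QUOTA_EXCEEDED as not retryable is an assumption based on the "top-up" scenario described earlier:

```python
def should_retry(status: int, body: dict) -> bool:
    """Decide whether waiting and retrying can succeed for this error response."""
    if status != 429:
        return False
    # Both cases share status 429, so only the code tells them apart.
    # Rate limiting clears with time; an exhausted quota (assumed) does not.
    return body.get("code") == "TOO_MANY_REQUESTS"
```

Under this approach the status code alone no longer says whether a retry makes sense, so clients need to inspect `code` (and developers need the clearer message).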

P.S.
OpenAI API, which is trendy among developers now, uses 429 for these cases: rate limit, quota (credits/budget) and server overload - so this approach (429) shouldn't be confusing for them.

@Kevsy
Collaborator

Kevsy commented Jan 22, 2025

OpenAI API, which is trendy among developers now, uses 429 for these cases: rate limit, quota (credits/budget) and server overload - so this approach (429) shouldn't be confusing for them.

Fair point - and indeed it is good to know what is the de facto standard (or at least what is trendy 😎)

But note that Vonage uses 403 for this scenario...


@rartych
Collaborator

rartych commented Jan 22, 2025

Following the pragmatic approach, I have prepared PR #390. Please review it, as we need to close RC.1 asap.
