AI RLA: Support for multiple limits and window sizes #8530

Open
wants to merge 3 commits into
base: main
13 changes: 13 additions & 0 deletions app/_hub/kong-inc/ai-rate-limiting-advanced/how-to/_index.md
@@ -32,13 +32,26 @@ curl -i -X POST \

Protect your LLM service with rate limiting. The plugin analyzes query costs and response token counts to provide an enterprise-grade rate limiting strategy.

{% if_version lte:3.9.x %}
```sh
curl -i -X POST http://localhost:8001/services/example-service/plugins \
--data 'name=ai-rate-limiting-advanced' \
--data 'config.llm_providers[1].name=openai' \
--data 'config.llm_providers[1].limit=100' \
--data 'config.llm_providers[1].window_size=3600'
```
{% endif_version %}
{% if_version gte:3.10.x %}
```sh
curl -i -X POST http://localhost:8001/services/example-service/plugins \
--data 'name=ai-rate-limiting-advanced' \
--data 'config.llm_providers[1].name=openai' \
--data 'config.llm_providers[1].limit[]=100' \
--data 'config.llm_providers[1].limit[]=10000' \
--data 'config.llm_providers[1].window_size[]=60' \
--data 'config.llm_providers[1].window_size[]=3600'
```
{% endif_version %}
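
In the 3.10 form, each `limit[]` entry appears to pair by position with the `window_size[]` entry at the same index (here, 100 per 60 seconds and 10000 per 3600 seconds). The sketch below is one way to generate the `--data` flags while catching a mismatched number of entries before calling the Admin API. The helper name `rla_data_flags` is hypothetical, and the requirement that both lists have the same length is an assumption based on how the example above is written, not something this page confirms.

```sh
# Hypothetical helper: print the --data flags for a set of limits and windows.
# Assumption: the i-th limit applies to the i-th window size, so both lists
# must contain the same number of entries.
rla_data_flags() {
  limits=$1    # space-separated limits, e.g. "100 10000"
  windows=$2   # space-separated window sizes in seconds, e.g. "60 3600"

  # Count the entries in each list using the positional parameters.
  set -- $limits;  n_limits=$#
  set -- $windows; n_windows=$#

  if [ "$n_limits" -ne "$n_windows" ]; then
    echo "error: $n_limits limits but $n_windows window sizes" >&2
    return 1
  fi

  # Emit limits first, then window sizes, matching the order used above.
  for limit in $limits; do
    printf '%s\n' "--data config.llm_providers[1].limit[]=$limit"
  done
  for window in $windows; do
    printf '%s\n' "--data config.llm_providers[1].window_size[]=$window"
  done
}

rla_data_flags "100 10000" "60 3600"
```

The printed flags can be pasted into the `curl` command above; the function only formats and checks its arguments and does not contact Kong.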

The AI Rate Limiting Advanced plugin supports three rate limiting strategies. The default strategy estimates query cost by counting the total tokens returned in the LLM responses.
