
Conversation

@ikreymer (Member) commented Oct 28, 2025

  • set 'max_concur_queue_to_limit_scale' to determine the maximum number of concurrent crawls that can be queued
  • if set, queueing new crawls above this limit is rejected with a 429 until there are fewer concurrent crawls waiting in the queue (see the sketch below this list)
  • defaults to disabled
  • tests: add max queue limit to concurrent crawl tests
  • fixes [Task]: Add a way to limit concurrent crawler queue #2938
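
A minimal sketch of the check described above, assuming a FastAPI-style HTTPException; the function name and error detail string are illustrative, not the actual Browsertrix code:

```python
# Hedged sketch: reject queueing another crawl once the number of crawls
# already waiting reaches the org's concurrent crawl limit multiplied by
# the configured scale. A scale of 0 leaves the check disabled (default).
from fastapi import HTTPException

def check_can_queue_crawl(
    queued_count: int, max_concur_crawls: int, queue_to_limit_scale: int
) -> None:
    if not queue_to_limit_scale or not max_concur_crawls:
        # feature disabled (default) or no concurrent crawl limit on the org
        return
    if queued_count >= max_concur_crawls * queue_to_limit_scale:
        # 429 Too Many Requests until queued crawls drop below the cap
        raise HTTPException(status_code=429, detail="too_many_queued_crawls")
```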

Testing:

  • Set a low concurrent crawl limit on an org, e.g. 1
  • Set max_concur_queue_to_limit_scale: 1
  • Attempt to start another crawl after 1 is already waiting; the request should be rejected with an error / 429 (a scripted version of these steps is sketched below)
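
A hypothetical script for the steps above; the base URL, auth token, and crawl-start endpoint path are placeholders, not the real Browsertrix API:

```python
# Hypothetical manual-test script; all URLs and credentials are placeholders.
import requests

API = "https://btrix.example.com/api"        # placeholder deployment URL
HEADERS = {"Authorization": "Bearer TOKEN"}  # placeholder access token
# placeholder crawl-start endpoint; the real path may differ
RUN_URL = f"{API}/orgs/ORG_ID/crawlconfigs/CONFIG_ID/run"

# With the org's concurrent crawl limit set to 1 and
# max_concur_queue_to_limit_scale: 1, the expectation is:
#   1st start runs, 2nd waits in the queue, 3rd is rejected with a 429
for attempt in range(1, 4):
    resp = requests.post(RUN_URL, headers=HEADERS)
    print(f"attempt {attempt}: HTTP {resp.status_code}")
```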

@ikreymer ikreymer requested a review from tw4l October 28, 2025 19:37
@tw4l (Member) left a comment

In general working well! Tested locally with a limit scale multiplier of 1 and 2.

The nightly concurrent crawl limit test doesn't pass yet and should be tweaked a bit; I left a comment in-line.

I wonder if we also want to have a limit for the number of queued crawls even when a concurrent crawl quota isn't set? Not sure if self-deployments that don't use a concurrent crawl limit might still run into resourcing issues from starting too many crawls at the same time.

My only other comment is that I find max_concur_queue_to_limit_scale pretty cryptic. Without reading the comments in values.yaml, I'm not sure I'd be able to piece together what it means. Something like crawl_queue_limit_scale might be a bit easier to read?

@ikreymer (Member, Author)

> I wonder if we also want to have a limit for the number of queued crawls even when a concurrent crawl quota isn't set? Not sure if self-deployments that don't use a concurrent crawl limit might still run into resourcing issues from starting too many crawls at the same time.

PR #2945 adds a separate optimization which should make the concurrent crawl check more efficient in general, even if there is no limit. I think that should avoid the main resourcing issue, and it maybe makes this PR less important, but it's still a good option to have.

@ikreymer ikreymer requested a review from tw4l October 30, 2025 02:34
