Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spurious Sidekiq::OverLimit exceptions #3037

Open
97jaz opened this issue Jan 28, 2025 · 1 comment
Open

Spurious Sidekiq::OverLimit exceptions #3037

97jaz opened this issue Jan 28, 2025 · 1 comment
Labels
bug community To tag external issues and PRs submitted by the community

Comments

@97jaz
Copy link

97jaz commented Jan 28, 2025

Description

This is really the same issue that I raised years back with Appsignal: appsignal/appsignal-ruby#526

The concurrent rate limiters in sidekiq-enterprise raise Sidekiq::OverLimit when a lock cannot be acquired within a certain amount of time. (See https://github.com/sidekiq/sidekiq/wiki/Ent-Rate-Limiting#concurrent for details.) Sidekq's own middleware rescues this exception and re-enqueues the job to be run later -- unless the job has exceeded its maximum number of retries, in which case the OverLimit exception is allowed to escape the middleware.

When Sidekiq re-enqueues the job in response to an OverLimit, this isn't an error, and we'd prefer it not show up as one in newrelic. On the other hand, if the exception escapes Sidekiq's middleware (i.e., the number of retries has been exceeded), we'd like to see it.

As with appsignal, the behavior would seem to be a function of the order of sidekiq's middleware vis-a-vis newrelic's. We could reorder the middleware chain on our end, but it would be a good idea for newrelic to ensure that its middleware sits earlier in the chain than sidekiq's own.

Expected Behavior

When Sidekiq::OverLimit is raised by a rate limiter, unless Sidekiq's own middleware re-raises the exception, it should not show up in newrelic as an error.

Steps to Reproduce

To reproduce, you need to create contention for a sidekiq concurrent rate limiter. You could create a concurrent Sidekiq::Limiter with concurrency 1 (i.e., a mutex) and a short wait_timeout. Have two processes/threads run concurrently, trying to acquire the lock and, with it, sleep for longer than the timeout. The one that loses the race to acquire the lock should raise Sidekiq::OverLimit.

Your Environment

newrelic_rpm 9.16.1
ruby 3.3.6
rails 8.0.1
sidekiq-ent 7.3.4

For Maintainers Only or Hero Triaging this bug

Suggested Priority (P1,P2,P3,P4,P5):
Suggested T-Shirt size (S, M, L, XL, Unknown):

@97jaz 97jaz added the bug label Jan 28, 2025
@workato-integration
Copy link

@github-actions github-actions bot added the community To tag external issues and PRs submitted by the community label Jan 28, 2025
@chynesNR chynesNR moved this to In Sprint in Ruby Engineering Board Jan 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug community To tag external issues and PRs submitted by the community
Projects
Status: In Sprint
Development

No branches or pull requests

1 participant