
High CPU consumption #2

Open
khoan opened this issue Feb 14, 2014 · 9 comments

Comments

@khoan

khoan commented Feb 14, 2014

When the queue is backed up with nothing but throttled jobs, CPU spikes to 99%. My guess is the fetcher keeps popping throttled jobs off the queue, only to push them straight back on because the limit has been reached.
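
Conceptually, I think the busy loop looks something like this (an illustrative sketch only, not the gem's actual code; `rate_limit_exceeded?` and `process` are made-up names):

```ruby
# Illustrative sketch -- not sidekiq-rate-limiter's actual source.
# Shows why a fetch/check/requeue cycle burns CPU when only throttled
# jobs remain on the queue.
loop do
  job = redis.rpop("queue:default")      # pop the next job
  next sleep(1) if job.nil?              # empty queue: the fetcher can sleep

  if rate_limit_exceeded?(job)           # hypothetical limit check
    redis.lpush("queue:default", job)    # limit hit: push it straight back...
    # ...and immediately loop around to pop it again -- a tight busy loop
  else
    process(job)
  end
end
```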

We can't simply sleep until the next rate-limit cycle begins, since other, non-throttled jobs could be queued and would need to be processed.

Any thoughts on how to prevent the CPU spike described above?

@khoan
Author

khoan commented Feb 14, 2014

Interestingly, once the queue is empty, the CPU doesn't spike. So Sidekiq must implement some sort of sleep algorithm when the queue is empty.

@bwthomas
Contributor

bwthomas commented Apr 1, 2014

First: sorry for the interminably slow response.

Second: It's a good question. It's definitely entirely on sidekiq-rate-limiter, as it does pop things off the queue, check the limit, & then put them back.

I'll give it some thought. There may be a better way, though I think your observation that sleeping won't work is spot on since the utility is in the limiting being configurable per job / worker.

@sentience
Contributor

I’ve thought about this a little, and I wonder if adopting an element of how https://github.com/gevans/sidekiq-throttler works could solve this. Rather than rate-limiting by pushing the job back onto the “live” queue (which Sidekiq pulls from as quickly as it can), we could look at pushing rate-limited jobs back as scheduled jobs (which Sidekiq only checks every 15 seconds). These would show up in the Delayed count on the Sidekiq web UI.
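
Roughly what I have in mind inside the custom fetcher (a minimal sketch, assuming Sidekiq's scheduled jobs live in the `schedule` sorted set scored by run-at time; `reschedule_throttled` and `retry_in` are invented names):

```ruby
# Sketch: instead of LPUSHing a throttled job back onto its live queue,
# park it in the scheduled set so Sidekiq's poller re-enqueues it later.
require 'sidekiq'

def reschedule_throttled(raw_job_json, retry_in)
  Sidekiq.redis do |conn|
    # 'schedule' is the sorted set the scheduled-job poller drains;
    # the score is the epoch time at which the job should run.
    conn.zadd('schedule', Time.now.to_f + retry_in, raw_job_json)
  end
end
```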

Thoughts?

@bwthomas
Contributor

bwthomas commented Jun 3, 2014

So, this gem actually came out of my desire to avoid that solution :) What happens there is that jobs keep getting pushed into the scheduled set, delayed by whatever your period is. So if you use a short period (say, 1 minute), it schedules all the jobs past the limit for 1 minute later. You wind up with the same high CPU usage; the only difference is that your stats wind up in the millions for any sufficiently high amount of traffic.

I think the best solution would be to queue them to a separate queue that's transparent to any clients, and then use some utility algorithm to move them onto the actual queue so that there's always a little more work available than you want.
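
Something along these lines, as a rough sketch (the limiter:deferred:* key and the Drainer class are invented for illustration; they aren't part of this gem):

```ruby
# Sketch of the "separate transparent queue" idea: throttled jobs are parked
# on a hidden list, and a background drainer tops up the real queue at the
# allowed rate instead of the fetcher busy-looping over them.
require 'sidekiq'

class Drainer
  def initialize(queue:, rate_per_minute:)
    @queue = queue
    @rate  = rate_per_minute
  end

  def run
    loop do
      Sidekiq.redis do |conn|
        @rate.times do
          # Atomically move one parked job back onto the live queue.
          break unless conn.rpoplpush("limiter:deferred:#{@queue}", "queue:#{@queue}")
        end
      end
      sleep 60 # wake once per rate window rather than spinning
    end
  end
end
```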

Perhaps it would be advantageous to support multiple throttling strategies which could be configured?

@sentience
Contributor

As for the inflated statistics, I believe we’d avoid that issue because we’d be putting the jobs into the scheduled set in our custom fetcher, rather than doing it in a Sidekiq middleware when the jobs are run (which is what sidekiq-throttler does). Sidekiq would not be counting throttled jobs as done, because they would never “officially” be fetched and run.

With a large number of rate-limited jobs the CPU usage would still be high, yes, but it would come in bursts, one each time the jobs came off the scheduled job queue (i.e. once per rate interval). Even with a short interval (e.g. 1 minute), that would mean a burst of CPU usage every minute rather than continuous usage.

The only downside I see to this approach is that Sidekiq polls the scheduled jobs queue only once every 15 seconds, so rate intervals on the order of a minute or less would suffer from some significant latency (up to 15 seconds) while Sidekiq gets around to pulling them off the scheduled job queue. For me, 15 seconds give or take is not significant, but it might be in some applications.

Am I missing anything else?

@heaven

heaven commented Jul 16, 2016

@sentience you are missing sidekiq-throttler.

@bwthomas I believe the whole concept this gem is using is wrong. I'd like the ability to rate-limit queues rather than jobs. Once the limit is reached, we simply need to stop monitoring that queue until the lock expires.

There's a @queues array on which Sidekiq performs a blocking RPOP. Once a job has finished, Sidekiq calls BRPOP to fetch a new job from one of these queues; the call blocks and holds the connection until a job is available, which lets Sidekiq avoid polling.
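
For reference, the fetch behaviour I'm describing looks roughly like this (a simplified sketch, not Sidekiq's actual source):

```ruby
# Blocks for up to 2 seconds across every listed queue and returns
# [queue_key, job_json] as soon as any of them has work.
Sidekiq.redis do |conn|
  queue_key, job_json = conn.brpop(*@queues, timeout: 2)
end
```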

I was thinking we could track the number of jobs performed for a given queue over the given period of time (the limit). When the limit for a queue has been reached, we temporarily remove it from the @queues array and store it separately in Redis, e.g. under queues:exhausted:#{queue_name}, with an expiration time to release the lock. We'd also start a polling thread that checks for this queue's lock to expire and returns the queue to the @queues array. This thread could be started just once at startup, if there's at least one throttled queue.

Additionally, we'd have to restart the fetcher that's already waiting for new work via BRPOP after removing queues from, or re-adding them to, the @queues array.
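
A rough sketch of that scheme (a minimal illustration; the QueuePauser class, the key names, and the way @queues is shared are assumptions rather than existing gem code, and thread safety is ignored for brevity):

```ruby
# Sketch: when a queue hits its limit, take it out of the BRPOP rotation and
# let a single watcher thread put it back once its Redis lock key expires.
require 'sidekiq'

class QueuePauser
  LOCK_PREFIX = "queues:exhausted:"   # illustrative key prefix

  def initialize(queues)
    @queues = queues   # the array the fetcher BRPOPs from
    @paused = []
  end

  def pause(queue, period)
    Sidekiq.redis { |c| c.set("#{LOCK_PREFIX}#{queue}", 1, ex: period) }
    @queues.delete("queue:#{queue}")  # fetcher stops BRPOPing this queue
    @paused << queue
  end

  def start_watcher
    Thread.new do
      loop do
        @paused.reject! do |queue|
          next false if Sidekiq.redis { |c| c.get("#{LOCK_PREFIX}#{queue}") }
          @queues << "queue:#{queue}" # lock expired: resume fetching
          true
        end
        sleep 1                       # light polling, nothing like a busy loop
      end
    end
  end
end
```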

UPD: I was hoping there was a blocking command that releases only once a key expires (a BRPOP-style wait on key expiry), but it seems there's no such operation: redis/redis#2072. It would be really helpful in this case, allowing us to put our exhausted queue under queues:exhausted: and run a blocking wait on it so we don't have to poll. A side effect would be that we'd have to start as many waiting threads as we have throttled queues, due to the blocking nature of such an operation. I'm not a Redis expert; the author of that issue said he found a way to achieve the same result using existing features.

@bwthomas
Contributor

It may not be ideal @heaven, but don't forget that my goal is to rate limit because of constraints imposed by the vendor's API. How we get there makes no difference to me.

I still think the best solution is to offer multiple pluggable throttling strategies. Pull requests welcome :)

@heaven

heaven commented Jul 17, 2016

100% CPU usage is a huge difference, and there's no way to stop Sidekiq from popping jobs off exhausted queues unless you explicitly tell it not to.

@bwthomas
Contributor

The same usage was happening with sidekiq-throttler, just through the scheduled queue.

Still, that's not good. So, I reiterate: pull requests welcome.
