High CPU consumption #2
Interestingly, once the queue is empty, the CPU doesn't spike. So Sidekiq is implementing some sort of sleep algorithm when the queue is empty.
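(For context on that observation: Sidekiq's fetcher blocks on Redis `BRPOP` when a queue is empty, so an idle fetcher parks instead of polling. A minimal pure-Ruby sketch of the same idea, using `Thread::Queue` as a stand-in for the Redis list:)

```ruby
# Thread::Queue#pop blocks the calling thread until a job arrives,
# much like Sidekiq's fetcher blocking on Redis BRPOP. While blocked,
# the fetcher consumes essentially no CPU -- which is why an *empty*
# queue doesn't spike, but a queue full of throttled jobs does.
queue = Thread::Queue.new

fetcher = Thread.new do
  job = queue.pop # parked here, not spinning, until work appears
  job.upcase
end

sleep 0.1        # fetcher is asleep during this window
queue << "work"  # one wakeup, not thousands of polls
fetcher.value    # => "WORK"
```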
First: sorry for the interminably slow response. Second: it's a good question. It's definitely entirely on sidekiq-rate-limiter, as it does pop things off the queue, check the limit, and then put them back. I'll give it some thought. There may be a better way, though I think your observation that sleeping won't work is spot on, since the utility is in the limiting being configurable per job/worker.
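The pop/check/requeue behaviour described above is the root of the spin. A minimal pure-Ruby model of it (all names here are illustrative, not the gem's actual API; the real fetcher talks to Redis):

```ruby
# Model of the busy loop: pop a job, check the limit, requeue it.
# When every job in the queue is throttled, the loop runs flat out --
# that's the 100% CPU. The `pops` counter and its guard exist only so
# this sketch terminates; the real loop has no such guard.
class SpinningFetch
  attr_reader :pops

  def initialize(jobs, limiter)
    @queue, @limiter, @pops = jobs.dup, limiter, 0
  end

  def fetch
    loop do
      job = @queue.shift or return nil
      @pops += 1
      return job if @limiter.within_limit?(job)
      @queue.push(job)             # over the limit: put it straight back...
      return nil if @pops >= 1_000 # ...and (absent this guard) spin forever
    end
  end
end

class AlwaysThrottled
  def within_limit?(_job)
    false
  end
end

fetcher = SpinningFetch.new(%w[a b c], AlwaysThrottled.new)
fetcher.fetch # cycles the same three jobs until the guard trips
fetcher.pops  # => 1000
```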
I’ve thought about this a little, and I wonder if adopting an element of how https://github.com/gevans/sidekiq-throttler works could solve this. Rather than rate-limiting by pushing the job back onto the “live” queue (which Sidekiq pulls from as quickly as it can), we could look at pushing rate-limited queues back as scheduled jobs (which Sidekiq only checks every 15 seconds), which would show up in the Delayed count on the Sidekiq web UI. Thoughts?
So, this gem came out of my desire to avoid that solution, actually :) What happens there is that jobs keep getting pushed into the scheduled set, but only for whatever your period is. So if you want to use a short period (say, 1 minute), it schedules all your jobs past the limit for 1 minute later. You wind up with the same high CPU usage; the only difference is that your stats wind up in the millions for any sufficiently high amount of traffic. I think the best solution would be to queue them to a separate queue that's transparent to any clients, and then, using some utility algorithm, move them onto the actual queue so that there's always a little more work than you want. Perhaps it would be advantageous to support multiple throttling strategies which could be configured?
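The "separate transparent queue plus feeder" idea above could be sketched like this (a pure-Ruby sketch under my own naming; the real thing would live in Redis and hook into the fetcher):

```ruby
# Hypothetical sketch: clients always enqueue to a shadow queue they
# can't distinguish from the real one; a feeder moves jobs to the live
# queue only while the current rate-limit window has headroom. Workers
# therefore never see -- and never spin on -- jobs they can't run yet.
class ShadowQueueFeeder
  attr_reader :live

  def initialize(limit_per_period)
    @limit, @shadow, @live, @dispatched = limit_per_period, [], [], 0
  end

  def push(job)
    @shadow << job # clients only ever touch the shadow queue
  end

  def reset_period!
    @dispatched = 0 # called once per rate-limit window
  end

  # Top up the live queue, but only up to the remaining headroom.
  def feed!
    while @dispatched < @limit && (job = @shadow.shift)
      @live << job
      @dispatched += 1
    end
  end
end

feeder = ShadowQueueFeeder.new(2) # at most 2 jobs per window
5.times { |i| feeder.push("job-#{i}") }
feeder.feed!
feeder.live.size # => 2 (the rest wait, costing no fetcher cycles)
```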
For the inflated statistics: I believe that because we’d be putting the jobs into the scheduled queue in our custom fetcher, rather than doing it in Sidekiq middleware when the jobs are run (which is what sidekiq-throttler does), we’d avoid this issue. Sidekiq would not count throttled jobs as done, because they would never “officially” be fetched and run. With a large number of rate-limited jobs, CPU usage would still be high, yes, but it would occur only in bursts, each time jobs came off the scheduled job queue (i.e. once per rate interval). Even with a short interval (e.g. 1 minute), that would mean just a burst of CPU usage every minute, rather than continuous usage. The only downside I see to this approach is that Sidekiq polls the scheduled jobs queue only once every 15 seconds, so rate intervals on the order of a minute or less would suffer some significant latency (up to 15 seconds) while Sidekiq gets around to pulling jobs off the scheduled job queue. For me, 15 seconds give or take is not significant, but it might be in some applications. Am I missing anything else?
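(Sidekiq actually stores scheduled jobs in a Redis sorted set scored by their run-at timestamp, and a poller periodically moves the due ones back to the live queues. A pure-Ruby model of that mechanism, with illustrative names, shows why CPU cost becomes one burst per poll rather than a continuous spin:)

```ruby
# Model of the scheduled-set approach discussed above: throttled jobs
# are parked with a future run-at time, and each poll pops only the
# jobs that are due. Between polls, nothing spins.
class ScheduledSet
  def initialize
    @entries = [] # [run_at, job] pairs, kept sorted by run_at
  end

  def schedule(job, run_at)
    @entries << [run_at, job]
    @entries.sort_by!(&:first)
  end

  # One poll: remove and return every job whose run_at has passed.
  def due_jobs(now)
    due = @entries.take_while { |run_at, _| run_at <= now }
    @entries.shift(due.size)
    due.map { |_, job| job }
  end
end

set = ScheduledSet.new
set.schedule("job-a", 10) # throttled: retry at t=10
set.schedule("job-b", 70) # throttled: retry at t=70
set.due_jobs(15)          # => ["job-a"]  one burst of work
set.due_jobs(15)          # => []         nothing to spin on until t=70
```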
@sentience you are missing sidekiq-throttler. @bwthomas I believe the whole concept this gem uses is wrong. I'd like to have the ability to rate limit queues rather than jobs: once the limit is reached, we simply need to stop monitoring the queue until the lock expires. I was thinking we could track the number of jobs performed for a given queue per period of time (the limit). When the limit for a queue has been reached, we need to temporarily remove it from the … Additionally, we'll have to restart the fetcher that is already waiting for new work with BRPOP, after removing/re-adding queues from/to … UPD: I was hoping there's a blocking …
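The per-queue idea above might look something like this (all names are mine, not an existing API, and this sketch ignores the harder problem the comment raises of interrupting a fetcher already blocked in BRPOP):

```ruby
# Hypothetical sketch of rate limiting whole queues: count jobs per
# queue per window, and when a queue hits its limit, drop it from the
# list the fetcher passes to BRPOP until the window expires. The
# fetcher then never even looks at an exhausted queue.
class QueueLimiter
  def initialize(limits) # e.g. { "api" => 2 } jobs per window
    @limits, @counts, @paused_until = limits, Hash.new(0), {}
  end

  def record!(queue, now, window: 60)
    @counts[queue] += 1
    if @limits[queue] && @counts[queue] >= @limits[queue]
      @paused_until[queue] = now + window # stop polling this queue
      @counts[queue] = 0                  # fresh count for the next window
    end
  end

  # The queue list the fetcher should pass to BRPOP right now.
  def pollable(queues, now)
    queues.reject { |q| (@paused_until[q] || 0) > now }
  end
end
```

A paused queue simply disappears from the poll list, so there is nothing for the fetcher to pop and requeue; other queues keep flowing, which addresses the "can't just sleep" objection.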
It may not be ideal, @heaven, but don't forget that my goal is to rate limit because of constraints imposed by the vendor's API. How we get there makes no difference to me. I still think the best solution is to offer multiple pluggable throttling strategies. Pull requests welcome :)
100% CPU usage is a huge difference, and there's no way to keep Sidekiq from popping jobs off exhausted queues unless you explicitly tell it not to.
The same usage was happening with sidekiq-throttler, just through the scheduled queue. Still, that's not good. So, I reiterate: pull requests welcome. |
When the queue is backed up with no jobs other than throttled ones, CPU spikes to 99%. My guess is that the fetcher keeps popping throttled jobs off the queue, only to push them back on since the limit is reached.
We can't simply sleep until the next rate-limit cycle begins, since other, non-throttled jobs can be queued and processed.
Any thoughts on how to prevent the CPU spike described?