Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another way. #3

Open
seuros opened this issue May 22, 2014 · 8 comments
Open

Another way. #3

seuros opened this issue May 22, 2014 · 8 comments

Comments

@seuros
Copy link
Member

seuros commented May 22, 2014

We have

class UserSyncer
  include Sidekiq::ManagerWorker

  sidekiq_delegate_task_to :user_task_worker # or UserTaskWorker
  sidekiq_manager_options :batch_size => 500,
                          :identifier_key => :user_token,
                          :additional_keys => [:status]
end

UserSyncer.perform_query_async(User.active, :batch_size => 300)

In this example, we are not performing the query async. We have to find another way.
In fact, we are not even using Sidekiq.

@yelled3
Copy link
Contributor

yelled3 commented May 22, 2014

I see you're point.

the _async part is celleloid.
so all we need to do is rename the method name to perform_query and then call it like u said:
UserSyncer.perform_query_async(User.active, :batch_size => 300).

I beleive that's the way it's done with regular Worker class.

@seuros
Copy link
Member Author

seuros commented May 22, 2014

No, that not my point.
if you do

UserSyncer.perform_query_async(User.active, :batch_size => 300)

User.active.find_in_batch(batch_size:300) {|user| UserTaskWorker.perform_async(user.id) }

UserSyncer is litterally useless, since it run everything on the foreground

@yelled3
Copy link
Contributor

yelled3 commented May 23, 2014

right, I see what you're saying.
User.active returns an activerecord relation, I wonder how performant it is to serialise it...
we can aways serialise 'User.active' as a string, and then eval it later on.
any ideas?

@seuros
Copy link
Member Author

seuros commented May 23, 2014

I was thinking about User.active.pluck(:id) but we should avoid hitting the database

@yelled3
Copy link
Contributor

yelled3 commented May 23, 2014

no no... that doesn't solves the issue + I want the select(:id) to be part of the class logic.

@yelled3
Copy link
Contributor

yelled3 commented May 23, 2014

User.active can a be HUGE set of records. that's why I use find_in_batches.

@seuros
Copy link
Member Author

seuros commented May 23, 2014

Plucking return an array.
Select return active record relation.

So plucking is much faster than select(:id), but still both are bad.

I tested with User.active.to_sql then stripped the first part and put it back in a where(''), but i call it @seuros's bullshit programming.

But so far it the fastest approach.

@yelled3
Copy link
Contributor

yelled3 commented May 23, 2014

@seuros i'm starting to think ManagerWorker shouldn't be running async at all.
think about it;

if you want to run UserSyncer.perform_query_async(User.active, :batch_size => 300)
from say, a controller (or somewhere else, where you don't want to block IO)
at the time you enqueue the job to the time the actual worker runs the query -
User.active can change and be very different.

more over, the way I currently use ManagerWorkers is calling thing from whenever:

every 1.day do
   runner 'UserSyncer.perform_query_async(User.active, :batch_size => 300)'
end

so it's not an issue.
maybe we should guide developers to use this approach or wrap in another runner worker:

class UserManagerRunnerWorker
  include Sidekiq::Worker
  def perform
     UserSyncer.perform_query_async(User.active, :batch_size => 300)
  end
end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants