Organizing different jobs including concurrency #258

Open
a4xrbj1 opened this issue Oct 3, 2017 · 2 comments

a4xrbj1 commented Oct 3, 2017

Hi Vaughn,

I have the following requirement, hope you have an answer for this question:

A) We do have several different services (meaning different web services, which we call via their API) and for each service we have several jobs.

B) On top of that, we need to run several jobs of the same service concurrently (with, let's say, a maximum of 5 concurrent jobs). But these concurrent jobs should only be executed in parallel if they are for different users.

For requirement A), would you use a different job queue for each service, or how would you organize things so that jobs for the same service are all lined up (queued)? So something like jobQueueServiceA, jobQueueServiceB etc.?

For requirement B), we can set the concurrency to 5 (jobs), but how can we at the same time make sure that two jobs for the same user aren't executed in parallel? Would it be best to chain them, meaning that when we create a job we check whether there is already a job for that userId and, if so, use the depends field to wait for the last job to finish?
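
For illustration, the chaining idea can be sketched roughly as below. This is only a minimal in-memory stand-in, not job-collection code: `UserChainPlanner`, `plan` and `finished` are hypothetical names, and in a real setup the "last job for this user" would be looked up in the job collection itself rather than in a local Map. The id returned by `plan` is the job that the new job would be made to depend on before it is saved.

```javascript
// Hypothetical helper, not part of job-collection: remembers the most recently
// queued job per user so that a new job can be chained behind it.
class UserChainPlanner {
  constructor() {
    this.lastJobByUser = new Map(); // userId -> id of the last queued job
  }

  // Returns the id of the job the new one should wait for (or null if the
  // user has nothing queued), then records the new job as the chain's tail.
  plan(userId, newJobId) {
    const antecedent = this.lastJobByUser.has(userId)
      ? this.lastJobByUser.get(userId)
      : null;
    this.lastJobByUser.set(userId, newJobId);
    return antecedent;
  }

  // Call when a job finishes so that a completed tail is not reused
  // as a dependency for the user's next job.
  finished(userId, jobId) {
    if (this.lastJobByUser.get(userId) === jobId) {
      this.lastJobByUser.delete(userId);
    }
  }
}
```

With this, two jobs for the same user end up in a chain, while jobs for different users have no dependency on each other and can still run in parallel up to the concurrency limit.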

As always, thanks in advance!

@vsivsi vsivsi added the question label Oct 4, 2017
Owner

vsivsi commented Oct 4, 2017

Hi, in general this is a synchronization problem. That is, you have multiple concurrent processes potentially competing for an exclusive resource (in this case, the "right" to access an API on behalf of a user).

In general, job-collection (and job schedulers as a class) do not supply synchronization primitives. This is because distributed synchronization is a hard problem, and every application has its own constraints.

But for simple cases, you can fake it by only running one queue and setting concurrency to 1. Then you know that only one job at a time is running.

Your constraint "B" above adds a wrinkle, but it can perhaps also be "faked" by setting the payload parameter of your job queue to 5, and then writing a small amount of worker code that checks for duplicate users among the 1-5 jobs it receives. If it detects duplicate user requests, it can either fail every job after the first for that user, or serialize that user's jobs in the worker code so that they don't conflict.
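
A minimal sketch of that worker-side check, in plain JavaScript and independent of job-collection itself. The assumptions here are that each job carries its user in `job.data.userId` (a hypothetical, application-defined field) and that `handle` is a hypothetical async function that makes the API call for one job. The function names are made up for illustration.

```javascript
// Assumption: each job looks like { data: { userId: ... } }.
// The userId field is application-defined, not something the library provides.

// Group a batch of jobs by user, preserving arrival order within each user.
function groupByUser(jobs) {
  const groups = new Map();
  for (const job of jobs) {
    const userId = job.data.userId;
    if (!groups.has(userId)) groups.set(userId, []);
    groups.get(userId).push(job);
  }
  return groups;
}

// Option 1: keep the first job per user and report the rest as duplicates,
// which the worker could then fail so that they are retried later.
function splitDuplicates(jobs) {
  const first = [];
  const duplicates = [];
  for (const group of groupByUser(jobs).values()) {
    first.push(group[0]);
    duplicates.push(...group.slice(1));
  }
  return { first, duplicates };
}

// Option 2: run different users in parallel, but run each user's jobs
// one after another so that they never overlap.
async function runSerializedPerUser(jobs, handle) {
  const chains = [...groupByUser(jobs).values()].map(async (group) => {
    for (const job of group) {
      await handle(job);
    }
  });
  await Promise.all(chains);
}
```

If I read the job-collection docs correctly, a payload greater than 1 hands the worker an array of jobs, which is what `jobs` stands for here. Note that this only prevents conflicts within a single batch, so it relies on there being one worker with concurrency 1 pulling the batches.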

I don't know enough about your requirements to make a hard recommendation here, but something like the above may help you avoid needing something more elaborate, such as atomic updates to a distributed database for synchronization. If you do end up needing something like that, I have a locks package for MongoDB that works well (and is used heavily by my other Meteor package, file-collection): https://github.com/vsivsi/gridfs-locks
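
To make the per-user lock idea concrete, here is a tiny in-process stand-in. It is not the gridfs-locks API and it is not distributed: `UserLocks`, `tryAcquire` and `release` are hypothetical names, and in a multi-process setup the check-and-set inside `tryAcquire` would have to be a single atomic update in the shared database.

```javascript
// Hypothetical in-memory lock table: at most one worker may hold the lock
// for a given user at any time.
class UserLocks {
  constructor() {
    this.holders = new Map(); // userId -> id of the worker holding the lock
  }

  // Returns true if the lock was free and is now held by workerId.
  tryAcquire(userId, workerId) {
    if (this.holders.has(userId)) return false;
    this.holders.set(userId, workerId);
    return true;
  }

  // Only the current holder may release; returns whether the release happened.
  release(userId, workerId) {
    if (this.holders.get(userId) !== workerId) return false;
    this.holders.delete(userId);
    return true;
  }
}
```

A worker would call `tryAcquire` before touching the API on behalf of a user, skip or postpone the job if it returns false, and call `release` once the job has finished.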

Owner

vsivsi commented Oct 4, 2017

Corrected "cargo" to be "payload" in the above.


2 participants