Scheduler
Nuntium has a cron-like scheduler service that enqueues recurring tasks in the job queue, to be processed later by the Nuntium worker service as processing capacity becomes available.
The scheduler service is a ruby script that runs every X seconds (ideally 30) and checks for tasks pending execution. Data regarding these tasks and their execution intervals and times are stored in a `cron_tasks` table.
The scheduler requires a visible endpoint in the queue service so that it can enqueue task instances whenever necessary.
Most AT channels are candidates to be implemented as scheduled tasks. Interfaces that require active polling by Nuntium may be scheduled this way as well.
Some examples include:
- Twitter incoming messages
- POP3 mail download
- QST push/pull
The table containing the tasks data must have the following fields:
- guid: unique identifier for the task
- interval: ideal timespan between two consecutive executions
- next run (`nextRun`): when the next run should be enqueued
- last run (`lastRun`): when the last run started executing
There must also be a way for the task creator to identify the task. For example, when a task is created for a POP3 channel, the task must be deleted if the channel is deleted.
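As a rough illustration of the fields listed above, the following sketch models a row of `cron_tasks` in memory. The `owner_type` and `owner_id` fields and the `delete_tasks_for` helper are hypothetical names chosen here to show one way a task creator could identify its tasks; the actual schema may differ.

```ruby
# Hypothetical in-memory stand-in for a row of the cron_tasks table.
# owner_type / owner_id are assumed names for the link back to the task creator.
CronTask = Struct.new(:guid, :interval, :next_run, :last_run,
                      :owner_type, :owner_id, keyword_init: true)

# Returns the tasks that remain after removing every task owned by the
# given creator, e.g. when a POP3 channel is deleted.
def delete_tasks_for(tasks, owner_type, owner_id)
  tasks.reject { |t| t.owner_type == owner_type && t.owner_id == owner_id }
end

tasks = [
  CronTask.new(guid: "pop3-1", interval: 60, next_run: Time.at(30),
               last_run: nil, owner_type: "Channel", owner_id: 1),
  CronTask.new(guid: "twitter-2", interval: 300, next_run: Time.at(30),
               last_run: nil, owner_type: "Channel", owner_id: 2)
]

remaining = delete_tasks_for(tasks, "Channel", 1)
puts remaining.map(&:guid).inspect
```

With a relational table, the same idea would likely be a pair of columns (or a foreign key) on `cron_tasks` plus a delete triggered when the owner is removed.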
Every time the scheduler runs, it checks the table for all tasks whose next run time is less than or equal to the current time. For each of these tasks, it enqueues a job for the Nuntium worker and sets `nextRun = currentTime + interval`. Note that every task is wrapped in a cron task that reports back to the cron service when it executes.
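A single scheduler pass as described above can be sketched as follows. This is a minimal sketch, not the actual Nuntium code: `scheduler_tick` is a hypothetical name, the task is a plain `Struct`, and the queue is any object that responds to `<<` (the real queue client is not shown).

```ruby
# Hypothetical in-memory stand-in for a row of the cron_tasks table.
CronTask = Struct.new(:guid, :interval, :next_run, :last_run, keyword_init: true)

# One scheduler pass: enqueue a job for every due task and move its
# next_run forward by one interval, counted from the current time.
def scheduler_tick(tasks, queue, now)
  due = tasks.select { |task| task.next_run <= now }
  due.each do |task|
    queue << { guid: task.guid, enqueued_at: now }
    task.next_run = now + task.interval
  end
  due
end

task = CronTask.new(guid: "A", interval: 60, next_run: Time.at(30), last_run: nil)
queue = []
scheduler_tick([task], queue, Time.at(30))
puts queue.size
puts task.next_run == Time.at(90)
```

Because `next_run` is computed from the current time rather than from the previous `next_run`, a scheduler that wakes up late does not try to catch up on missed runs.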
Whenever a task is about to execute, it must check its corresponding `lastRun` entry in the cron table. If `currentTime < lastRun + interval - tolerance`, the task is dropped and the event is logged. Otherwise the task records the current time as its `lastRun` and proceeds.
This ensures that two consecutive executions of a task never start less than `interval - tolerance` apart.
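The check performed by the wrapping cron task could look roughly like the sketch below. The names `run_cron_task`, `tolerance:` and `log:` are assumptions made for illustration, and the logger is reduced to a callable so the example stays self-contained.

```ruby
# Hypothetical in-memory stand-in for a row of the cron_tasks table.
CronTask = Struct.new(:guid, :interval, :next_run, :last_run, keyword_init: true)

# Wrapper around the real work: drops the run if the previous run started
# less than (interval - tolerance) seconds ago, otherwise records the start
# time as last_run and yields to the actual task body.
def run_cron_task(task, now, tolerance:, log: ->(msg) { warn(msg) })
  if task.last_run && now < task.last_run + task.interval - tolerance
    log.call("Dropping task #{task.guid}: previous run started at #{task.last_run}")
    return :dropped
  end
  task.last_run = now # set when the task starts, not when it finishes
  yield task
  :executed
end

task = CronTask.new(guid: "A", interval: 60, next_run: Time.at(150), last_run: Time.at(95))
result = run_cron_task(task, Time.at(123), tolerance: 5) { |t| puts "running #{t.guid}" }
puts result
```

In the usage above, only 28 seconds have passed since the previous start, so the run is dropped and the body never executes. The tolerance of 5 seconds is an arbitrary value picked for the example.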
All tasks should be timeboxed to `interval / 2` to ensure that runs do not overlap (recall that the last run time is set when the task starts, not when it finishes). This also prevents starvation in the job queue. Every task should check frequently whether its time quota has been exceeded and, if it has, terminate cleanly so that its next instance can resume from that point.
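One way a task body could honor its quota is to check a deadline between units of work and return the position to resume from. The sketch below assumes the work can be split into independent items; `process_with_quota` and the injectable `clock:` parameter are hypothetical names used only for illustration.

```ruby
# Default clock: monotonic seconds, unaffected by wall-clock adjustments.
MONOTONIC_CLOCK = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }

# Processes items starting at start_index until the quota (interval / 2)
# is used up, then stops before the next item. Returns the index of the
# first unprocessed item so the next instance can resume from there.
def process_with_quota(items, start_index, interval, clock: MONOTONIC_CLOCK)
  deadline = clock.call + interval / 2.0
  index = start_index
  while index < items.size
    break if clock.call >= deadline # quota exceeded: stop cleanly
    yield items[index]
    index += 1
  end
  index
end

messages = %w[m1 m2 m3]
resume_at = process_with_quota(messages, 0, 60) { |m| puts "processing #{m}" }
puts resume_at
```

The returned index would need to be persisted somewhere (for example alongside the channel state) for the next instance to pick it up; where that state lives is not specified here.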
Standard execution flow for task A with a 1 minute interval and the cron running every 30 seconds. This is the expected behaviour of the system: there is little congestion in the queue, so the time between enqueue and run is short. Note that A2 executes at 01:32 because the 57 seconds elapsed since A1 started (00:35) are at least `interval - tolerance`.
MM:SS | Event | nextRun | lastRun |
---|---|---|---|
00:00 | Initial state | 00:30 | - |
00:30 | Cron enqueues A1 | 01:30 | - |
00:35 | A1 begins execution | 01:30 | 00:35 |
00:45 | A1 ends execution | 01:30 | 00:35 |
01:00 | Cron executes | 01:30 | 00:35 |
01:30 | Cron enqueues A2 | 02:30 | 00:35 |
01:32 | A2 begins execution | 02:30 | 01:32 |
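In this example the queue is rather full, with a delay of over a minute between enqueue and execution. Note that the delay is similar for both tasks, so the difference between their start times (`02:37 - 01:35`, 62 seconds) stays close to the interval; no tasks are dropped and A2 executes.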
MM:SS | Event | nextRun | lastRun |
---|---|---|---|
00:00 | Initial state | 00:30 | - |
00:30 | Cron enqueues A1 | 01:30 | - |
01:00 | Cron executes | 01:30 | - |
01:30 | Cron enqueues A2 | 02:30 | - |
01:35 | A1 begins execution | 02:30 | 01:35 |
01:38 | A1 ends execution | 02:30 | 01:35 |
02:00 | Cron executes | 02:30 | 01:35 |
02:30 | Cron enqueues A3 | 03:30 | 01:35 |
02:37 | A2 begins execution | 03:30 | 02:37 |
Now suppose A1 waits a long time in the queue before it starts executing, while A2 does not. In this scenario A2 is dropped, since only 28 seconds (`02:03 - 01:35`) have elapsed since A1 started, which is less than `interval - tolerance`. A3 is executed later.
MM:SS | Event | nextRun | lastRun |
---|---|---|---|
00:00 | Initial state | 00:30 | - |
00:30 | Cron enqueues A1 | 01:30 | - |
01:00 | Cron executes | 01:30 | - |
01:30 | Cron enqueues A2 | 02:30 | - |
01:35 | A1 begins execution | 02:30 | 01:35 |
01:38 | A1 ends execution | 02:30 | 01:35 |
02:00 | Cron executes | 02:30 | 01:35 |
02:03 | A2 is dropped | 02:30 | 01:35 |
02:30 | Cron enqueues A3 | 03:30 | 01:35 |
02:56 | A3 executes | 03:30 | 02:56 |
This example is exactly the same as the first one, except that A1 consumes its entire running quota (`interval / 2`, 30 seconds) and must terminate. Note that this does not affect the start time of A2, because the last run time is set when execution begins.
MM:SS | Event | nextRun | lastRun |
---|---|---|---|
00:00 | Initial state | 00:30 | - |
00:30 | Cron enqueues A1 | 01:30 | - |
00:35 | A1 begins execution | 01:30 | 00:35 |
01:00 | Cron executes | 01:30 | 00:35 |
01:05 | A1 forced to end | 01:30 | 00:35 |
01:30 | Cron enqueues A2 | 02:30 | 00:35 |
01:32 | A2 begins execution | 02:30 | 01:32 |