Show when scheduled jobs are scheduled to be run. #60

Open
proby opened this issue Dec 4, 2012 · 10 comments

Comments

@proby
Contributor

proby commented Dec 4, 2012

It'd be super helpful and handy to show when a job is scheduled to be run whenever looking at a scheduled job.

@wr0ngway
Contributor

+1
This seems like a hole in the API. Having it would make it easier to assert in a test case that something got scheduled at a specific time.

@dlecocq
Contributor

dlecocq commented Sep 18, 2013

It is a hole, but one we can fix. As a note to myself:

@StephenOTT

👍

@StephenOTT

@dlecocq I just threw this together real quick and have not tested it yet, but I'm curious about your thoughts on the style and placement.

-- Get all the attributes of this particular job
function QlessRecurringJob:data()
  local job = redis.call(
    'hmget', 'ql:r:' .. self.jid, 'jid', 'klass', 'state', 'queue',
    'priority', 'interval', 'retries', 'count', 'data', 'tags', 'backlog')


  local jobScheduleDate = nil
  if job[3] == "scheduled" then
    jobScheduleDate = redis.call(
      'get', 'ql:q:' .. job[4] .. '-scheduled', self.jid)
  end


  if not job[1] then
    return nil
  end

  return {
    jid          = job[1],
    klass        = job[2],
    state        = job[3],
    queue        = job[4],
    priority     = tonumber(job[5]),
    interval     = tonumber(job[6]),
    retries      = tonumber(job[7]),
    count        = tonumber(job[8]),
    data         = job[9],
    tags         = cjson.decode(job[10]),
    backlog      = tonumber(job[11] or 0),
    scheduledfor = tonumber(jobScheduleDate)
  }
end

OR

just:

.......
  return {
    jid          = job[1],
    klass        = job[2],
    state        = job[3],
    queue        = job[4],
    priority     = tonumber(job[5]),
    interval     = tonumber(job[6]),
    retries      = tonumber(job[7]),
    count        = tonumber(job[8]),
    data         = job[9],
    tags         = cjson.decode(job[10]),
    backlog      = tonumber(job[11] or 0),
    scheduledfor = tonumber(redis.call(
                        'get', 'ql:q:' .. job[4] .. '-scheduled', self.jid ))
  }

Original set of code was: https://github.com/seomoz/qless-core/blob/521adbe59a6649e01f3349297cfa69e3af4d6f6e/recurring.lua#L1-L24

@StephenOTT

nvm, that's clearly going to need some more work now that I look at it again.

@StephenOTT

@dlecocq I have been looking at the data model that is created in Redis for scheduled data.

What was your thinking for placing all scheduled jobs in a single key for a single queue?
As I think about the query needed to return the specific ZSET score for a job ID in the scheduled queue, I keep coming back to performance limits as the number of scheduled jobs grows.

The current configuration makes me think the only way to get the score for a specific value (the jid) in the ZSET is to return all ZSET items and do an in-memory search. This works for small groups. But I'm curious how many "scheduled" jobs you imagined would be stored at one time in a single queue: 100s, 1,000s, 10,000s, 100,000s, 1,000,000s?

I see the reasoning for using a ZSET, with its automatic sorting, rather than a regular key-value pair, so I'm just curious about your thinking on performance.

Thanks!

@StephenOTT

My other thought is you could use the fairly new ZSCAN feature:

redis.zscan("ql:q:testing-scheduled", 0, {match: "7f81bbe64bcd4599b565c95c817cf363"})

This works with the current structure. It would get slower with large counts, since the scan still has to iterate over the set, but it is better than bringing back everything.

What is returned:

7f81bbe64bcd4599b565c95c817cf363
1400328474.6609

@dlecocq
Contributor

dlecocq commented May 8, 2014

Redis sorted sets are fast enough, and conservative benchmarks indicate a possible throughput of about 10-100M put-pops per day on a single Redis instance. Of course, that benchmark was based on a workflow where jobs don't accumulate in huge quantities in a queue. That said, we've also had legitimate instances of 100k-1M jobs in a queue in production relatively comfortably, though that's not our typical use. We thought at length about what the right data structure was for job scheduling, and when balancing job priority, scheduling, etc., a ZSET made the task very easy while still being performant.

As far as determining a jid's score, ZSCORE is O(1) and is already used in queue.lua to determine a job's score.
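
To make the ZSCORE approach concrete, here is a minimal Ruby sketch of a direct score lookup. It assumes the `ql:q:<queue>-scheduled` key layout seen earlier in this thread, and that the score is a Unix timestamp, as the ZSCAN output above suggests. `scheduled_time` and `FakeRedis` are hypothetical names used only for illustration. The stub stands in for a real client that exposes `zscore(key, member)`, so the snippet runs without a Redis server.

```ruby
# Hypothetical in-memory stand-in for a Redis client (illustration only).
# It implements just enough of the sorted-set interface for this sketch.
class FakeRedis
  def initialize
    @zsets = Hash.new { |hash, key| hash[key] = {} }
  end

  # Store a member with its score, like ZADD.
  def zadd(key, score, member)
    @zsets[key][member] = score.to_f
  end

  # Like ZSCORE: return the member's score, or nil if it is absent.
  def zscore(key, member)
    @zsets.fetch(key, {})[member]
  end
end

# Hypothetical helper: look up when a jid is scheduled to run.
# Assumes the score in "ql:q:<queue>-scheduled" is a Unix timestamp.
def scheduled_time(redis, queue, jid)
  score = redis.zscore("ql:q:#{queue}-scheduled", jid)
  score && Time.at(score).utc
end

redis = FakeRedis.new
redis.zadd("ql:q:testing-scheduled", 1400328474.6609,
           "7f81bbe64bcd4599b565c95c817cf363")

when_due = scheduled_time(redis, "testing", "7f81bbe64bcd4599b565c95c817cf363")
puts when_due.inspect
```

A single keyed lookup like this avoids scanning the set, which is why it should stay cheap even when a queue holds many scheduled jobs. A job that is not in the scheduled set simply yields `nil`.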

@StephenOTT

Okay I have created a PR for review. See:
#187
seomoz/qless-core#48

@stephenreay

It seems this hasn't progressed in 5 years, despite needing only minor changes? If I create an updated PR based on this work + discussed changes, can someone get this merged?
