-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reimplementation of previous repair, but without significant performance hit #130
Reimplementation of previous repair, but without significant performance hit #130
Conversation
lib/Minion/Backend/Pg.pm
Outdated
SELECT unnest(parents) AS id FROM minion_jobs WHERE state != 'finished' | ||
) | ||
UNION | ||
SELECT id FROM minion_jobs WHERE state = 'inactive' AND expires <= NOW() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you're deleting expired jobs twice now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an oversight, fixed in recent commit (including failed perltidy check)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure combining the two queries is more efficient here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For debugging alone two queries would seem much better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure combining the two queries is more efficient here?
I would think its always more efficient to combine it into a single query. But I would have to have a chat with my dba.
For debugging alone two queries would seem much better.
I agree here it would be better for debugging
The original cleanup of expired jobs, is fast by it self
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also concerned about the time locks are being kept on the table. Even if the large transaction is theoretically a tiny bit more efficient, multiple smaller deletes would be preferable if they allow for faster updates from other transactions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is reflected in the push from earlier today
Please squash commits. |
…arge amount of jobs
27aefba
to
0ab04c2
Compare
reading the new minion query, it isn't obvious to me, does it handle deeper dependencies |
Oh, it seems I misremembered the previous functionality. I had thought it would preserve the whole chain but the query does not appear to be so cb9182a |
I had the exact same question, so made a little example for my own sake, just to map it out. Job 1 creates 2, 3, 4 and sets itself as parent for those. It also creates job 5 which is to be run when 2, 3 and 4 completes. Something like this:
After running this is the state:
Then how I read the query you'd for SELECT id FROM minion_jobs WHERE state = 'finished' AND finished <= NOW() - INTERVAL '1 second' * ? get EXCEPT SELECT unnest(parents) AS id FROM minion_jobs WHERE state != 'finished' you'd get What I have not verified is what happens when the chain is "deeper". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me this looks good and quite a nice feature to have. I'll leave it to someone else to make a decision to merge though.
👍
This is correct, A would be removed in this case. But this is how the previous query worked. Main focus was to restore previous functionality. |
You would actually need to have something like
In order to preserve the flow, atleast until 5 has finished |
@HEM42 Ah! Good thing you clarified then, as I had misunderstood / misread the logic. Thank you. I think I'll have to consider myself neutral then, at least until I've pondered on the whole "delete parts of the tree" thing here. If |
@HEM42 Another follow up on this, as I couldn't get your explaination to make sense after reading the SQL again. So I basically made this minimal example: create table minion_jobs(id integer, state text, parents integer[]);
insert into minion_jobs(id, state, parents) values
(1, 'finished', null),
(2, 'finished', '{1}'),
(3, 'failed', '{1}'),
(4, 'finished', '{1}'),
(5, 'inactive', '{2,3,4}'); when running a slightly modified query (skipping the timestamp for convenience here) gives this result:
What am I missing? |
The docs for
Which was the reason for #129 in the first place, and we think we have good solution to bring it back. |
Not missing anything, but if you where to not fail job with id 3. You would then see
|
any updates on this ? is there need for more information in order to take a decision |
Summary
Reimplementation of previous repair, but without significant performance hit on large amount of jobs
Motivation
Restore repair functionality removed with the release of v10.27, but where this time it works on large scale installations, see referenced issue for numbers
References
#129