Make GCThrottle configurable #1402
Another interesting aspect in this area is the way UUIDs are selected for GC. In an extreme example you could end up in this sort of scenario.
In the above scenario all 50 GC'd rows would come from the same updater, because the GC query outputs one row per updater and all the UUIDs get added to a big slice one row (i.e. one updater) at a time before the first 50 are taken. We observed that the number of cascade deletes varies a lot per GC run: it may delete 50 UUIDs from a handful of updaters that have very little impact on uo_vuln, or 50 UUIDs from a single updater that has a big impact.

I was therefore wondering whether taking the first element of each row (i.e. one UUID per updater) until the GCThrottle limit is hit would produce a more even number of rows CASCADE-deleted from uo_vuln on each GC run. In the worst case, where the updater in row 1 has more than 50 rows eligible for GC, the change would also mean deleting the oldest update_operation for 50 different updaters rather than the 50 oldest for a single updater, so you are more likely to be deleting the stalest data first. I accept this is likely a marginal gain for a scenario where GC is behind by a considerable amount, but thought I would suggest/air it anyway.
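For illustration, here is a minimal sketch (in Go, since claircore is written in Go) of the round-robin selection described above. The `selectForGC` function, the `rows` input shape, and the oldest-first ordering assumption are all hypothetical, not claircore's actual GC code:

```go
package gc

import "github.com/google/uuid"

// selectForGC picks update_operation UUIDs for deletion by taking one UUID
// per updater row in turn, until the throttle is reached or every row is
// exhausted, instead of draining one updater's list before moving on.
// rows is assumed to be one slice per updater, each ordered oldest-first.
func selectForGC(rows [][]uuid.UUID, throttle int) []uuid.UUID {
	out := make([]uuid.UUID, 0, throttle)
	for len(out) < throttle {
		progressed := false
		for i, ids := range rows {
			if len(ids) == 0 {
				continue
			}
			// Take the oldest remaining operation for this updater.
			out = append(out, ids[0])
			rows[i] = ids[1:]
			progressed = true
			if len(out) == throttle {
				break
			}
		}
		if !progressed {
			break // every updater's row is exhausted
		}
	}
	return out
}
```

With the current approach, if the first updater has 50+ eligible operations the whole batch comes from it; with the sketch above, a batch of 50 would instead span up to 50 different updaters.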
I don't think we should make this tunable as-is, as the next-gen matcher database will have different semantics, which means the knob over in Clair would most likely be immediately deprecated.
We have recently observed a larger than usual number of updates to the update_operations table (we believe this is a legitimate increase), and in our smaller dev environments, where a single instance runs the updaters and GC therefore runs once every 6 hours, we saw 500+ remaining_ops being logged after a GC run. With the current hardcoded limit of 50 this can take a very long time to work through, especially when more updates are coming in: at 50 operations per run and a 6-hour interval, a backlog of 500+ needs at least 10 GC cycles, roughly two and a half days, even if nothing new arrives. This large backlog of GC actions then has a large impact on the uo_vuln table size, which, in the dev environments where we have fewer resources for the DB, starts to have a large impact on performance.
If this were a configurable parameter we could tune it during such periods to help ourselves catch up with GC without having to do it manually or by running more pods (more pods really doesn't work that well, as they often run at the same time).
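For illustration, a minimal sketch of how such a knob might be threaded through configuration. The `Matcher` struct, the `gc_throttle` key, and `EffectiveGCThrottle` are hypothetical names, not Clair's real config schema:

```go
package config

// Matcher is a hypothetical slice of matcher configuration; the gc_throttle
// key does not exist today and is shown only to illustrate the ask.
type Matcher struct {
	// GCThrottle caps how many update_operations are deleted per GC run.
	// Zero (unset) falls back to the current hardcoded default.
	GCThrottle int `yaml:"gc_throttle" json:"gc_throttle"`
}

// DefaultGCThrottle mirrors the value that is currently hardcoded.
const DefaultGCThrottle = 50

// EffectiveGCThrottle returns the configured throttle or the default.
func (m *Matcher) EffectiveGCThrottle() int {
	if m == nil || m.GCThrottle <= 0 {
		return DefaultGCThrottle
	}
	return m.GCThrottle
}
```

Defaulting to the existing value when the field is unset would keep current deployments unchanged while letting environments that fall behind raise the limit temporarily.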