Restart of autoscaler causes Knative to perform four status update calls per active revision #14669
Comments
Unsure if you have any experience with API priority and fairness? If so, can you comment on knative/pkg#2756?
Only to a certain degree. One particularly interesting question, though I am not sure it is possible: could Knative itself request different priorities on its API calls depending on the queue that is being processed? I think when it starts, everything is on the slow queue? But in any case, the best option is always to omit requests, no matter how they are prioritized. That would be my preference here. As far as I understand it, the autoscaler at startup sets the desiredScale to -1 because it has no metrics yet. Just doing nothing instead sounds better and maybe has no side effects at all?
Yeah, I agree. I was just highlighting that turning off client-side limiting could be a workaround, since you have some guards on the server side.
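For context, a minimal sketch of what that workaround could look like when building the Kubernetes client with client-go; the QPS/Burst values are illustrative, not a recommendation, and the effective guard then becomes the server-side API Priority and Fairness configuration:

```go
// Sketch only: relax client-go's client-side rate limiter when constructing
// the client, so server-side API Priority and Fairness remains the guard.
package main

import (
	"log"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	// client-go defaults to QPS=5, Burst=10; raising them reduces client-side
	// throttling (recent client-go versions also allow disabling it with a
	// negative QPS). Values here are purely illustrative.
	cfg.QPS = 50
	cfg.Burst = 100
	if _, err := kubernetes.NewForConfig(cfg); err != nil {
		log.Fatal(err)
	}
}
```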
I looked into this a little bit, and while I couldn't see any obvious side effects, that's far from a guarantee 😄 😟 I wonder if one possible way to move forward (assuming we want to add this) would be to:
Just thinking out loud a bit...
We have been running with this change patched in (and activated) in production for a couple of weeks now and have not observed any issues. Will open a PR.
Opened a PR without any configuration option but with the log statement.
When the autoscaler component is restarted, Knative keeps itself busy for a while.
For every PodAutoscaler that is related to an active revision, I see this happening:
If you have just a few revisions in the system, this does not really matter. If you have 1,000 active revisions, it does. Both the autoscaler and the controller must each perform two Kubernetes API calls per active revision. Assuming a QPS of 50, that is 1,000 * 2 / 50/s = 40 s per component. For that duration each component effectively cannot handle any other Knative-related operation (creation of a new KService, etc.) because it is throttled by the client-side rate limit on API calls.
With a code change like this, I can easily prevent it, but I do not know whether it has negative side effects.
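A hypothetical sketch of the kind of guard being discussed, not the actual Knative change: skip the write when the desired scale is unknown (-1), which is what the autoscaler reports right after a restart before it has collected any metrics. The function and constant names here are invented for illustration:

```go
// Hypothetical guard, not the actual Knative code: do nothing while the
// desired scale is still unknown instead of writing a transient state back
// to the API server.
package main

import "log"

const scaleUnknown = -1

// applyScale decides whether a newly computed desired scale should be
// applied. It returns the scale to use and false when the caller should
// skip the update entirely.
func applyScale(currentScale, desiredScale int32) (int32, bool) {
	if desiredScale == scaleUnknown {
		// No metrics yet (e.g. right after an autoscaler restart): keep the
		// current scale and avoid the status-update round trips.
		log.Printf("desired scale is unknown, keeping current scale %d", currentScale)
		return currentScale, false
	}
	return desiredScale, true
}

func main() {
	if scale, ok := applyScale(1, scaleUnknown); !ok {
		log.Printf("skipping update, staying at %d replicas", scale)
	}
}
```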
In what area(s)?
/area autoscale
Other classifications:
What version of Knative?
All recent versions.
Expected Behavior
A restart of the autoscaler should not cause unnecessary Kubernetes API calls for each active revision.
Actual Behavior
A restart of the autoscaler causes two Kubernetes API calls per active revision in both autoscaler and controller.
Steps to Reproduce the Problem
Have a KSvc with minScale=maxScale=1 in the system. Then restart the autoscaler.