
Support for multiple metrics APIs #4261

Closed
4 tasks
JorTurFer opened this issue Oct 2, 2023 · 16 comments
Labels
  • lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
  • sig/autoscaling: Categorizes an issue or PR as relevant to SIG Autoscaling.

Comments

@JorTurFer

JorTurFer commented Oct 2, 2023

Enhancement Description

  • One-line enhancement description (can be used as a release note):
    • Add support for multiple metrics APIs, allowing multiple services to expose metrics of the same metric type (see the sketch after this list)
  • Kubernetes Enhancement Proposal: KEP-4261: Add support for multiple metrics APIs #4262
  • Discussion Link: This hasn't been discussed yet; the limitation has existed for a while and another issue was opened some time ago
  • Primary contact (assignee): @JorTurFer
  • Responsible SIGs: SIG-Autoscaling
  • Enhancement target (which target equals to which milestone):
    • Alpha release target (x.y):
    • Beta release target (x.y):
    • Stable release target (x.y):
  • Alpha
    • KEP (k/enhancements) update PR(s):
    • Code (k/k) update PR(s):
    • Docs (k/website) update PR(s):
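
For context on the limitation this KEP targets: the aggregation layer allows only one APIService object per metrics API group/version, so a single adapter service must answer every request for, say, external.metrics.k8s.io. Below is a minimal sketch of such a registration; the adapter service name and namespace are illustrative, not taken from this issue.

```yaml
# Sketch of how a metrics API is registered today (illustrative names).
# Only one APIService can exist for a given group/version, so only one
# backend service can serve external.metrics.k8s.io at a time.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.external.metrics.k8s.io
spec:
  group: external.metrics.k8s.io
  version: v1beta1
  service:                           # the single backend answering every request
    name: keda-metrics-apiserver     # hypothetical adapter Service
    namespace: keda
    port: 443
  groupPriorityMinimum: 100
  versionPriority: 100
  insecureSkipTLSVerify: true        # sketch only; use caBundle in practice
```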
@k8s-ci-robot added the needs-sig label Oct 2, 2023
@sftim
Contributor

sftim commented Oct 2, 2023

another issue was opened some time ago

What other issue? A hyperlink would be handy.

@sftim
Contributor

sftim commented Oct 2, 2023

KEP title suggestion: Support for multiple metrics APIs

(the “add” is implied)

@JorTurFer changed the title from "Add support for multiple metrics APIs" to "Support for multiple metrics APIs" Oct 2, 2023
@JorTurFer
Author

another issue was opened some time ago

What other issue? A hyperlink would be handy.

f**k me, copy-paste issue 🤦, updating

@vaibhav2107
Member

/sig autoscaling

@k8s-ci-robot added the sig/autoscaling label and removed the needs-sig label Oct 11, 2023
@neoakris

neoakris commented Jan 4, 2024

https://www.doit.com/kubernetes-custom-metric-autoscaling-almost-great/
^-- For context on why this is needed. I'm also posting to subscribe to / follow the thread.

@sftim
Contributor

sftim commented Jan 4, 2024

#2580 (comment) stated:

This can be completed as a design doc/PR to community repo rather than a KEP as it's out of tree.

Is that still true?

@JorTurFer
Author

JorTurFer commented Jan 4, 2024

#2580 (comment) stated:

This can be completed as a design doc/PR to community repo rather than a KEP as it's out of tree.

Is that still true?

@dgrisonnet? You are a member of SIG-Instrumentation, so maybe you can help us. As this is a KEP for SIG-Autoscaling, I think that no longer applies.

@neoakris

neoakris commented Jan 4, 2024

#2580 (comment) stated:

This can be completed as a design doc/PR to community repo rather than a KEP as it's out of tree.

Is that still true?

It was never correct / it was an incorrect conclusion from the beginning, IMO.
(They mentioned recalling a discussion from a meeting, but never provided written justification for the conclusion.)
This change would need to be done in-tree / modify the Kubernetes code base. It won't work as a PR to a community repo; the article I wrote and shared earlier explains why that's the case. (It's a short read that tries to explain things without going too far into the weeds.)

Sadly, many community projects like KEDA.sh are doing hacky workarounds, such as supporting 60 scalers and duplicating functionality provided by other tools, to work around the limitation. That's not a good solution: it blocks other projects like Knative and prevents reusable IaC, since only one custom metrics server is supported at the moment rather than multiple. Some kind of namespace-like functionality that lets multiple custom metrics servers coexist is required for reusable IaC that won't interfere with, or be blocked by, another Kubernetes app that is already installed.
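
To make the limitation concrete, here is a hedged sketch (resource names and the metric are invented, not from this thread) of an HPA consuming an external metric. Nothing in the spec can say which metrics server should serve the metric, so whichever single service owns the external.metrics.k8s.io APIService answers for every external metric in the cluster.

```yaml
# Illustrative HPA using an external metric (hypothetical names).
# There is no field to pick a backend ("ask KEDA" vs "ask prometheus-adapter"):
# the aggregation layer routes every external.metrics.k8s.io request to the
# one registered APIService backend.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-consumer
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-consumer
  minReplicas: 1
  maxReplicas: 20
  metrics:
    - type: External
      external:
        metric:
          name: queue_length          # hypothetical external metric
        target:
          type: AverageValue
          averageValue: "30"
```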

@sftim
Contributor

sftim commented Jan 4, 2024

@neoakris I've read that article, and I actually don't see what part of Kubernetes' control plane needs to change to make this happen.

An out-of-tree aggregating proxy could work for the solution I have in mind, and it'd align with what the article hopes for. Also, an out-of-tree proxy would be much, much simpler than extending APIService or the front proxy layer to implement that aggregation within the control plane itself.

So, I like the idea, but I'm not convinced that the obstacle is the lack of a signed-off KEP. I think it's just waiting on a supply of contributor capacity to make the thing we'd need.
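
As a rough illustration of the out-of-tree aggregating proxy mentioned above (everything below is hypothetical, not an existing project): the proxy would be the single registered backend for the metrics API group and would fan requests out to multiple adapters based on its own routing configuration.

```yaml
# Hypothetical aggregating proxy registered as the single backend (sketch).
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.external.metrics.k8s.io
spec:
  group: external.metrics.k8s.io
  version: v1beta1
  service:
    name: metrics-aggregating-proxy   # hypothetical proxy Service
    namespace: kube-system
    port: 443
  groupPriorityMinimum: 100
  versionPriority: 100
  insecureSkipTLSVerify: true         # sketch only; use caBundle in practice
---
# Invented routing config for the proxy: which adapter serves which metrics.
apiVersion: v1
kind: ConfigMap
metadata:
  name: metrics-aggregating-proxy-routes
  namespace: kube-system
data:
  routes.yaml: |
    backends:
      - name: keda
        service: keda-metrics-apiserver.keda.svc:443
        metricPrefix: "s0-"     # e.g. metrics produced by one adapter
      - name: prometheus-adapter
        service: prometheus-adapter.monitoring.svc:443
        metricPrefix: ""        # catch-all backend
```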

@JorTurFer
Author

JorTurFer commented Jan 4, 2024

The problem with adding it as an external proxy is that it doesn't allow migrations. @dgrisonnet and I were considering that option as the first option, but how will you migrate already existing clusters? If you are already using something via those APIs, you have to drop it during the migration, which may not be acceptable in some scenarios.
In contrast, if we add the support at the k8s level, it'll be available out-of-the-box, so end users who want to use multiple services on large clusters just need to upgrade their clusters and take advantage of this feature.

For example, there are KEDA users with more than 1k ScaledObjects; losing autoscaling in those scenarios is simply not acceptable, as there isn't any progressive migration path: you have to drop KEDA scaling, set up the proxy, and proxy the requests to KEDA. If something goes wrong, you could have several outages, and this risk can block large users.

So, I like the idea, but I'm not convinced that the obstacle is the lack of a signed-off KEP. I think it's just waiting on a supply of contributor capacity to make the thing we'd need.

@sftim, it's the current obstacle indeed! As this is required for us in KEDA, I'm already working on a PR based on the proposed KEP, accepting that the work may be for literally nothing if the KEP is rejected or the design changes dramatically. A signed-off KEP with consensus about the approach would be nice.

@sftim
Contributor

sftim commented Jan 4, 2024

Can we shift this to a Slack discussion, @JorTurFer? I can see a way to migrate; maybe there is an obstacle, but a KEP issue is not the place for this kind of detailed / early design work.

@JorTurFer
Author

Sure! I've written to you on Slack (in the Kubernetes workspace).

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Apr 3, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label May 3, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot closed this as not planned Jun 2, 2024