Hpa #3492
base: main
Conversation
Hi @nowjean! Welcome to the project! 🎉 Thanks for opening this pull request!
✅ All required contributors have signed the F5 CLA for this PR. Thank you!
I have hereby read the F5 CLA and agree to its terms
Thank you for your contribution to the project. Please run `make generate-all`.
I've completed `make generate-all`. Could you please review my PR?
Force-pushed from 172c009 to d081d68
So this only affects the control plane, correct? We probably want to support this for the nginx data plane as well (that seems like the more beneficial use case). Configuring deployment options for the data plane requires a bit more work, specifically in our APIs and the code itself. The NginxProxy CRD holds the deployment configuration for the nginx data plane, which the control plane uses to configure the data plane when deploying it. Here is a simple example of how we add a new field to the API to allow configuring these types of deployment fields: #3319.
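For orientation, the pattern being described is: add an optional field to the NginxProxy API's DeploymentSpec, which the control plane reads when provisioning the data plane. A rough sketch (the HPASpec type appears later in this PR; the exact field name here is an assumption for illustration):

```go
// In the NginxProxy API (ngfAPIv1alpha2): DeploymentSpec describes the
// nginx data plane Deployment that the control plane provisions.
type DeploymentSpec struct {
	// Number of desired Pods.
	//
	// +optional
	Replicas *int32 `json:"replicas,omitempty"`

	// Autoscaling defines the configuration for Horizontal Pod Autoscaling.
	// (Field name assumed for illustration.)
	//
	// +optional
	Autoscaling *HPASpec `json:"autoscaling,omitempty"`
}
```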
I'd also love a more descriptive PR title, as well as a release note in the description so we can include this feature in our release notes :)
@sjberman Yes, this PR only affects the control plane. Can we also implement HPA for the data plane? AFAIK, the data plane Deployment is created from the NginxProxy CRD, and its name depends on the Gateway, whereas an HPA only applies to a Deployment with a fixed name.
So, I think we can't implement HPA via the Helm chart, especially since data plane and control plane pods are now separated in 2.0.
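For context on why the fixed name matters: an HPA binds to its workload through scaleTargetRef, which references a Deployment by exact name. A minimal example (all names are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gateway-nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gateway-nginx   # must match the data plane Deployment's generated name
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```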
@nowjean I updated my comment with a description of how it can be implemented on the data plane side. Glad we're on the same page :)
Will manually test this PR for both control plane and data plane when we have all the changes :)
@sjberman @salonichf5 I've pushed my changes to this PR. From my testing, the code correctly applies HPA to both the control plane and data plane.
Testing applying these HPAs for control plane and data plane pods.

values.yaml (collapsed)

HPA details (collapsed)

Needed to install the metrics server (enabling insecure TLS) to get memory resource metrics. Should this be communicated to end users, i.e., the additional fields they need to set if we want scaling to be active?

values.yaml (collapsed)

I saw an HPA get configured for the control plane pod, but I couldn't see one configured for the data plane pod. Events from the nginx deployment and logs look normal.

The NginxProxy resource reflects the resources value but not …

So, a couple of observations. What am I doing wrong in terms of testing? @sjberman @nowjean
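The values.yaml snippets in this comment were collapsed in the thread. A hypothetical reconstruction of the kind of test configuration, using only value names that appear elsewhere in this review (everything beyond *.autoscaling.enabled is an assumption):

```yaml
# Hypothetical test values; exact keys beyond *.autoscaling.enabled are assumed.
nginxGateway:
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 3
    targetMemoryUtilizationPercentage: 80  # needs memory requests on the pod

nginx:
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 3
```

Note that an HPA computes utilization against resource requests, which is why the metrics server plus resource requests are prerequisites for scaling to activate.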
@salonichf5 @sjberman Thanks for testing! Please refer to the guide below and review my PR again. I've patched the Makefile's generate-crds target.
This option turns off the descriptions in the CRDs, because otherwise the new NginxProxy manifest becomes too large to apply.
(In my case, I had to upgrade my runc version to build the NGF Docker images.)
End users can create multiple Gateways, and each one needs its own HPA, so the logic now lives in the Gateway resource. Plus, I'm not sure about this part:
Normally, we assume that end users already have the Metrics Server running if they're using HPA or similar features. But maybe it's worth adding a note in the docs to avoid confusion.
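For anyone reproducing the test setup: "enabling insecure TLS" for the metrics server usually means adding the --kubelet-insecure-tls flag to its container args, along these lines (a Deployment fragment sketch, suitable for test clusters only, not production):

```yaml
# metrics-server Deployment fragment (sketch)
containers:
  - name: metrics-server
    args:
      - --kubelet-insecure-tls                        # skip kubelet cert verification
      - --kubelet-preferred-address-types=InternalIP  # reach kubelets by node IP
```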
Force-pushed from c57e992 to e8399d9
Looks good now, @nowjean! Thank you so much for your contribution; I'll run the pipeline now -- need to ensure the CRD changes don't lead to any issues. I did verify and ran into the issue. The values.yaml …
@salonichf5 Thanks, I got some errors in the pipeline and am fixing them now. How can I run the pipeline?
Re-running it for you now. Only we can approve the pipeline run, but I'll keep a close eye on your PR. Appreciate your work :) Can you rebase your work on main?
pre-commit.ci autofix
@salonichf5 Thanks for your guide :)
After that, I checked the commit history of my hpa branch.
I forgot to run …
Thank you for your contribution. I think we are close. I have a few more code comments, and I will also get someone else from the team to look at it. Great job 🚀
Thanks! Anyway, we've got 6 commits. Please let me know if you'd like me to squash them into one commit 😛
Thanks so much @nowjean for all your work on this, we really appreciate it!
I'm sorry you've encountered difficulties with the CRD metadata - in the interests of having atomic changes, we have created a separate issue for this CRD apply problem you are facing, and we will address this in a separate PR. Removing the descriptions completely from the CRDs is not an ideal solution, as we rely on these descriptions for our API doc generation, and many external tools also use these descriptions for enhanced functionality.
For now, I believe using the `kubectl apply --server-side` option for applying the CRD should provide a workaround for you.
Couple of small final comments, otherwise it looks great!
```yaml
@@ -8,13 +8,15 @@ rules:
- apiGroups:
  - ""
  - apps
  - autoscaling
```
Let's only add this if autoscaling is enabled
Suggested change:
```diff
-  - autoscaling
+  {{- if or .Values.nginx.autoscaling.enabled .Values.nginxGateway.autoscaling.enabled }}
+  - autoscaling
+  {{- end }}
```
```yaml
resources:
  - secrets
  - configmaps
  - serviceaccounts
  - services
  - deployments
  - daemonsets
  - horizontalpodautoscalers
```
Let's only add this if autoscaling is enabled
Suggested change:
```diff
-  - horizontalpodautoscalers
+  {{- if or .Values.nginx.autoscaling.enabled .Values.nginxGateway.autoscaling.enabled }}
+  - horizontalpodautoscalers
+  {{- end }}
```
```yaml
@@ -502,7 +565,6 @@ certGenerator:

# -- A list of Gateway objects. View https://gateway-api.sigs.k8s.io/reference/spec/#gateway for full Gateway reference.
gateways: []
```
nit: revert this change
```go
func buildNginxDeploymentHPA(
	objectMeta metav1.ObjectMeta,
	autoSacaling ngfAPIv1alpha2.HPASpec,
```
Suggested change:
```diff
-	autoSacaling ngfAPIv1alpha2.HPASpec,
+	autoScaling ngfAPIv1alpha2.HPASpec,
```
```go
@@ -200,6 +200,7 @@ func (p *NginxProvisioner) provisionNginx(
	var agentConfigMapUpdated, deploymentCreated bool
	var deploymentObj *appsv1.Deployment
	var daemonSetObj *appsv1.DaemonSet
```
nit: revert this change
```go
@@ -388,6 +389,11 @@ type DeploymentSpec struct {
	// +optional
	Replicas *int32 `json:"replicas,omitempty"`

	// Horizontal Pod Autoscaling.
```
Suggested change:
```diff
-	// Horizontal Pod Autoscaling.
+	// Autoscaling defines the configuration for Horizontal Pod Autoscaling.
```
```go
// +kubebuilder:validation:XValidation:message="memory utilization must be between 1 and 100",rule="!has(self.targetMemoryUtilizationPercentage) || (self.targetMemoryUtilizationPercentage >= 1 && self.targetMemoryUtilizationPercentage <= 100)"
//
//nolint:lll
type HPASpec struct {
```
Let's add a comment to this struct to describe what it is (similar to what I suggested above).
```go
//
//nolint:lll
type HPASpec struct {
	// behavior configures the scaling behavior of the target
```
Suggested change:
```diff
-	// behavior configures the scaling behavior of the target
+	// Behavior configures the scaling behavior of the target
```
```go
	// More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations
	//
	// +optional
	HPAAnnotations map[string]string `json:"hpaAnnotations,omitempty"`
```
Are there generally HPA-specific annotations that people use? Or is this meant to be just generic annotations like any other resource?
I ask because the Gateway API pattern for setting annotations on provisioned objects is through the Gateway `spec.infrastructure.annotations` field. Even though every object has its own annotations people like to set, this Gateway resource field is the entry point to do so (which does set it on every object we provision, a side effect of the way the Gateway API spec is defined).
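For reference, that entry point looks like this on a Gateway (a minimal sketch; the names and the annotation are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: gateway
spec:
  gatewayClassName: nginx
  infrastructure:
    annotations:
      example.com/owner: platform-team  # applied to every provisioned object
  listeners:
    - name: http
      port: 80
      protocol: HTTP
```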
```go
	// Minimum number of replicas.
	MinReplicas int32 `json:"minReplicas"`

	// Maximum number of replicas.
	MaxReplicas int32 `json:"maxReplicas"`
```
Does k8s have defaults for these values or are they required?
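For context: in the upstream autoscaling/v2 API, minReplicas is an optional *int32 that defaults to 1 when unset, while maxReplicas is required. Mirroring that here could look like the following sketch (the kubebuilder markers are assumptions, not the PR's code):

```go
	// Minimum number of replicas. Defaults to 1, matching upstream HPA behavior.
	//
	// +optional
	// +kubebuilder:default=1
	// +kubebuilder:validation:Minimum=1
	MinReplicas *int32 `json:"minReplicas,omitempty"`

	// Maximum number of replicas. Required, as in the upstream HPA API.
	//
	// +kubebuilder:validation:Minimum=1
	MaxReplicas int32 `json:"maxReplicas"`
```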
```yaml
@@ -0,0 +1,46 @@
{{- if and (eq .Values.nginxGateway.kind "deployment") .Values.nginxGateway.autoscaling.enabled -}}
apiVersion: {{ ternary "autoscaling/v2" "autoscaling/v2beta2" (.Capabilities.APIVersions.Has "autoscaling/v2") }}
```
Should `.Capabilities.APIVersions.Has "autoscaling/v2"` be a part of the `if` conditional above, so that we aren't creating an HPA if it can't exist?
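A sketch of what folding the capability check into the guard could look like (based on the template line quoted above; this drops the autoscaling/v2beta2 fallback, so the ternary would go away):

```yaml
{{- if and (eq .Values.nginxGateway.kind "deployment") .Values.nginxGateway.autoscaling.enabled (.Capabilities.APIVersions.Has "autoscaling/v2") -}}
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
# ... rest of the HPA template ...
{{- end }}
```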
```go
@@ -62,7 +63,7 @@ func (h *eventHandler) HandleEventBatch(ctx context.Context, logger logr.Logger,
	case *gatewayv1.Gateway:
		h.store.updateGateway(obj)
	case *appsv1.Deployment, *appsv1.DaemonSet, *corev1.ServiceAccount,
		*corev1.ConfigMap, *rbacv1.Role, *rbacv1.RoleBinding, *autoscalingv2.HorizontalPodAutoscaler:
```
The reason we include these objects in the Update and Delete event handlers is so that if a user manually updates or deletes them, we re-create them per the spec that's defined in the NginxProxy resource. It requires a little more work than just setting the object here.
See `internal/controller/provisioner/store.go` and `internal/controller/provisioner/setter.go` to ensure that the object is handled properly.
```go
	if nProxyCfg == nil || nProxyCfg.Kubernetes == nil {
		return nil
	}

	if !isAutoscalingEnabled(nProxyCfg.Kubernetes.Deployment) {
		return nil
	}
```
nit: let's just combine these two conditions
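Combined, the guard reads as a single early return (same logic as the two blocks above):

```go
	if nProxyCfg == nil || nProxyCfg.Kubernetes == nil ||
		!isAutoscalingEnabled(nProxyCfg.Kubernetes.Deployment) {
		return nil
	}
```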
```go
	// AutoscalingTemplate configures the additional scaling option.
	//
	// +optional
	AutoscalingTemplate *[]autoscalingv2.MetricSpec `json:"autoscalingTemplate,omitempty"`
```
I wonder if we should just name this field `Metrics`, and then remove the TargetCPU and TargetMemory fields? Seems like we are defining specific metric types when we could just keep it generic and let a user define whatever they want.
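A sketch of the generic shape being suggested (this is the reviewer's proposal, not merged code; autoscalingv2 is k8s.io/api/autoscaling/v2):

```go
	// Metrics contains the metric specs used to calculate the desired
	// replica count, passed through to the HPA as-is.
	//
	// +optional
	Metrics []autoscalingv2.MetricSpec `json:"metrics,omitempty"`
```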
Proposed changes
Problem: I want NGF to work with a HorizontalPodAutoscaler.
Solution: Add HPA support for the deployment.
Testing: I've deployed an AKS cluster and checked that the HPA works correctly.
Closes #3447