
Fix/metrics #42

Merged
merged 3 commits into kubernetes-sigs:main on Dec 1, 2024
Conversation

@mneverov mneverov commented Nov 30, 2024

The requested metrics already existed but were not exposed.

This patch:

  • adds a new metrics server running on :9091 by default (see the IPAM_METRICS_ADDR env var)
  • moves the probe and metrics servers to util
  • replaces k8s.io/component-base/metrics/legacyregistry with prometheus/client_golang
  • registers metrics in an init func instead of using sync.Once
curl -v localhost:9091/metrics
# HELP node_ipam_controller_multicidrset_allocation_tries_per_request Histogram measuring CIDR allocation tries per request.
# TYPE node_ipam_controller_multicidrset_allocation_tries_per_request histogram
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="10.244.0.0/16",le="1"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="10.244.0.0/16",le="5"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="10.244.0.0/16",le="25"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="10.244.0.0/16",le="125"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="10.244.0.0/16",le="625"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="10.244.0.0/16",le="+Inf"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_sum{clusterCIDR="10.244.0.0/16"} 0
node_ipam_controller_multicidrset_allocation_tries_per_request_count{clusterCIDR="10.244.0.0/16"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="2001:db8::/110",le="1"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="2001:db8::/110",le="5"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="2001:db8::/110",le="25"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="2001:db8::/110",le="125"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="2001:db8::/110",le="625"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_bucket{clusterCIDR="2001:db8::/110",le="+Inf"} 3
node_ipam_controller_multicidrset_allocation_tries_per_request_sum{clusterCIDR="2001:db8::/110"} 0
node_ipam_controller_multicidrset_allocation_tries_per_request_count{clusterCIDR="2001:db8::/110"} 3
# HELP node_ipam_controller_multicidrset_cidrs_allocations_total Counter measuring total number of CIDR allocations.
# TYPE node_ipam_controller_multicidrset_cidrs_allocations_total counter
node_ipam_controller_multicidrset_cidrs_allocations_total{clusterCIDR="10.244.0.0/16"} 3
node_ipam_controller_multicidrset_cidrs_allocations_total{clusterCIDR="2001:db8::/110"} 3
# HELP node_ipam_controller_multicidrset_usage_cidrs Gauge measuring percentage of allocated CIDRs.
# TYPE node_ipam_controller_multicidrset_usage_cidrs gauge
node_ipam_controller_multicidrset_usage_cidrs{clusterCIDR="10.244.0.0/16"} 0.01171875
node_ipam_controller_multicidrset_usage_cidrs{clusterCIDR="2001:db8::/110"} 0.0029296875
# HELP node_ipam_controller_multicirdset_max_cidrs Maximum number of CIDRs that can be allocated.
# TYPE node_ipam_controller_multicirdset_max_cidrs gauge
node_ipam_controller_multicirdset_max_cidrs{clusterCIDR="10.244.0.0/16"} 256
node_ipam_controller_multicirdset_max_cidrs{clusterCIDR="2001:db8::/110"} 1024

Fixes #40

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mneverov

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 30, 2024
@mneverov mneverov marked this pull request as draft November 30, 2024 17:54
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 30, 2024
@mneverov mneverov marked this pull request as ready for review December 1, 2024 07:19
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 1, 2024
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 1, 2024
mneverov commented Dec 1, 2024

@aojea ptal

aojea commented Dec 1, 2024

  • adds a new metrics server running on :9091 by default (see IPAM_METRICS_ADDR env. var)

We found that opening multiple ports to expose different endpoints ends up causing problems, especially in host-network pods. What do you think about adding a new flag, --binding-address or similar, to indicate the endpoint where healthz, livez, and metrics are exposed, and marking the existing health-probe-address flag as deprecated?

This way we maintain just one server on one port, using the URL path to discriminate between endpoints.

mneverov commented Dec 1, 2024

what do you think if ...

Good idea. I wanted to combine those two initially, but then checked controller-runtime, which runs three web servers: webhook, probe, and metrics.

@@ -166,3 +166,11 @@ func runControllers(ctx context.Context, kubeClient kubernetes.Interface, cfg *r

nodeIpamController.Run(ctx)
}

func bindingAddress(cfg config) string {
	if cfg.HealthProbeAddr != "" {
		return cfg.HealthProbeAddr
	}
	return cfg.WebserverBindAddr
}

mneverov (Member Author) commented on bindingAddress:

By default HealthProbeAddr is empty now. This check is added for existing clusters.
The default addr value :8081 from HealthProbeAddr is now specified for WebserverBindAddr.
aojea commented Dec 1, 2024

/lgtm

Thanks

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 1, 2024
@k8s-ci-robot k8s-ci-robot merged commit bbf0e52 into kubernetes-sigs:main Dec 1, 2024
8 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

metrics for ipam-controller
3 participants