This repository contains several applications to support Kubernetes integration with the CloudZero platform, including:
- CloudZero Insights Controller - provides telemetry to the CloudZero platform to enable complex cost allocation and analysis. This webhook application securely receives resource provisioning and deprovisioning requests from the Kubernetes API. It collects resource labels, annotations, and relationship metadata between resources, ultimately supporting the identification of CSP resources not directly connected to a Kubernetes node.
- CloudZero Collector - the collector application implements a Prometheus-compliant interface for metrics collection and writes the metrics payloads to files in a shared location for consumption by the shipper. Today, the collector classifies incoming metrics data and saves it into either cost telemetry files or observability files. These files are compressed on disk to save space.
- CloudZero Shipper - the shipper application monitors shared locations for metrics file creation, allocates pre-signed S3 PUT URLs for customers (using the CloudZero upload API), and then uploads data to the AWS S3 bucket at set intervals. This approach protects against invalid API keys and enables end-to-end file tracking.
- CloudZero Agent Validator - the validator application is part of the agent's pod lifecycle hooks. It is responsible for performing basic validation checks and notifying the CloudZero platform of installation status changes (initializing, started, stopping). This application runs during the lifecycle hook, then exits when complete.
Note that the agent application is responsible for executing metrics scrape jobs at various intervals. The agent communicates with a kube-state-metrics exporter application and cAdvisor exporter applications (one per machine instance). For large-scale clusters, the agent runs in "federated mode" (also known as daemonset mode), where each instance is responsible for metrics collection on its own machine.
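As a rough illustration of the collector's classification step described above, the sketch below sorts metric names into cost telemetry or observability. The rule set (the names in `COST_METRIC_NAMES` and the `cloudzero_` prefix check) is a hypothetical assumption made for this example; this README does not describe the collector's actual rules.

```python
from enum import Enum
from typing import Dict, Iterable, List


class MetricKind(Enum):
    COST = "cost"
    OBSERVABILITY = "observability"


# Hypothetical rule set, for illustration only. The real collector's
# classification rules are not documented in this README.
COST_METRIC_NAMES = {"container_cpu_usage_seconds_total", "kube_pod_labels"}
COST_METRIC_PREFIX = "cloudzero_"


def classify(metric_name: str) -> MetricKind:
    """Classify one metric name as cost telemetry or observability data."""
    if metric_name in COST_METRIC_NAMES or metric_name.startswith(COST_METRIC_PREFIX):
        return MetricKind.COST
    # Assumption: anything not recognised as cost data is observability data.
    return MetricKind.OBSERVABILITY


def partition(metric_names: Iterable[str]) -> Dict[MetricKind, List[str]]:
    """Group metric names by kind, preserving input order within each group."""
    groups: Dict[MetricKind, List[str]] = {kind: [] for kind in MetricKind}
    for name in metric_names:
        groups[classify(name)].append(name)
    return groups
```

Each group would then be written to its own set of compressed files; that step is omitted here.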
The easiest way to get started with the CloudZero Insights Controller is by
using the cloudzero-agent Helm chart from the cloudzero-charts repository.
See the Installation Guide for details.
See the Configuration Guide for details.
make undeploy-admission-controller
make undeploy-test-app
The applications are based on a scratch container, so no shell is available. The container images are less than 8MB.
To monitor the data directory, you must deploy a debug container as follows:
- Deploy a debug container:
  kubectl apply -f cluster/deployments/debug/deployment.yaml
- Attach to the shell of the debug container:
  kubectl exec -it temp-shell -- /bin/sh
To inspect the data directory:
  cd /cloudzero/data
eksctl delete cluster -f cluster/cluster.yaml --disable-nodegroup-eviction
This project provides a collector, written in Go, which consists of two applications:
- Collector - the collector application exposes a Prometheus remote write API, which can receive POST requests from Prometheus in either v1 or v2 encoded format. It decodes the messages, then writes them to the data directory as Brotli-compressed JSON.
- Shipper - the shipper application checks the data directory for completed metrics files at a regular interval (e.g. 10 minutes), then calls the CloudZero upload API to allocate S3 pre-signed PUT URLs. These URLs are used to upload the files. The application can compress the files before sending them to S3.
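The shipper flow described above can be sketched as a single pass over the data directory. This is a minimal sketch under stated assumptions, not the actual implementation: the `.json.br` suffix, the `allocate_urls` and `upload` callables, and the choice to delete a file after a successful upload are all hypothetical. The real CloudZero upload API is not modeled, so it is injected as a function.

```python
import os
from typing import Callable, Dict, List


def find_completed_files(data_dir: str, suffix: str = ".json.br") -> List[str]:
    """Return sorted paths of regular files in data_dir that end with suffix."""
    return sorted(
        os.path.join(data_dir, name)
        for name in os.listdir(data_dir)
        if name.endswith(suffix) and os.path.isfile(os.path.join(data_dir, name))
    )


def ship_once(
    data_dir: str,
    allocate_urls: Callable[[List[str]], Dict[str, str]],
    upload: Callable[[str, bytes], None],
    suffix: str = ".json.br",
) -> List[str]:
    """Run one shipping pass and return the paths that were uploaded.

    allocate_urls maps file names to pre-signed PUT URLs (hypothetical stand-in
    for the upload API); upload performs the PUT request for one file.
    """
    files = find_completed_files(data_dir, suffix)
    if not files:
        return []
    urls = allocate_urls([os.path.basename(path) for path in files])
    shipped = []
    for path in files:
        url = urls.get(os.path.basename(path))
        if url is None:
            continue  # no URL allocated; leave the file for the next pass
        with open(path, "rb") as handle:
            upload(url, handle.read())
        os.remove(path)  # assumption: uploaded files are removed locally
        shipped.append(path)
    return shipped
```

A scheduler would call `ship_once` at the configured interval. Files without an allocated URL stay on disk, so a failed allocation only delays the upload rather than losing data.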
The output of the CloudZero Insights Controller application is a JSON object
that represents cloudzero metrics, which is POSTed to the CloudZero remote
write API. The format of these objects is based on the Prometheus TimeSeries
protobuf message. Protobuf definitions for the cloudzero metrics are in the
proto/ directory.
There are four kinds of objects that can be sent:
- Pod metrics
  Metric names:
  - cloudzero_pod_labels
  - cloudzero_pod_annotations
  Labels:
  - __name__; will be one of the valid pod metric names
  - namespace; the namespace that the pod is launched in
  - resource_type; will always be pod for pod metrics
Example
{
  "labels": [
    {
      "name": "__name__",
      "value": "cloudzero_pod_labels"
    },
    {
      "name": "namespace",
      "value": "default"
    },
    {
      "name": "pod",
      "value": "hello-28889630-955wd"
    },
    {
      "name": "resource_type",
      "value": "pod"
    },
    {
      "name": "label_batch.kubernetes.io/controller-uid",
      "value": "cc52c38d-b461-40ab-a65d-2d5a68ac08e5"
    },
    {
      "name": "label_batch.kubernetes.io/job-name",
      "value": "hello-28889630"
    },
    {
      "name": "label_controller-uid",
      "value": "cc52c38d-b461-40ab-a65d-2d5a68ac08e5"
    },
    {
      "name": "label_job-name",
      "value": "hello-28889630"
    }
  ],
  "samples": [
    {
      "value": 1.0,
      "timestamp": "1733378003953"
    }
  ]
}
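An object shaped like the pod example above could be assembled as follows. This is an illustrative sketch only: the function name `build_pod_labels_metric` is hypothetical, and sorting the Kubernetes labels by key is an assumption made to get deterministic output, not documented behavior of the Insights Controller.

```python
import time
from typing import Dict, Optional


def build_pod_labels_metric(
    namespace: str,
    pod: str,
    pod_labels: Dict[str, str],
    timestamp_ms: Optional[int] = None,
) -> dict:
    """Build a cloudzero_pod_labels object in the shape of the example above."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # current time in milliseconds
    labels = [
        {"name": "__name__", "value": "cloudzero_pod_labels"},
        {"name": "namespace", "value": namespace},
        {"name": "pod", "value": pod},
        {"name": "resource_type", "value": "pod"},
    ]
    # Each Kubernetes label becomes an entry whose name carries a "label_" prefix.
    # Sorting by key is an assumption for stable output.
    for key in sorted(pod_labels):
        labels.append({"name": "label_" + key, "value": pod_labels[key]})
    # The examples encode the timestamp as a string and use a sample value of 1.0.
    return {
        "labels": labels,
        "samples": [{"value": 1.0, "timestamp": str(timestamp_ms)}],
    }
```

The result can be serialized with `json.dumps` from the standard library.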
- Workload Metrics
  Metric names:
  - cloudzero_deployment_labels
  - cloudzero_deployment_annotations
  - cloudzero_statefulset_labels
  - cloudzero_statefulset_annotations
  - cloudzero_daemonset_labels
  - cloudzero_daemonset_annotations
  - cloudzero_job_labels
  - cloudzero_job_annotations
  - cloudzero_cronjob_labels
  - cloudzero_cronjob_annotations
  Labels:
  - __name__; will be one of the valid workload metric names
  - namespace; the namespace that the workload is launched in
  - workload; the name of the workload
  - resource_type; will be one of deployment, statefulset, daemonset, job, or cronjob
Example
{
  "labels": [
    {
      "name": "__name__",
      "value": "cloudzero_deployment_labels"
    },
    {
      "name": "namespace",
      "value": "default"
    },
    {
      "name": "workload",
      "value": "hello"
    },
    {
      "name": "resource_type",
      "value": "deployment"
    },
    {
      "name": "label_component",
      "value": "greeting"
    },
    {
      "name": "label_foo",
      "value": "bar"
    }
  ],
  "samples": [
    {
      "value": 1.0,
      "timestamp": "1733378003953"
    }
  ]
}
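On the consuming side, an object like the one above can be split back into its identifying fields and the original Kubernetes labels. The sketch below is an illustration, not part of this repository: it relies on the `label_` prefix seen in the examples and assumes the timestamp is a Unix epoch value in milliseconds, as in Prometheus samples. The helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Tuple

LABEL_PREFIX = "label_"


def split_labels(metric: dict) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Separate identifying fields from Kubernetes labels.

    Returns (fields, k8s_labels); the "label_" prefix is stripped from the
    keys of k8s_labels.
    """
    fields: Dict[str, str] = {}
    k8s_labels: Dict[str, str] = {}
    for pair in metric["labels"]:
        name, value = pair["name"], pair["value"]
        if name.startswith(LABEL_PREFIX):
            k8s_labels[name[len(LABEL_PREFIX):]] = value
        else:
            fields[name] = value
    return fields, k8s_labels


def sample_time(metric: dict, index: int = 0) -> datetime:
    """Convert a sample's timestamp to an aware UTC datetime.

    Assumes the timestamp string holds milliseconds since the Unix epoch.
    Integer arithmetic avoids floating-point rounding of the fraction.
    """
    millis = int(metric["samples"][index]["timestamp"])
    seconds, remainder = divmod(millis, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(
        milliseconds=remainder
    )
```

Applied to the deployment example, `split_labels` would yield the fields `__name__`, `namespace`, `workload`, and `resource_type`, plus the Kubernetes labels `component` and `foo`.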
- Namespace Metrics
  Metric names:
  - cloudzero_namespace_labels
  - cloudzero_namespace_annotations
  Labels:
  - __name__; will be one of the valid namespace metric names
  - namespace; the name of the namespace
  - resource_type; will always be namespace for namespace metrics
Example
{
  "labels": [
    {
      "name": "__name__",
      "value": "cloudzero_namespace_labels"
    },
    {
      "name": "namespace",
      "value": "default"
    },
    {
      "name": "resource_type",
      "value": "namespace"
    },
    {
      "name": "label_engr.os.com/component",
      "value": "foo"
    },
    {
      "name": "label_kubernetes.io/metadata.name",
      "value": "default"
    }
  ],
  "samples": [
    {
      "value": 1.0,
      "timestamp": "1733880410225"
    }
  ]
}
- Node Metrics
  Metric names:
  - cloudzero_node_labels
  - cloudzero_node_annotations
  Labels:
  - __name__; will be one of the valid node metric names
  - node; the name of the node
  - resource_type; will always be node for node metrics
Example
{
  "labels": [
    {
      "name": "__name__",
      "value": "cloudzero_node_labels"
    },
    {
      "name": "resource_type",
      "value": "node"
    },
    {
      "name": "label_alpha.eksctl.io/nodegroup-name",
      "value": "spot-nodes"
    },
    {
      "name": "label_beta.kubernetes.io/arch",
      "value": "amd64"
    }
  ],
  "samples": [
    {
      "value": 1.0,
      "timestamp": "1733880410225"
    }
  ]
}
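The field lists for the four kinds of objects can be turned into a simple structural check. The sketch below is a hypothetical helper, not part of this repository: it treats every field listed above as required, which is an assumption, and checks only that each label is present and that `__name__` matches the `resource_type`.

```python
from typing import Dict, List, Tuple

WORKLOAD_TYPES = ("deployment", "statefulset", "daemonset", "job", "cronjob")

# Labels listed for each kind of object in the sections above.
# Treating all of them as mandatory is an assumption of this sketch.
REQUIRED_LABELS: Dict[str, Tuple[str, ...]] = {
    "pod": ("__name__", "namespace", "resource_type"),
    "namespace": ("__name__", "namespace", "resource_type"),
    "node": ("__name__", "node", "resource_type"),
}
for _kind in WORKLOAD_TYPES:
    REQUIRED_LABELS[_kind] = ("__name__", "namespace", "workload", "resource_type")


def validate_metric(metric: dict) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    labels = {pair["name"]: pair["value"] for pair in metric.get("labels", [])}
    resource_type = labels.get("resource_type")
    if resource_type not in REQUIRED_LABELS:
        return [f"unknown resource_type: {resource_type!r}"]
    problems = [
        f"missing label: {name}"
        for name in REQUIRED_LABELS[resource_type]
        if name not in labels
    ]
    valid_names = {
        f"cloudzero_{resource_type}_labels",
        f"cloudzero_{resource_type}_annotations",
    }
    if "__name__" in labels and labels["__name__"] not in valid_names:
        problems.append(f"unexpected metric name: {labels['__name__']}")
    return problems
```

A check like this might be useful in tests that assert on the payloads sent to the remote write API.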
We appreciate feedback and contributions to this repo! Before you get started, please see the following:
Contact [email protected] for usage, questions, specific cases. See the CloudZero Docs for general information on CloudZero.
Please do not report security vulnerabilities on the public GitHub issue tracker. Email [email protected] instead.
CloudZero is the only cloud cost intelligence platform that puts engineering in control by connecting technical decisions to business results:
- Cost Allocation And Tagging: Organize and allocate cloud spend in new ways, increase tagging coverage, or work on showback.
- Kubernetes Cost Visibility: Understand your Kubernetes spend alongside total spend across containerized and non-containerized environments.
- FinOps And Financial Reporting: Operationalize reporting on metrics such as cost per customer, COGS, and gross margin. Forecast spend, reconcile invoices, and easily investigate variance.
- Engineering Accountability: Foster a cost-conscious culture, where engineers understand spend, proactively consider cost, and get immediate feedback with fewer interruptions and faster and more efficient innovation.
- Optimization And Reducing Waste: Focus on immediately reducing spend by understanding where we have waste, inefficiencies, and discounting opportunities.
Learn more about CloudZero on our website www.cloudzero.com
This project is licensed under the Apache 2.0 LICENSE.