feat: add ECS mode for Helm deployment #33
Conversation
```diff
@@ -1,56 +0,0 @@
-mode: daemonset
```
Does this mean that we only use the Deployment? If so, will that work with a multi-node cluster?
The purpose of the Daemonset is to ensure that node-level metrics+logs are collected from each node by the respective Collector.
Great catch! I assumed we were already deploying it as a daemonset. Since we are enabling the hostmetrics and filelog receivers, I think the default mode should be daemonset and the deployment one should be removed.
Mode changed in 80612a5
But then we will be collecting k8scluster metrics and events from all the nodes, right? Unless we want to remove them for some reason?
In general, the Collector is recommended to be deployed in both modes at the same time, with each resource handling its own part: the Daemonset takes care of node-level collection, and the Deployment acts as a singleton collecting the cluster-level signals (metrics, but also traces from the SDKs).
Or do we not want to achieve this here?
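For illustration only, a minimal sketch of how the upstream opentelemetry-collector chart can run in both modes side by side as two releases (the file names and preset choices below are assumptions, not what this PR ships):

```yaml
# values-daemonset.yaml -- one Collector pod per node, node-level signals
mode: daemonset
presets:
  hostMetrics:
    enabled: true
  logsCollection:
    enabled: true

# values-deployment.yaml -- singleton Collector, cluster-level signals
mode: deployment
replicaCount: 1
presets:
  clusterMetrics:
    enabled: true
  kubernetesEvents:
    enabled: true
```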
But then we will be collecting k8scluster metrics and events from all the nodes, right? Unless we want to remove them for some reason?
k8scluster metrics are disabled when deployed as a daemonset: https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-collector#configuration-for-kubernetes-cluster-metrics
In general, the Collector is recommended to be deployed in both modes at the same time and each resource handles its own part.
Makes sense to me, thanks for the clarification. If it sounds good to you, I will roll back the removal of the daemonset file and document the following two deployment modes:
- Deployment: OTLP receiver + k8scluster metrics + OTLP exporter
- Daemonset: Hostmetrics + filelog receiver (and its processors) + elasticsearch exporter
My main concern would be with the logs: the demo services send OTLP logs that can be gathered either by the OTLP receiver or by the filelog receiver. Should we exclude those from being collected with the filelog receiver to prevent duplication?
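Purely as a sketch of that split, using the receiver/exporter names from the list above (the pipeline wiring is assumed, not the final values files):

```yaml
# Deployment values (singleton, cluster scope) -- sketch only
mode: deployment
config:
  service:
    pipelines:
      metrics:
        receivers: [otlp, k8s_cluster]
        exporters: [otlp]
---
# Daemonset values (node scope) -- sketch only
mode: daemonset
config:
  service:
    pipelines:
      metrics:
        receivers: [hostmetrics]
        exporters: [elasticsearch]
      logs:
        receivers: [filelog]
        exporters: [elasticsearch]
```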
Deployment: OTLP receiver + k8scluster metrics + OTLP exporter
Daemonset: Hostmetrics + filelog receiver (and its processors) + elasticsearch exporter
That sounds good! The Daemonset gets all the receivers that make sense at node scope, and the Deployment gets everything that is cluster scope (k8scluster metrics, k8s events, etc.).
For the log duplication, one thing we can do is apply the filter processor to drop logs collected by the filelog receiver when certain k8s labels apply. Or, following a similar pattern, we can annotate the instrumented apps with a specific annotation to indicate that their logs should be dropped.
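A minimal sketch of what such a filter could look like, assuming the k8sattributes processor has already copied a pod label onto the records; the label name, resulting attribute key, and pipeline wiring below are made up for illustration:

```yaml
processors:
  # Drop filelog-collected records from pods that already ship logs via OTLP.
  filter/dedup-instrumented:
    error_mode: ignore
    logs:
      log_record:
        # assumed attribute, populated by k8sattributes from a pod label/annotation
        - 'resource.attributes["k8s.pod.labels.otel-logs"] == "sdk"'

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [k8sattributes, filter/dedup-instrumented]
      exporters: [elasticsearch]
```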
For the log duplication, one thing we can do is apply the filter processor to drop logs collected by the filelog receiver when certain k8s labels apply. Or, following a similar pattern, we can annotate the instrumented apps with a specific annotation to indicate that their logs should be dropped.
The issue is that not all the OTel demo services have been migrated to OTel logs; some output to stdout and others to an OTel logs provider. See tracking issue: open-telemetry#1472.
This is an important topic to tackle, but I think it would make sense to move the discussion and implementation to its own issue/PR, wdyt?
Force-pushed from 8aa0cec to d2bc5fd.
LGTM
```yaml
k8s_api_config:
  auth_type: serviceAccount
metrics:
  k8s.pod.cpu.node.utilization:
```
Maybe we can also enable the k8s.pod.memory.node.utilization metric?
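A sketch of what that follow-up could look like, assuming the snippet above sits under the kubeletstats receiver (the parent key and the enabled flags are assumptions based on the upstream receiver docs):

```yaml
receivers:
  kubeletstats:                     # assumed parent section for the snippet above
    k8s_api_config:
      auth_type: serviceAccount
    metrics:
      k8s.pod.cpu.node.utilization:
        enabled: true
      k8s.pod.memory.node.utilization:
        enabled: true
```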
Ah! Saw it now, sure I will create a follow-up PR. Thanks!
Changes
- hostmetrics and filelog receiver and its corresponding processors to properly decorate the metrics/logs (internal manifest has been used as reference).
- elasticsearch exporter and associated secrets.
- elasticsearch exporter.

Merge Requirements
For new feature contributions, please make sure you have completed the following essential items:
* [ ] CHANGELOG.md updated to document new feature additions
* [ ] Appropriate documentation updates in the docs
* [ ] Appropriate Helm chart updates in the helm-charts
Maintainers will not merge until the above have been completed. If you're unsure which docs need to be changed, ping the @open-telemetry/demo-approvers.