Add blog post for OTel Operator Q&A #4359

---
title: OTel Operator Q&A
linkTitle: OTel Operator Q&A
date: 2024-04-24
author: >-
  [Adriana Villela](https://github.com/avillela) (ServiceNow),

canonical_url: https://adri-v.medium.com/81d63addbf92?
cSpell:ignore: automagically mycollector
---

![Seattle's Mount Rainier rising above the clouds, as seen from an airplane. Photo by Adriana Villela](mount-rainier.jpg)

The
[OpenTelemetry (OTel) Operator](https://github.com/open-telemetry/opentelemetry-operator)
is a
[Kubernetes Operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/)
that manages OTel things for you in your Kubernetes cluster to make life a
little easier. It does the following:

- Manages deployment of the
[OpenTelemetry Collector](/docs/collector/), supported by
the
[`OpenTelemetryCollector`](https://github.com/open-telemetry/opentelemetry-operator?tab=readme-ov-file#getting-started)
[custom resource (CR)](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
- Manages the configuration of a fleet of OpenTelemetry Collectors via
[OpAMP](/docs/specs/opamp/) integration, supported by the
[`OpAMPBridge`](https://github.com/open-telemetry/opentelemetry-operator/blob/main/docs/api.md#opampbridge)
custom resource
- Injects and configures
[auto-instrumentation](https://www.honeycomb.io/blog/what-is-auto-instrumentation)
into your pods, supported by the
[`Instrumentation`](https://github.com/open-telemetry/opentelemetry-operator?tab=readme-ov-file#opentelemetry-auto-instrumentation-injection)
custom resource

I've had a chance to use the Operator in the last year, and learned some pretty
cool things, so I thought it might be helpful to share some little OTel Operator
goodies that I’ve picked up along the way, in the form of a Q&A.

Please note that this post assumes that you have some familiarity with
OpenTelemetry, the
[OpenTelemetry Collector](/docs/collector/), the
[OpenTelemetry Operator](https://github.com/open-telemetry/opentelemetry-operator)
(including the
[Target Allocator](https://adri-v.medium.com/prometheus-opentelemetry-better-together-41dc637f2292)),
and [Kubernetes](https://kubernetes.io).

## Q&A

### Q1: Does the Operator support multiple Collector configuration sources?

Short answer: no.

Longer answer: The OTel Collector can be fed more than one Collector config YAML
file. That way, you can keep your base configurations in, say,
`otelcol-config.yaml`, and overrides or additions to the base configuration can
go in, say, `otelcol-config-extras.yaml`. You can see an example of this in the
[OTel Demo’s Docker compose file](https://github.com/open-telemetry/opentelemetry-demo/blob/06f020c97f78ae9625d3a4a5d1107c55045c567f/docker-compose.yml#L665-L668).
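
To make this concrete, here's a hypothetical `otelcol-config-extras.yaml` (the
exporter name and endpoint below are made up for illustration). When the
Collector merges it over the base config, only the keys defined here are
overridden:

```yaml
# otelcol-config-extras.yaml (hypothetical): merged on top of the base
# otelcol-config.yaml, so only the keys you want to change need to appear here.
exporters:
  otlphttp/example:
    endpoint: https://collector.example.com:4318

service:
  pipelines:
    traces:
      # Replaces the exporters list of the base traces pipeline.
      exporters: [otlphttp/example]
```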

Unfortunately, while the OTel Collector supports multiple Collector
configuration files, the Collector managed by the OTel Operator does not.

To get around this, you could merge the multiple Collector configs through some
external tool beforehand. For example, if you
[were deploying the Operator via Helm](https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-operator),
you could technically
[pass it multiple Collector config files using multiple --values flags](https://stackoverflow.com/a/56653384)
and let [Helm](https://helm.sh) do the merging for you.

For reference,
[check out this thread in the #otel-operator CNCF Slack channel](https://cloud-native.slack.com/archives/C033BJ8BASU/p1709321896612279).

### Q2: How can I securely reference access tokens in the OpenTelemetryCollector’s configuration?

In order to send OpenTelemetry data to an observability backend, you must define
at least one [exporter](/docs/collector/configuration/#exporters). Whether you
use [OTLP](/docs/specs/otel/protocol/) or some proprietary vendor format,
exporters typically require that you specify an endpoint and an access token
when sending data to a vendor backend.

When using the OpenTelemetry Operator to manage the OTel Collector, the OTel
Collector config YAML is defined in the
[OpenTelemetryCollector](https://github.com/open-telemetry/opentelemetry-operator?tab=readme-ov-file#getting-started)
CR. This file should be version-controlled and therefore shouldn’t contain any
sensitive data, including access tokens stored as plain text.

Fortunately, the `OpenTelemetryCollector` CR gives us a way to reference that
value as a secret. Here’s how you do it:

1- Create a Kubernetes secret for your access token. Remember to
[base-64 encode](https://www.base64encode.org/) the secret.
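
As a sketch, the secret manifest might look like the following (the secret name
and key match the `secretKeyRef` used in the next step; the namespace and token
value are placeholders):

```yaml
# Hypothetical secret holding the access token.
# The value under data: must be base64-encoded.
apiVersion: v1
kind: Secret
metadata:
  name: otel-collector-secret
  namespace: mynamespace # same namespace as your Collector
type: Opaque
data:
  LS_TOKEN: <base64-encoded-access-token>
```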

2-
[Expose the secret as an environment variable](https://kubernetes.io/docs/concepts/configuration/secret/#using-a-secret)
by adding it to the `OpenTelemetryCollector` CR’s
[`env` section](https://github.com/avillela/otel-target-allocator-talk/blob/21e9643e28165e39bd79f3beec7f2b1f989d87e9/src/resources/02-otel-collector-ls.yml#L16-L21).
For example:

```yaml
env:
  - name: LS_TOKEN
    valueFrom:
      secretKeyRef:
        key: LS_TOKEN
        name: otel-collector-secret
```

3- Reference the environment variable in your
[exporter definition](https://github.com/avillela/otel-target-allocator-talk/blob/21e9643e28165e39bd79f3beec7f2b1f989d87e9/src/resources/02-otel-collector-ls.yml#L43-L47):

```yaml
exporters:
  otlp/ls:
    endpoint: 'ingest.lightstep.com:443'
    headers:
      'lightstep-access-token': '${LS_TOKEN}'
```

For more info, check out my full example
[here](https://github.com/avillela/otel-target-allocator-talk/blob/main/src/resources/02-otel-collector-ls.yml),
along with full instructions
[here](https://github.com/avillela/otel-target-allocator-talk/tree/main?tab=readme-ov-file#3b--kubernetes-deployment-servicenow-cloud-observability-backend).

### Q3: Is the Operator version at parity with the Collector version?

The default version of the Collector used by the Operator is typically behind by
one version at most. For example, at the time of this writing, the latest
Operator version is 0.98.0, and the latest Collector version is 0.99.0. In
addition, the default image of the Collector used by the Operator is the
[core distribution](/blog/2024/otel-collector-anti-patterns/#3--not-using-the-right-collector-distribution-or-not-building-your-own-distribution)
(as opposed to the contrib distribution).

### Q4: Can I override the base OTel Collector image?

Yes! In fact,
[you probably should](https://cloud-native.slack.com/archives/C033BJ8BASU/p1713894678225579)!

As we saw earlier, the
[core distribution](https://github.com/open-telemetry/opentelemetry-collector)
is the default Collector distribution used by the `OpenTelemetryCollector` CR.
The core distribution is a bare-bones distribution of the Collector that OTel
developers use for development and testing. It contains a base set of components:
[extensions](/docs/collector/configuration/#service-extensions),
[connectors](/docs/collector/configuration/#connectors),
[receivers](/docs/collector/configuration/#receivers),
[processors](/docs/collector/configuration/#processors), and
[exporters](/docs/collector/configuration/#exporters).

If you want access to more components than the ones offered by core, you can use
the contrib distribution instead. The contrib distribution extends the core
distribution and includes components created by third parties (including
vendors and individual community members) that are useful to the OpenTelemetry
community at large.

Or better yet, if you want to use specific Collector components, you can build
your own distribution using the
[OpenTelemetry Collector Builder](/docs/collector/custom-collector/) (OCB), and
include only the components that you need.
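
As a rough sketch, an OCB manifest for a minimal custom distribution might look
something like this (module versions are illustrative; pin them to the
Collector release you're actually targeting):

```yaml
# builder-config.yaml (illustrative): input manifest for the OpenTelemetry
# Collector Builder, listing only the components your distribution needs.
dist:
  name: my-otelcol
  description: My custom OTel Collector distribution
  output_path: ./my-otelcol

receivers:
  - gomod: go.opentelemetry.io/collector/receiver/otlpreceiver v0.99.0
processors:
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.99.0
exporters:
  - gomod: go.opentelemetry.io/collector/exporter/otlpexporter v0.99.0
```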

Either way, the `OpenTelemetryCollector` CR allows you to override the default
Collector image with one that better suits your needs by adding `spec.image` to
your `OpenTelemetryCollector` YAML. In addition, you can also specify the number
of Collector replicas by adding `spec.replicas`. This is totally independent of
whether or not you override the Collector image.

Your code would look something like this:

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otelcol
  namespace: mynamespace
spec:
  mode: statefulset
  image: <my_collector_image>
  replicas: <number_of_replicas>
```

Where:

- `<my_collector_image>` is the name of a valid Collector image from a container
repository
- `<number_of_replicas>` is the number of pod instances for the underlying
OpenTelemetry Collector

Keep in mind that if you're pulling a Collector image from a private container
registry, you'll need to use
[`imagePullSecrets`](https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
to authenticate against that registry. For more info on how to use
`imagePullSecrets` for your Collector image, check out the instructions
[here](https://github.com/open-telemetry/opentelemetry-operator?tab=readme-ov-file#using-imagepullsecrets).

For more info, check out the
[OpenTelemetryCollector CR API docs](https://github.com/open-telemetry/opentelemetry-operator/blob/main/docs/api.md#opentelemetrycollector).

### Q5: Does the Target Allocator work for all deployment types?

No. The Target Allocator only works for
[Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/),
[StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/),
and
[DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/)
([newly-introduced](https://github.com/open-telemetry/opentelemetry-operator/pull/2430#discussion_r1420495631)).
For reference, check out
[this discussion](https://cloud-native.slack.com/archives/C033BJ8BASU/p1709935402250859).

### Q6: If I’m using the Operator’s Target Allocator for Prometheus service discovery, do I need `PodMonitor` and `ServiceMonitor` CRs installed in my Kubernetes cluster?

Yes, you do. These CRs are bundled with the
[Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator);
however, they can be installed standalone, which means that you don’t need to
install the Prometheus Operator just to use these two CRs with the Target
Allocator.

The easiest way to install the
[`PodMonitor`](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#monitoring.coreos.com/v1.PodMonitor)
and
[`ServiceMonitor`](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#monitoring.coreos.com/v1.ServiceMonitor)
CRs is to grab a copy of the individual
[PodMonitor YAML](https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/charts/crds/crds/crd-podmonitors.yaml)
and
[ServiceMonitor YAML](https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/charts/crds/crds/crd-servicemonitors.yaml)
[custom resource definitions (CRDs)](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/),
like this:

```shell
kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.71.2/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml

kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.71.2/example/prometheus-operator-crd/monitoring.coreos.com_podmonitors.yaml
```
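
Once those CRDs are installed, you can tell the Target Allocator to watch for
`ServiceMonitor` and `PodMonitor` resources by enabling Prometheus CR discovery
in the `OpenTelemetryCollector` CR. Here's a minimal sketch (the name and
namespace are placeholders):

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: mycollector
  namespace: mynamespace
spec:
  mode: statefulset
  targetAllocator:
    enabled: true
    # Watch ServiceMonitor and PodMonitor CRs for Prometheus service discovery.
    prometheusCR:
      enabled: true
```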

Check out my example of the OpenTelemetry Operator’s Target Allocator with
`ServiceMonitor`
[here](https://github.com/avillela/otel-target-allocator-talk/tree/main?tab=readme-ov-file#3b--kubernetes-deployment-servicenow-cloud-observability-backend).

### Q7: Do I need to create a service account to use the Target Allocator?

No, but you do need to do a bit of extra work. So, here’s the deal…although you
need a
[service account](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/)
to use the Target Allocator, you don’t have to create your own.

If you enable the Target Allocator and don’t create a service account, one is
automagically created for you. This service account’s default name is a
concatenation of the Collector name (`metadata.name` in the
`OpenTelemetryCollector` CR) and `-collector`. For example, if your Collector is
called `mycollector`, then your service account would be called
`mycollector-collector`.

By default, this service account has no defined policy. This means that you’ll
still need to create your own
[`ClusterRole`](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#role-and-clusterrole)
and
[`ClusterRoleBinding`](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#rolebinding-and-clusterrolebinding),
and associate the `ClusterRole` to the `ServiceAccount` via
`ClusterRoleBinding`.
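
For illustration, a minimal sketch for a Collector named `mycollector` might
look like the following (the exact rules depend on what the Target Allocator
needs to discover in your cluster, so treat these as placeholders and check the
readme linked below):

```yaml
# Assumed minimal RBAC sketch: lets the auto-created service account read the
# resources commonly needed for Prometheus service discovery.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: mycollector-targetallocator-role
rules:
  - apiGroups: ['']
    resources: ['nodes', 'services', 'endpoints', 'pods']
    verbs: ['get', 'list', 'watch']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: mycollector-targetallocator-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: mycollector-targetallocator-role
subjects:
  - kind: ServiceAccount
    name: mycollector-collector
    namespace: mynamespace
```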

See the
[Target Allocator readme](https://github.com/open-telemetry/opentelemetry-operator/tree/main/cmd/otel-allocator#rbac)
for more on Target Allocator RBAC configuration.

### Q8: Can I override the Target Allocator base image?

Just like you can override the Collector base image in the
`OpenTelemetryCollector` CR, you can also override the Target Allocator base
image.

Please keep in mind that
[it’s usually best to keep the Target Allocator and OTel operator versions the same](https://cloud-native.slack.com/archives/C033BJ8BASU/p1709128862949249?thread_ts=1709081221.484429&cid=C033BJ8BASU),
to avoid any compatibility issues. If you do choose to override the Target
Allocator’s base image, you can do so by adding `spec.targetAllocator.image` in
the `OpenTelemetryCollector` CR. You can also specify the number of replicas by
adding `spec.targetAllocator.replicas`. This is totally independent of whether
or not you override the TA image.

Your code would look something like this:

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otelcol
  namespace: mynamespace
spec:
  mode: statefulset
  targetAllocator:
    image: <ta_image_name>
    replicas: <number_of_replicas>
```

Where:

- `<ta_image_name>` is a valid Target Allocator image from a container
repository.
- `<number_of_replicas>` is the number of pod instances for the underlying
Target Allocator

### Q9: If it’s not recommended that you override the Target Allocator base image, then why would you want to?

One use case might be
[if you need to host a mirror of the Target Allocator image in your own private container registry for security purposes](https://cloud-native.slack.com/archives/C033BJ8BASU/p1713894678225579).

If you do need to reference a Target Allocator image from a private registry,
you’ll need to use `imagePullSecrets`. To use `imagePullSecrets` with the OTel
Operator, check out the instructions
[here](https://github.com/open-telemetry/opentelemetry-operator?tab=readme-ov-file#using-imagepullsecrets).

Note that you don’t need to create a `serviceAccount` for the Target Allocator,
since one is already created for you automagically if you don’t create one
yourself (see
[Q7](#q7-do-i-need-to-create-a-service-account-to-use-the-target-allocator)).

For more info, check out the
[Target Allocator API docs](https://github.com/open-telemetry/opentelemetry-operator/blob/main/docs/api.md#opentelemetrycollectorspectargetallocator).

### Q10: Is there a version lag between the OTel Operator auto-instrumentation and auto-instrumentation of supported languages?

If there is a lag, it's minimal, as maintainers try to keep these up to date for
each release cycle. Keep in mind that there are breaking changes in some
semantic conventions and the team is trying to avoid breaking users' code. More
info
[here](https://cloud-native.slack.com/archives/C033BJ8BASU/p1713894678225579).

## Final Thoughts

Hopefully this has helped to demystify the OTel Operator a bit more. There’s
definitely a lot going on, and the OTel Operator can certainly be a bit scary at
first, but understanding some of the basics will get you well on your way to
mastering this powerful tool.

If you have any questions about the OTel Operator, I highly recommend that you
post questions on the
[#otel-operator](https://cloud-native.slack.com/archives/C033BJ8BASU) channel on
the [CNCF Slack](https://communityinviter.com/apps/cloud-native/cncf).
Maintainers and contributors are super friendly, and have always been more than
willing to answer my questions! You can also
[hit me up](https://bento.me/adrianamvillela), and I'll try my best to answer
your questions, or to direct you to folks who have the answers!