Commit

Add blog post on Source CRDs
damemi committed Jan 15, 2025
1 parent c518ea3 commit 1ebc0ed
Showing 1 changed file with 298 additions and 0 deletions.
298 changes: 298 additions & 0 deletions markdown/docs/source-crd.mdx
@@ -0,0 +1,298 @@
---
pubDate: 'Jan 24 2024'
title: 'The new Source of truth: introducing persistent, stateless instrumentation with Source objects'
image: '/collectors_cover.png'
category: 'Golang'
description: 'Learn about Source resources in Odigos'
tags: [golang, resource-management]
authorImage: '/mike.jpg'
author: Mike Dame
metadata: application management
---

We are excited to announce a change to the way you control instrumented workloads with Odigos: Source objects.

The new Source object is a Custom Resource that allows declarative
control over which workloads and namespaces are instrumented in your
cluster. This is an evolution from the current approach in which you
enable or disable instrumentation by setting the
`odigos-instrumentation=enabled` label directly on your workload.

An example Source object looks like this:

```yaml
apiVersion: odigos.io/v1alpha1
kind: Source
metadata:
  name: my-source
  namespace: default
spec:
  workload:
    name: my-app
    namespace: default
    kind: Deployment
```

With these objects, we can provide the same functionality and control that's offered today with several benefits, which we'll explain below.

But first, a few things to note:

- **Sources are namespaced objects**, created in the same namespace as
the workload they represent. This gives application teams the
independence to own their own instrumentation as well as providing
flexible RBAC control for system admin, DevOps, and SRE teams.
- **Sources identify a workload by name, namespace, and kind.** This
allows Odigos to uniquely identify the workload to be
instrumented. In the future, we hope to add wider identification
capabilities such as LabelSelectors to allow you to define custom
application topologies.
- **Sources are entirely decoupled from applications.** Instrumenting
a workload with Sources means that there is no change required to
the workload's definition. This allows Sources to persist beyond
the lifecycle of a workload, so instrumentation is re-enabled
idempotently even if the workload is deleted and re-deployed.
- **Sources exist 1:1 with applications.** Only one Source can exist
for each workload (except in the case of namespace instrumentation
– more on that later).
- **Source workload fields are immutable once created.** When a Source
is linked to a workload, it is linked to that workload forever
(until the Source is deleted).

## Why this change?

In Odigos today, instrumenting your applications is as simple as
applying a label to a workload or namespace. So why make the switch to
a resource?

While the simplicity of instrumentation labels is certainly
beneficial, this approach also has its drawbacks. When talking to
users about this, some issues became clear:

- Using a label on the workload creates a stateful relationship on the
workload's definition, meaning that if the workload is modified then
care must be taken to persist the label for instrumentation to
remain effective. This causes problems with continuous deployment
systems that might overwrite labels by default. It also means that
distributed teams need to be aware of their auto-instrumentation,
which shouldn't be a concern for them.
- Relying on the label's value means that our backend needs to not
only monitor the presence of the label, but also changes to its
value. This creates an edge-based system that is difficult for
Kubernetes operators to resiliently consume at scale.
- Kubernetes labels are intended for identifying and grouping
workloads, not as a control mechanism. In contrast, CRDs are a
common config pattern for users to interact with Kubernetes
operators.

Switching to a resource-based control mechanism addresses these by
leveraging idiomatic Kubernetes operator patterns to reconcile the
state of the cluster in a truly level-based design. It creates a
user-facing config surface to declare the desired state of the cluster
through a persistent and reliable Source of truth.

With Sources, you are declaring that a workload (or namespace) should
be instrumented, *forever*, as long as that Source exists. This means
the workload can be deleted, re-deployed, or modified in any way and
Odigos will continue to recognize it.

This decoupled persistence is critical to the role of Odigos as a
low-overhead, non-invasive tool for auto-instrumentation.
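
To make the level-based idea concrete, here is a minimal sketch in Go. It is not the Odigos implementation: the `reconcile` function and the `namespace/name` string keys are hypothetical, chosen only to show that the decision depends on the state that exists right now (which workloads have a Source, which are instrumented) rather than on a history of label-change events.

```go
package main

import (
	"fmt"
	"sort"
)

// reconcile compares the desired state (workloads that have a Source) with
// the actual state (workloads currently instrumented) and returns the
// actions needed to make them match. It reads only current state, never a
// stream of change events, which is what makes it level-based.
// Hypothetical helper for illustration; not part of Odigos.
func reconcile(desired, actual map[string]bool) []string {
	var actions []string
	for w, want := range desired {
		if want && !actual[w] {
			actions = append(actions, "instrument "+w)
		}
	}
	for w, have := range actual {
		if have && !desired[w] {
			actions = append(actions, "uninstrument "+w)
		}
	}
	sort.Strings(actions) // map iteration order is random; sort for stable output
	return actions
}

func main() {
	desired := map[string]bool{"default/frontend": true, "default/backend": true}
	actual := map[string]bool{"default/frontend": true, "default/legacy": true}
	for _, action := range reconcile(desired, actual) {
		fmt.Println(action)
	}
}
```

Running the same function again after the actions are applied returns nothing, so a missed or repeated event cannot leave the cluster in the wrong state.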

## How to use Sources?

Functionally, Source objects provide full feature parity with the
auto-instrumentation controls in Odigos today. That includes:

- Instrumenting individual workloads
- Instrumenting entire namespaces
- Excluding individual workloads from namespace instrumentation

### Instrumenting individual workloads

Say you have a simple app consisting of two deployments: `frontend`
and `backend`. To instrument both of these, you will create two Source
objects:

```yaml
apiVersion: odigos.io/v1alpha1
kind: Source
metadata:
  name: frontend
  namespace: default
spec:
  workload:
    name: frontend
    namespace: default
    kind: Deployment
---
apiVersion: odigos.io/v1alpha1
kind: Source
metadata:
  name: backend
  namespace: default
spec:
  workload:
    name: backend
    namespace: default
    kind: Deployment
```

When these objects are created, Odigos will find these workloads,
detect their runtimes, and automatically instrument them just as it
does today.

You can confirm that these Sources were created with `kubectl`:

```
$ kubectl get sources
NAME       WORKLOAD   KIND         NAMESPACE
frontend   frontend   Deployment   default
backend    backend    Deployment   default
```

### Instrumenting entire namespaces

Instrumenting an entire namespace is similar to instrumenting
individual workloads, except it only requires one Source object for
the entire namespace. Following the above example, that would be:

```yaml
apiVersion: odigos.io/v1alpha1
kind: Source
metadata:
  name: myapp-namespace
  namespace: default
spec:
  workload:
    name: default
    namespace: default
    kind: Namespace
```

In this case, the Source object must set `kind: Namespace`, with both
`name` and `namespace` equal to the namespace to instrument. This will
instrument all of the possible workloads in the namespace under the
single Source:

```
$ kubectl get sources
NAME              WORKLOAD   KIND        NAMESPACE
myapp-namespace   default    Namespace   default
```

You can also create an individual workload Source in the same
namespace. For example, if you create the `frontend` Source shown
earlier, the list for the namespace would look like this:

```
$ kubectl get sources
NAME              WORKLOAD   KIND         NAMESPACE
myapp-namespace   default    Namespace    default
frontend          frontend   Deployment   default
```

Doing so will keep the `frontend` deployment instrumented even if the
`myapp-namespace` Source is deleted.

### Excluding individual workloads from namespace instrumentation

Similar to the last example, a Source can exclude individual workloads
from namespace-wide instrumentation. This is done by setting
`disableInstrumentation: true` on the Source.

```yaml
apiVersion: odigos.io/v1alpha1
kind: Source
metadata:
  name: backend-excluded
  namespace: default
spec:
  disableInstrumentation: true
  workload:
    name: backend
    namespace: default
    kind: Deployment
```

Now, if the namespace is instrumented, the backend deployment will not
be instrumented (as long as the Source is present).
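
The three cases above (workload Source, namespace Source, and exclusion) can be read as a single decision rule. The Go sketch below is one way to express it, under assumptions: `WorkloadRef`, `Source`, and `shouldInstrument` are simplified, hypothetical stand-ins rather than the Odigos API types, and the rule that a workload-level Source always takes precedence over a namespace Source is inferred from the behavior described in this post.

```go
package main

import "fmt"

// WorkloadRef and Source mirror the fields shown in the YAML examples.
// They are simplified, hypothetical types, not the Odigos API types.
type WorkloadRef struct {
	Name, Namespace, Kind string
}

type Source struct {
	Workload               WorkloadRef
	DisableInstrumentation bool
}

// shouldInstrument reports whether w should be instrumented given the
// Sources that currently exist. A Source that names the workload directly
// decides the outcome (assumed precedence); otherwise an enabled Namespace
// Source for the workload's namespace covers it.
func shouldInstrument(w WorkloadRef, sources []Source) bool {
	namespaceEnabled := false
	for _, s := range sources {
		if s.Workload == w {
			return !s.DisableInstrumentation
		}
		if s.Workload.Kind == "Namespace" && s.Workload.Name == w.Namespace && !s.DisableInstrumentation {
			namespaceEnabled = true
		}
	}
	return namespaceEnabled
}

func main() {
	sources := []Source{
		{Workload: WorkloadRef{Name: "default", Namespace: "default", Kind: "Namespace"}},
		{Workload: WorkloadRef{Name: "backend", Namespace: "default", Kind: "Deployment"}, DisableInstrumentation: true},
	}
	frontend := WorkloadRef{Name: "frontend", Namespace: "default", Kind: "Deployment"}
	backend := WorkloadRef{Name: "backend", Namespace: "default", Kind: "Deployment"}

	fmt.Println("frontend:", shouldInstrument(frontend, sources)) // covered by the namespace Source
	fmt.Println("backend:", shouldInstrument(backend, sources))   // excluded by its own Source
}
```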

## Working with Sources

When a Source object is created, Odigos will automatically apply
labels to that Source that mirror the `workload` field. For example:

```yaml
apiVersion: odigos.io/v1alpha1
kind: Source
metadata:
  name: backend
  namespace: default
  labels:
    odigos.io/workload-name: backend # added automatically
    odigos.io/workload-namespace: default # added automatically
    odigos.io/workload-kind: Deployment # added automatically
spec:
  workload:
    name: backend
    namespace: default
    kind: Deployment
```

This is intended to provide flexible management of Sources using
native Kubernetes LabelSelectors:

```
$ kubectl get sources -l odigos.io/workload-kind=Deployment
NAME       WORKLOAD   KIND         NAMESPACE
frontend   frontend   Deployment   default
backend    backend    Deployment   default
```

This approach was chosen to support older versions of Kubernetes that
do not yet have default support for [selectable
fields](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#crd-selectable-fields).
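
As an illustration of how the mirrored labels relate to the `workload` field, the Go sketch below derives the three labels from a workload reference and renders them as an equality-based selector string that could be passed to `kubectl get sources -l`. The label keys are the ones shown in the example above; the `WorkloadRef` type and the `workloadLabels` and `selector` helpers are hypothetical and are not taken from the Odigos codebase.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// WorkloadRef is a simplified, hypothetical stand-in for spec.workload.
type WorkloadRef struct {
	Name, Namespace, Kind string
}

// workloadLabels returns labels that mirror the workload reference,
// using the label keys shown in the example Source above.
func workloadLabels(w WorkloadRef) map[string]string {
	return map[string]string{
		"odigos.io/workload-name":      w.Name,
		"odigos.io/workload-namespace": w.Namespace,
		"odigos.io/workload-kind":      w.Kind,
	}
}

// selector renders labels as a comma-separated, equality-based label
// selector. Keys are sorted so the result is deterministic.
func selector(labels map[string]string) string {
	parts := make([]string, 0, len(labels))
	for k, v := range labels {
		parts = append(parts, k+"="+v)
	}
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

func main() {
	backend := WorkloadRef{Name: "backend", Namespace: "default", Kind: "Deployment"}
	fmt.Println(selector(workloadLabels(backend)))
}
```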

As mentioned earlier, keep in mind that a Source's workload fields are
immutable and that Sources must exist 1:1 with the workload or
namespace they instrument. Odigos validates both rules when a Source
is created and on any attempt to update an existing Source.
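
The two rules can be sketched as a pair of checks, one for creation and one for updates. This Go example is only an illustration of the rules as stated in this post; the types, function names, and error messages are hypothetical, and it assumes that fields other than the workload reference (such as `disableInstrumentation`) may still change.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified, hypothetical stand-ins for the Source fields used in this post.
type WorkloadRef struct {
	Name, Namespace, Kind string
}

type Source struct {
	Name                   string
	Workload               WorkloadRef
	DisableInstrumentation bool
}

var (
	errDuplicate = errors.New("a Source already exists for this workload")
	errImmutable = errors.New("spec.workload is immutable")
)

// validateCreate enforces the 1:1 rule: a new Source is rejected if an
// existing Source already targets the same workload.
func validateCreate(existing []Source, s Source) error {
	for _, e := range existing {
		if e.Workload == s.Workload {
			return errDuplicate
		}
	}
	return nil
}

// validateUpdate enforces immutability of the workload reference.
// Assumption: other fields are allowed to change.
func validateUpdate(old, updated Source) error {
	if old.Workload != updated.Workload {
		return errImmutable
	}
	return nil
}

func main() {
	backend := Source{Name: "backend", Workload: WorkloadRef{Name: "backend", Namespace: "default", Kind: "Deployment"}}

	duplicate := backend
	duplicate.Name = "backend-again"
	fmt.Println("create duplicate:", validateCreate([]Source{backend}, duplicate))

	retargeted := backend
	retargeted.Workload.Name = "frontend"
	fmt.Println("retarget workload:", validateUpdate(backend, retargeted))
}
```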

## Continued support for label-based instrumentation

As mentioned above, the current support for label-based
instrumentation is convenient and we understand that some users may
prefer it. While we encourage users to migrate to Source objects, we
will continue to provide support for the current label-based approach
as a shorthand method for creating Sources.

Going forward, when Odigos finds a workload or namespace that is
labeled with `odigos-instrumentation=enabled`, the backend will automatically
create a matching Source object.

This operation is idempotent, meaning that re-creating the same
workload with `odigos-instrumentation=enabled` will not create a new Source
object if one already exists for that workload.

Similarly, labeling a workload or namespace with
`odigos-instrumentation=disabled` will create a Source for that workload
with `disableInstrumentation: true`.

However, both of these actions will log a warning in the Instrumentor.
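
The idempotent label-to-Source behavior described in this section can be sketched as a "create only if absent" operation. The Go example below is an assumption-laden illustration, not the Instrumentor's code: the in-memory `store` map stands in for the cluster, `ensureSource` is a hypothetical helper, and leaving an existing Source untouched when the label value differs is a choice made for this sketch.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the Source fields used in this post.
type WorkloadRef struct {
	Name, Namespace, Kind string
}

type Source struct {
	Workload               WorkloadRef
	DisableInstrumentation bool
}

const instrumentationLabel = "odigos-instrumentation"

// ensureSource creates a Source for w from its instrumentation label if no
// Source exists yet, and reports whether one was created. Calling it again
// with the same input changes nothing, which is what makes it idempotent.
func ensureSource(store map[WorkloadRef]Source, w WorkloadRef, labels map[string]string) bool {
	value, ok := labels[instrumentationLabel]
	if !ok || (value != "enabled" && value != "disabled") {
		return false // no recognized label: nothing to do
	}
	if _, exists := store[w]; exists {
		return false // a Source already exists: leave it untouched (assumption)
	}
	store[w] = Source{Workload: w, DisableInstrumentation: value == "disabled"}
	return true
}

func main() {
	store := map[WorkloadRef]Source{}
	backend := WorkloadRef{Name: "backend", Namespace: "default", Kind: "Deployment"}
	labels := map[string]string{instrumentationLabel: "enabled"}

	fmt.Println(ensureSource(store, backend, labels)) // true: Source created
	fmt.Println(ensureSource(store, backend, labels)) // false: already exists
	fmt.Println(len(store))                           // 1
}
```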

## Future plans

While Source objects today are simply a way to enable or disable
instrumentation, in the future we plan to expand this object to
provide more functionality, such as:

- LabelSelector-based workload selection
- Source-to-destination grouping configuration
- Instrumentation status conditions

With features like these in mind, we plan for Sources to be the main
user-facing Kubernetes resource within Odigos going forward.

Check out our [documentation on
Sources](https://docs.odigos.io/pipeline/sources/adding-sources) to
learn more.
