Add blog post on Source CRDs

odigos-io · Jan 15, 2025 · 1ebc0ed
1 parent c518ea3
commit 1ebc0ed
Showing 1 changed file with 298 additions and 0 deletions.
diff --git a/markdown/docs/source-crd.mdx b/markdown/docs/source-crd.mdx
@@ -0,0 +1,298 @@
+---
+pubDate: 'Jan 24 2024'
+title: 'The new Source of truth: introducing persistent, stateless instrumentation with Source objects'
+image: '/collectors_cover.png'
+category: 'Golang'
+description: 'Learn about Source resources in Odigos'
+tags: [golang, resource-management]
+authorImage: '/mike.jpg'
+author: Mike Dame
+metadata: application management
+---
+
+We are excited to announce a change to the way you control instrumented workloads with Odigos: Source objects.
+
+The new Source object is a Custom Resource that allows declarative
+control over which workloads and namespaces are instrumented in your
+cluster. This is an evolution from the current approach in which you
+enable or disable instrumentation by setting the
+`odigos-instrumentation=enabled` label directly on your workload.
+
+An example Source object looks like this:
+
+```yaml
+apiVersion: odigos.io/v1alpha1
+kind: Source
+metadata:
+  name: my-source
+  namespace: default
+spec:
+  workload:
+    name: my-app
+    namespace: default
+    kind: Deployment
+```
+
+With these objects, we can provide the same functionality and control that's offered today, along with several benefits that we'll explain below.
+
+But first, a few things to note:
+
+- **Sources are namespaced objects**, created in the same namespace as
+    the workload they represent. This gives application teams the
+    independence to own their instrumentation, while giving system
+    administrators, DevOps, and SRE teams flexible RBAC control.
+- **Sources identify a workload by name, namespace, and kind.** This
+    allows Odigos to uniquely identify the workload to be
+    instrumented. In the future, we hope to add wider identification
+    capabilities such as LabelSelectors to allow you to define custom
+    application topologies.
+- **Sources are entirely decoupled from applications.** Instrumenting
+    a workload with Sources means that there is no change required to
+    the workload's definition. This allows Sources to persist beyond
+    the lifecycle of a workload, so instrumentation is re-enabled
+    idempotently even if the workload is deleted and re-deployed.
+- **Sources exist 1:1 with applications.** Only one Source can exist
+    for each workload (except in the case of namespace instrumentation
+    – more on that later).
+- **Source workload fields are immutable once created.** When a Source
+    is linked to a workload, it is linked to that workload forever
+    (until the Source is deleted).
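+
+The RBAC point above can be made concrete with a standard Kubernetes
+Role. This is a minimal sketch, not an Odigos-provided manifest: the
+Role name `source-editor` is hypothetical, and the `odigos.io` API
+group and `sources` resource name are inferred from the `apiVersion`
+and `kubectl get sources` examples in this post.
+
+```yaml
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: source-editor   # hypothetical name, chosen for illustration
+  namespace: default    # Sources live in the workload's namespace
+rules:
+  - apiGroups: ["odigos.io"]
+    resources: ["sources"]
+    # update/patch are left out here; add them if your team needs to
+    # modify existing Sources rather than delete and re-create them.
+    verbs: ["get", "list", "watch", "create", "delete"]
+```
+
+Binding a Role like this to an application team with a RoleBinding
+would let that team manage instrumentation in its own namespace
+without granting access to Sources elsewhere in the cluster.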
+
+## Why this change?
+
+In Odigos today, instrumenting your applications is as simple as
+applying a label to a workload or namespace. So why make the switch to
+a resource?
+
+While the simplicity of instrumentation labels is certainly
+beneficial, this approach also has its drawbacks. When talking to
+users about this, some issues became clear:
+
+- Using a label on the workload ties instrumentation state to the
+  workload's definition, meaning that if the workload is modified then
+  care must be taken to persist the label for instrumentation to
+  remain effective. This causes problems with continuous deployment
+  systems that might overwrite labels by default. It also means that
+  distributed teams need to be aware of their auto-instrumentation,
+  which shouldn't be a concern for them.
+- Relying on the label's value means that our backend needs to not
+  only monitor the presence of the label, but also changes to its
+  value. This creates an edge-based system that is difficult for
+  Kubernetes operators to resiliently consume at scale.
+- Kubernetes labels are intended for identifying and grouping
+  workloads, not as a control mechanism. In contrast, CRDs are a
+  common config pattern for users to interact with Kubernetes
+  operators.
+
+Switching to a resource-based control mechanism addresses these by
+leveraging idiomatic Kubernetes operator patterns to reconcile the
+state of the cluster in a truly level-based design. It creates a
+user-facing config surface to declare the desired state of the cluster
+through a persistent and reliable Source of truth.
+
+With Sources, you are declaring that a workload (or namespace) should
+be instrumented, *forever*, as long as that Source exists. This means
+the workload can be deleted, re-deployed, or modified in any way and
+Odigos will still continue to recognize it.
+
+This decoupled persistence is critical to the role of Odigos as a
+low-overhead, non-invasive tool for auto-instrumentation.
+
+## How to use Sources
+
+Functionally, Source objects provide full feature parity with the
+auto-instrumentation controls in Odigos today. That includes:
+
+- Instrumenting individual workloads
+- Instrumenting entire namespaces
+- Excluding individual workloads from namespace instrumentation
+
+### Instrumenting individual workloads
+
+Say you have a simple app consisting of two deployments: `frontend`
+and `backend`. To instrument both of these, you will create two Source
+objects:
+
+```yaml
+apiVersion: odigos.io/v1alpha1
+kind: Source
+metadata:
+  name: frontend
+  namespace: default
+spec:
+  workload:
+    name: frontend
+    namespace: default
+    kind: Deployment
+---
+apiVersion: odigos.io/v1alpha1
+kind: Source
+metadata:
+  name: backend
+  namespace: default
+spec:
+  workload:
+    name: backend
+    namespace: default
+    kind: Deployment
+```
+
+When these objects are created, Odigos will find these workloads,
+detect their runtimes, and automatically instrument them just as it
+does today.
+
+You can confirm that these Sources were created with `kubectl`:
+
+```
+$ kubectl get sources
+NAME       WORKLOAD   KIND         NAMESPACE
+frontend   frontend   Deployment   default
+backend    backend    Deployment   default
+```
+
+### Instrumenting entire namespaces
+
+Instrumenting an entire namespace is similar to instrumenting
+individual workloads, except it only requires one Source object for
+the entire namespace. Following the above example, that would be:
+
+```yaml
+apiVersion: odigos.io/v1alpha1
+kind: Source
+metadata:
+  name: myapp-namespace
+  namespace: default
+spec:
+  workload:
+    name: default
+    namespace: default
+    kind: Namespace
+```
+
+In this case, the Source object must set `kind: Namespace`, with both
+`name` and `namespace` equal to the namespace to instrument. This will
+instrument all of the possible workloads in the namespace under the
+single Source:
+
+```
+$ kubectl get sources
+NAME              WORKLOAD   KIND        NAMESPACE
+myapp-namespace   default    Namespace   default
+```
+
+You can also create an individual workload Source in the same
+namespace. For example, if you also create the `frontend` Source
+shown earlier, the list for the namespace would look like this:
+
+```
+$ kubectl get sources
+NAME              WORKLOAD   KIND         NAMESPACE
+myapp-namespace   default    Namespace    default
+frontend          frontend   Deployment   default
+```
+
+Doing so will keep the `frontend` deployment instrumented even if the
+`myapp-namespace` Source is deleted.
+
+### Excluding individual workloads from namespace instrumentation
+
+Similar to the last example, a Source can exclude individual workloads
+from namespace-wide instrumentation. This is done by setting
+`disableInstrumentation: true` on the Source.
+
+```yaml
+apiVersion: odigos.io/v1alpha1
+kind: Source
+metadata:
+  name: backend-excluded
+  namespace: default
+spec:
+  disableInstrumentation: true
+  workload:
+    name: backend
+    namespace: default
+    kind: Deployment
+```
+
+Now, if the namespace is instrumented, the backend deployment will not
+be instrumented (as long as the Source is present).
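+
+Putting the last two sections together, a namespace in which
+everything except `backend` is instrumented could be declared as
+follows. This sketch only combines the manifests already shown above
+into one file; it assumes no fields beyond those.
+
+```yaml
+# Instrument every possible workload in the `default` namespace.
+apiVersion: odigos.io/v1alpha1
+kind: Source
+metadata:
+  name: myapp-namespace
+  namespace: default
+spec:
+  workload:
+    name: default
+    namespace: default
+    kind: Namespace
+---
+# Opt the `backend` Deployment out of the namespace-wide Source.
+apiVersion: odigos.io/v1alpha1
+kind: Source
+metadata:
+  name: backend-excluded
+  namespace: default
+spec:
+  disableInstrumentation: true
+  workload:
+    name: backend
+    namespace: default
+    kind: Deployment
+```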
+
+## Working with Sources
+
+When a Source object is created, Odigos will automatically apply
+labels to that Source that mirror the `workload` field. For example:
+
+```yaml
+apiVersion: odigos.io/v1alpha1
+kind: Source
+metadata:
+  name: backend
+  namespace: default
+  labels:
+    odigos.io/workload-name: backend # added automatically
+    odigos.io/workload-namespace: default # added automatically
+    odigos.io/workload-kind: Deployment # added automatically
+spec:
+  workload:
+    name: backend
+    namespace: default
+    kind: Deployment
+```
+
+This is intended to provide flexible management of Sources using
+native Kubernetes LabelSelectors:
+
+```
+$ kubectl get sources -l odigos.io/workload-kind=Deployment
+NAME       WORKLOAD   KIND         NAMESPACE
+frontend   frontend   Deployment   default
+backend    backend    Deployment   default
+```
+
+This approach was chosen to support older versions of Kubernetes that
+do not yet have default support for [selectable
+fields](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#crd-selectable-fields).
+
+As mentioned earlier, keep in mind that a Source's workload fields are
+immutable and that Sources must exist 1:1 with the workload or
+namespace they instrument. Odigos validates both of these rules on
+creation and on any attempt to update an existing Source.
+
+## Continued support for label-based instrumentation
+
+As mentioned above, the current support for label-based
+instrumentation is convenient and we understand that some users may
+prefer it. While we encourage users to migrate to Source objects, we
+will continue to provide support for the current label-based approach
+as a shorthand method for creating Sources.
+
+Going forward, when Odigos finds a workload or namespace that is
+labeled with `odigos-instrumentation=enabled`, the backend will automatically
+create a matching Source object.
+
+This operation is idempotent, meaning that re-creating the same
+workload with `odigos-instrumentation=enabled` will not create a new Source
+object if one already exists for that workload.
+
+Similarly, labeling a workload or namespace with
+`odigos-instrumentation=disabled` will create a Source for that workload
+with `disableInstrumentation: true`.
+
+However, both of these actions will log a warning in the Instrumentor.
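+
+For reference, a workload that uses the label shorthand might look
+like the sketch below. Only the `odigos-instrumentation` label comes
+from this post. The `app: frontend` selector and the container image
+are hypothetical placeholders, and the label is assumed to sit in the
+Deployment's own `metadata.labels` rather than in the Pod template.
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: frontend
+  namespace: default
+  labels:
+    # Shorthand: Odigos creates a matching Source for this workload.
+    odigos-instrumentation: "enabled"
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: frontend            # hypothetical selector label
+  template:
+    metadata:
+      labels:
+        app: frontend
+    spec:
+      containers:
+        - name: frontend
+          image: example.com/frontend:latest   # hypothetical image
+```
+
+Based on the behavior described above, Odigos would respond by
+creating a Source whose `spec.workload` points at this Deployment. The
+name given to that generated Source is not specified in this post.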
+
+## Future plans
+
+While Source objects today are simply a way to enable or disable
+instrumentation, in the future we plan to expand this object to
+provide more functionality, such as:
+
+- LabelSelector-based workload selection
+- Source-to-destination grouping configuration
+- Instrumentation status conditions
+
+With features like these in mind, we plan for Sources to be the main
+user-facing Kubernetes resource within Odigos going forward.
+
+Check out our [documentation on
+Sources](https://docs.odigos.io/pipeline/sources/adding-sources) to
+learn more.