Improve configuration specification
* extract the "owners" attribute to a more descriptive configuration option
* use consistent naming across fields
* update docs and describe how to use fields in more detail
* small fixes
pmm-sumo committed Mar 25, 2020
1 parent 030e6dc commit bf8c31d
Showing 11 changed files with 157 additions and 135 deletions.
111 changes: 73 additions & 38 deletions processor/k8sprocessor/README.md
@@ -15,27 +15,62 @@ There are several top-level sections of the processor config (a minimal sketch of how they fit together follows the list below):
- `passthrough` (default = false): when set to true, only annotates resources with the pod IP and
does not try to extract any other metadata. It does not need access to the K8S cluster API.
Agent/Collector must receive spans directly from services to be able to correctly detect the pod IPs.
- `pod_ip_debugging` (default = false): when set to true, enables verbose logs that help
with verifying how the Pod IP is assigned during metadata tagging
- `owner_lookup_enabled` (default = false): when set to true, fields such as `daemonSetName`,
`replicaSetName`, etc. can be extracted, though this requires additional Kubernetes API calls to traverse
the `owner` relationship. See the [list of fields](#k8sprocessor-extract) for details on which tags
require this flag to be enabled.
- `extract`: the section (see [below](#k8sprocessor-extract)) allows specifying extraction rules
- `filter`: the section (see [below](#k8sprocessor-filter)) allows specifying filters when matching pods
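
A minimal sketch of how these top-level sections fit together; the values here are illustrative, see the
full [example config](#k8sprocessor-example) for a complete listing:

```yaml
processors:
  k8s_tagger:
    passthrough: false
    owner_lookup_enabled: true
    extract:
      metadata:
        - podName
        - deploymentName
    filter:
      namespace: my-apps   # illustrative namespace name
```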

#### <a name="k8sprocessor-extract"></a>Extract section

Allows specifying extraction rules to extract data from k8s pod specs.

- `metadata` (default = empty): specifies a list of strings that denote extracted fields. See
[example config](#k8sprocessor-example) for the list of fields.
*Note: `owners` is a special field which enables traversing the ownership tree to pull data such
as `deploymentSetName`, `serviceName`, `daemonSetName`, `statefulSetName`, etc.*
- `tags` (default = empty): specifies an optional map of custom tags to be used. When provided,
specified fields use provided names when being tagged, e.g.:
- `metadata` (default = empty): specifies a list of strings that denote extracted fields. The following fields
can be extracted:
- `containerId`
- `containerName`
- `containerImage`
- `clusterName`
- `daemonSetName` _(`owner_lookup_enabled` must be set to `true`)_
- `deploymentName`
- `hostName`
- `namespace`
- `namespaceId` _(`owner_lookup_enabled` must be set to `true`)_
- `nodeName`
- `podId`
- `podName`
- `replicaSetName` _(`owner_lookup_enabled` must be set to `true`)_
- `serviceName` _(`owner_lookup_enabled` must be set to `true`)_
- `startTime`
- `statefulSetName` _(`owner_lookup_enabled` must be set to `true`)_

Also, see [example config](#k8sprocessor-example).
- `tags`: specifies an optional map of custom tag names to be used. By default, the following names are assigned:
- `clusterName `: `k8s.cluster.name`
- `containerID `: `k8s.container.id`
- `containerImage `: `k8s.container.image`
- `containerName `: `k8s.container.name`
- `daemonSetName `: `k8s.daemonset.name`
- `deploymentName `: `k8s.deployment.name`
- `hostName `: `k8s.pod.hostname`
- `namespaceName `: `k8s.namespace.name`
- `namespaceID `: `k8s.namespace.id`
- `nodeName `: `k8s.node.name`
- `podID `: `k8s.pod.id`
- `podName `: `k8s.pod.name`
- `replicaSetName `: `k8s.replicaset.name`
- `serviceName `: `k8s.service.name`
- `statefulSetName`: `k8s.statefulset.name`
- `startTime `: `k8s.pod.startTime`

When a custom value is provided, the corresponding field is tagged with that name, e.g.:
```yaml
tags:
containerId: my-custom-tag-for-container
node: kubernetes.node
containerId: my-custom-tag-for-container-id
nodeName: node_name
```
- `annotations` (default = empty): a list of rules for extracting and recording annotation data.
See [field extract config](#k8sprocessor-field-extract) for an example on how to use it.
- `labels` (default = empty): a list of rules for extracting and recording label data.
See [field extract config](#k8sprocessor-field-extract) for an example on how to use it.
@@ -136,58 +171,58 @@ pods by generic k8s pod labels. Only the following operations (`op`) are supported
processors:
k8s_tagger:
passthrough: false
owner_lookup_enabled: true # To enable fetching additional metadata using `owner` relationship
extract:
metadata:
# extract the following well-known metadata fields
- containerId
- containerName
- containerImage
- cluster
- clusterName
- daemonSetName
- deployment
- deploymentName
- hostName
- namespace
- namespaceId
- node
- owners
- nodeName
- podId
- podName
- replicaSetName
- serviceName
- startTime
- statefulSetName
tags:
# It is possible to provide your custom key names for each of the extracted metadata:
containerId: k8s.pod.containerId
# It is possible to provide your custom key names for each of the extracted metadata fields,
# e.g. to store podName as "pod_name" rather than the default "k8s.pod.name", use the following:
podName: pod_name

annotations:
# Extract all annotations using a template
- tag_name: k8s.annotation.%s
key: "*"
labels:
# Extract all labels using a template
- tag_name: k8s.label.%s
key: "*"
- tag_name: l1 # extracts value of label with key `label1` and inserts it as a tag with key `l1`
key: label1
- tag_name: l2 # extracts value of label with key `label2` with regexp and inserts it as a tag with key `l2`
key: label2
regex: field=(?P<value>.+)

filter:
# The pods might be filtered, just uncomment the relevant section and
# fill it with actual value, e.g.:
#
# namespace: ns2 # only look for pods running in ns2 namespace
# node: ip-111.us-west-2.compute.internal # only look for pods running on this node/host
# node_from_env_var: K8S_NODE # only look for pods running on the node/host specified by the K8S_NODE environment variable
# labels: # only consider pods that match the following labels
# - key: key1 # match pods that have a label `key1=value1`. `op` defaults to "equals" when not specified
# value: value1
# - key: key2 # ignore pods that have a label `key2=value2`.
# value: value2
# op: not-equals
# fields: # works the same way as labels but for fields instead (like annotations)
# - key: key1
# value: value1
# - key: key2
# value: value2
# op: not-equals
namespace: ns2 # only look for pods running in ns2 namespace
node: ip-111.us-west-2.compute.internal # only look for pods running on this node/host
node_from_env_var: K8S_NODE # only look for pods running on the node/host specified by the K8S_NODE environment variable
labels: # only consider pods that match the following labels
- key: key1 # match pods that have a label `key1=value1`. `op` defaults to "equals" when not specified
value: value1
- key: key2 # ignore pods that have a label `key2=value2`.
value: value2
op: not-equals
fields: # works the same way as labels but for fields instead (like annotations)
- key: key1
value: value1
- key: key2
value: value2
op: not-equals
```
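
As a usage sketch, the processor is enabled by adding it to a pipeline in the collector configuration;
the `otlp` receiver and `logging` exporter below are illustrative assumptions and not part of this
processor's documentation, only the `k8s_tagger` entry comes from this README:

```yaml
receivers:
  otlp:       # illustrative receiver, assumed to be available in the collector build
exporters:
  logging:    # illustrative exporter, assumed to be available in the collector build
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [k8s_tagger]
      exporters: [logging]
```
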
### RBAC
6 changes: 3 additions & 3 deletions processor/k8sprocessor/config.go
@@ -28,9 +28,9 @@ type Config struct {
// directly from services to be able to correctly detect the pod IPs.
Passthrough bool `mapstructure:"passthrough"`

// PodIPDebugging enables verbose logs so it could be verified
// how the Pod IP is being assigned when doing metadata tagging
PodIPDebugging bool `mapstructure:"pod_ip_debugging"`
// OwnerLookupEnabled enables pulling owner data, which triggers
// additional calls to Kubernetes API
OwnerLookupEnabled bool `mapstructure:"owner_lookup_enabled"`

// Extract section allows specifying extraction rules to extract
// data from k8s pod specs
7 changes: 4 additions & 3 deletions processor/k8sprocessor/config_test.go
@@ -59,11 +59,12 @@ func TestLoadConfig(t *testing.T) {
TypeVal: "k8s_tagger",
NameVal: "k8s_tagger/2",
},
Passthrough: false,
Passthrough: false,
OwnerLookupEnabled: true,
Extract: ExtractConfig{
Metadata: []string{
"containerId", "containerName", "containerImage", "cluster", "daemonSetName",
"deployment", "hostName", "namespace", "namespaceId", "node", "owners", "podId",
"containerId", "containerName", "containerImage", "clusterName", "daemonSetName",
"deploymentName", "hostName", "namespace", "namespaceId", "nodeName", "podId",
"podName", "replicaSetName", "serviceName", "startTime", "statefulSetName",
},
Tags: map[string]string{
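
The expectations above refer to the `k8s_tagger/2` entry of the test's config file, which is not part of
this diff; judging only from the expected values, that entry would presumably look roughly like this:

```yaml
k8s_tagger/2:
  owner_lookup_enabled: true
  extract:
    metadata:
      - containerId
      - containerName
      - containerImage
      - clusterName
      - daemonSetName
      - deploymentName
      - hostName
      - namespace
      - namespaceId
      - nodeName
      - podId
      - podName
      - replicaSetName
      - serviceName
      - startTime
      - statefulSetName
```
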
8 changes: 4 additions & 4 deletions processor/k8sprocessor/factory.go
@@ -62,16 +62,16 @@ func (f *Factory) CreateTraceProcessor(
opts = append(opts, WithPassthrough())
}

if oCfg.PodIPDebugging {
opts = append(opts, WithPodIPDebugging())
}

// extraction rules
opts = append(opts, WithExtractMetadata(oCfg.Extract.Metadata...))
opts = append(opts, WithExtractLabels(oCfg.Extract.Labels...))
opts = append(opts, WithExtractAnnotations(oCfg.Extract.Annotations...))
opts = append(opts, WithExtractTags(oCfg.Extract.Tags))

if oCfg.OwnerLookupEnabled {
opts = append(opts, WithOwnerLookupEnabled())
}

// filters
opts = append(opts, WithFilterNode(oCfg.Filter.Node, oCfg.Filter.NodeFromEnvVar))
opts = append(opts, WithFilterNamespace(oCfg.Filter.Namespace))
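
Reading `CreateTraceProcessor` together with the configuration, each config field translates into one of
the functional options used above; a sketch of that mapping (config keys are taken from the README, option
names from this file):

```yaml
processors:
  k8s_tagger:
    passthrough: true           # -> WithPassthrough() (only added when true)
    owner_lookup_enabled: true  # -> WithOwnerLookupEnabled() (only added when true)
    extract:
      metadata: [podName]       # -> WithExtractMetadata("podName")
    filter:
      namespace: ns2            # -> WithFilterNamespace("ns2")
```
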
42 changes: 22 additions & 20 deletions processor/k8sprocessor/kube/client.go
@@ -72,13 +72,15 @@ func New(logger *zap.Logger, rules ExtractionRules, filters Filters, newClientSe
}
c.kc = kc

if newOwnerProviderFunc == nil {
newOwnerProviderFunc = newOwnerProvider
}
if c.Rules.OwnerLookupEnabled {
if newOwnerProviderFunc == nil {
newOwnerProviderFunc = newOwnerProvider
}

c.op, err = newOwnerProviderFunc(logger, kc, !limitsPodScope(filters))
if err != nil {
return nil, err
c.op, err = newOwnerProviderFunc(logger, kc, !shouldWarmCache(filters))
if err != nil {
return nil, err
}
}

labelSelector, fieldSelector, err := selectorsFromFilters(c.Filters)
@@ -209,11 +211,11 @@ func (c *WatchClient) extractPodAttributes(pod *api_v1.Pod) map[string]string {
}
}

if c.Rules.Deployment {
if c.Rules.DeploymentName {
// format: [deployment-name]-[Random-String-For-ReplicaSet]-[Random-String-For-Pod]
parts := c.deploymentRegex.FindStringSubmatch(pod.Name)
if len(parts) == 2 {
tags[c.Rules.Tags.Deployment] = parts[1]
tags[c.Rules.Tags.DeploymentName] = parts[1]
}
}

@@ -239,7 +241,7 @@ func (c *WatchClient) extractPodAttributes(pod *api_v1.Pod) map[string]string {
}
}

if c.Rules.Owners {
if c.Rules.OwnerLookupEnabled {
owners := c.op.GetOwners(pod)

for _, owner := range owners {
@@ -248,7 +250,7 @@ func (c *WatchClient) extractPodAttributes(pod *api_v1.Pod) map[string]string {
if c.Rules.DaemonSetName {
tags[c.Rules.Tags.DaemonSetName] = owner.name
}
case "Deployment":
case "DeploymentName":
// This should be already set earlier
case "ReplicaSet":
if c.Rules.ReplicaSetName {
Expand All @@ -266,6 +268,13 @@ func (c *WatchClient) extractPodAttributes(pod *api_v1.Pod) map[string]string {
// Do nothing
}
}

if c.Rules.NamespaceID {
ns := c.op.GetNamespace(pod.Namespace)
if ns != nil {
tags[c.Rules.Tags.NamespaceID] = string(ns.UID)
}
}
}

if len(pod.Status.ContainerStatuses) > 0 {
@@ -290,13 +299,6 @@ func (c *WatchClient) extractPodAttributes(pod *api_v1.Pod) map[string]string {
tags[c.Rules.Tags.PodID] = string(pod.UID)
}

if c.Rules.NamespaceID {
ns := c.op.GetNamespace(pod.Namespace)
if ns != nil {
tags[c.Rules.Tags.NamespaceID] = string(ns.UID)
}
}

for _, r := range c.Rules.Labels {
if r.Key == "*" {
// Special case, extract everything
@@ -415,9 +417,9 @@ func (c *WatchClient) shouldIgnorePod(pod *api_v1.Pod) bool {
return false
}

// limitsPodScope check if there are filters applied; if this is the case, then pod scope is being
// limited and it is better to not do cache warmup and rely on lazy lookups
func limitsPodScope(filters Filters) bool {
// shouldWarmCache checks whether any filters are applied; if this is the case, the pod scope is being
// limited and it is better to skip the cache warmup and rely on lazy lookups instead
func shouldWarmCache(filters Filters) bool {
if len(filters.Labels) > 0 {
return true
}
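
The owner-lookup path above (the owner provider constructed in `New` and the `OwnerLookupEnabled` branch of
`extractPodAttributes`) only runs when the corresponding flag is set; a minimal config sketch that enables
it, using field names from the README:

```yaml
processors:
  k8s_tagger:
    owner_lookup_enabled: true
    extract:
      metadata:
        - namespaceId
        - daemonSetName
        - replicaSetName
        - serviceName
        - statefulSetName
```
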
49 changes: 24 additions & 25 deletions processor/k8sprocessor/kube/client_test.go
@@ -151,7 +151,8 @@ func TestNoHostnameExtractionRules(t *testing.T) {
}

func TestExtractionRules(t *testing.T) {
c := newTestClientWithRulesAndFilters(t, ExtractionRules{}, Filters{})
// OwnerLookupEnabled is set to true so the newOwnerProviderFunc can be called in the initializer
c := newTestClientWithRulesAndFilters(t, ExtractionRules{OwnerLookupEnabled: true}, Filters{})

pod := &api_v1.Pod{
ObjectMeta: meta_v1.ObjectMeta{
@@ -206,33 +207,33 @@ func TestExtractionRules(t *testing.T) {
}, {
name: "deployment",
rules: ExtractionRules{
Deployment: true,
Tags: NewExtractionFieldTags(),
DeploymentName: true,
Tags: NewExtractionFieldTags(),
},
attributes: map[string]string{
"k8s.deployment.name": "auth-service",
},
}, {
name: "metadata",
rules: ExtractionRules{
ClusterName: true,
ContainerID: true,
ContainerImage: true,
ContainerName: true,
DaemonSetName: true,
Deployment: true,
HostName: true,
Owners: true,
PodID: true,
PodName: true,
ReplicaSetName: true,
ServiceName: true,
StatefulSetName: true,
StartTime: true,
Namespace: true,
NamespaceID: true,
NodeName: true,
Tags: NewExtractionFieldTags(),
ClusterName: true,
ContainerID: true,
ContainerImage: true,
ContainerName: true,
DaemonSetName: true,
DeploymentName: true,
HostName: true,
PodID: true,
PodName: true,
ReplicaSetName: true,
ServiceName: true,
StatefulSetName: true,
StartTime: true,
Namespace: true,
NamespaceID: true,
NodeName: true,
OwnerLookupEnabled: true,
Tags: NewExtractionFieldTags(),
},
attributes: map[string]string{
"k8s.cluster.name": "cluster1",
@@ -257,9 +258,8 @@ func TestExtractionRules(t *testing.T) {
ContainerImage: false,
ContainerName: true,
DaemonSetName: false,
Deployment: false,
DeploymentName: false,
HostName: false,
Owners: false,
PodID: false,
PodName: false,
ReplicaSetName: false,
@@ -498,9 +498,8 @@ func newBenchmarkClient(b *testing.B) *WatchClient {
ContainerImage: true,
ContainerName: true,
DaemonSetName: true,
Deployment: true,
DeploymentName: true,
HostName: true,
Owners: true,
PodID: true,
PodName: true,
ReplicaSetName: true,