adding node feature discovery operator
Andrew Sheet committed Mar 26, 2024
1 parent f77192a commit f4d61ed
Showing 19 changed files with 367 additions and 0 deletions.
16 changes: 16 additions & 0 deletions components/operators/nfd/INFO.md
@@ -0,0 +1,16 @@
# nfd

The Node Feature Discovery (NFD) Operator manages the detection of hardware features and configuration in a Kubernetes cluster by labeling nodes with hardware-specific information. NFD labels each host with node-specific attributes such as PCI cards, kernel version, and OS version, among others.

The NFD Operator is based on the Operator Framework, an open source toolkit for managing Kubernetes-native applications, called Operators, in an effective, automated, and scalable way.

NFD consists of the following software components:

## NFD-Master
NFD-Master is the daemon responsible for communicating with the Kubernetes API. It receives labeling requests from the workers and modifies node objects accordingly.

## NFD-Worker
NFD-Worker is the daemon responsible for feature detection. It communicates the detected features to nfd-master, which does the actual node labeling. One instance of nfd-worker should run on each node of the cluster.

## NFD-Topology-Updater
NFD-Topology-Updater is the daemon responsible for examining allocated resources on a worker node to account for the resources still available to new pods on a per-zone basis (where a zone can be a NUMA node). It communicates this information to nfd-master, which creates a NodeResourceTopology CR for each node in the cluster. One instance of nfd-topology-updater should run on each node of the cluster.
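
To make the labeling concrete, here is a sketch of what NFD-published labels can look like on a node. The node name and the label values are assumptions for illustration only; the keys use the `feature.node.kubernetes.io/` prefix, and the set you actually see depends on the hardware and the worker configuration.

```yaml
# Illustrative excerpt of a Node object after NFD has labeled it.
# The node name and all values are assumptions, not output from a real cluster.
apiVersion: v1
kind: Node
metadata:
  name: worker-0.example.com   # hypothetical node name
  labels:
    # A PCI device from vendor ID 10de was detected in a whitelisted device class
    feature.node.kubernetes.io/pci-10de.present: "true"
    # Kernel and OS attributes discovered on the host
    feature.node.kubernetes.io/kernel-version.major: "5"
    feature.node.kubernetes.io/system-os_release.ID: rhcos
```

Labels like these can then be used in a `nodeSelector` or node affinity rule to place workloads on nodes with the required hardware.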
32 changes: 32 additions & 0 deletions components/operators/nfd/README.md
@@ -0,0 +1,32 @@
# Node Feature Discovery Operator

Installs the Node Feature Discovery Operator.

Do not use the `base` directory directly, as you will need to patch the `channel` based on the version of OpenShift you are using, or the version of the operator you want to use.

The current *overlays* available are for the following channels:

* [stable](operator/overlays/stable)

## Usage

If you have cloned the `gitops-catalog` repository, you can install the Node Feature Discovery Operator with the overlay of your choice by running the following from the root (`gitops-catalog`) directory:

```
oc apply -k nfd/operator/overlays/<channel>
```

Or, without cloning:

```
oc apply -k https://github.com/redhat-cop/gitops-catalog/nfd/operator/overlays/<channel>
```

As part of a different overlay in your own GitOps repo:

```
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- https://github.com/redhat-cop/gitops-catalog/nfd/operator/overlays/<channel>?ref=main
```
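
If your own overlay needs to change more than the channel, one option is an inline JSON patch against the `Subscription`. The following is a minimal sketch, assuming the `stable` overlay and assuming you want install plans to be approved manually; the choice of `Manual` is only an example and not something this catalog prescribes.

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - https://github.com/redhat-cop/gitops-catalog/nfd/operator/overlays/stable?ref=main

patches:
  # Targets the Subscription named `nfd` that the operator base defines
  - target:
      kind: Subscription
      name: nfd
    patch: |-
      # Assumption for illustration: switch from Automatic to Manual approval
      - op: replace
        path: /spec/installPlanApproval
        value: Manual
```

Keeping the patch inline avoids a separate patch file when the change is only a line or two.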
@@ -0,0 +1,11 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

commonAnnotations:
  argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true

namespace: openshift-nfd

resources:
  - ../../../operator/overlays/stable
  - ../../../instance/overlays/default
@@ -0,0 +1,11 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

commonAnnotations:
  argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true

namespace: openshift-nfd

resources:
  - ../../../operator/overlays/stable
  - ../../../instance/overlays/kata
@@ -0,0 +1,11 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

commonAnnotations:
  argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true

namespace: openshift-nfd

resources:
  - ../../../operator/overlays/stable
  - ../../../instance/overlays/only-nvidia
42 changes: 42 additions & 0 deletions components/operators/nfd/instance/README.md
@@ -0,0 +1,42 @@
# OpenShift Node Feature Discovery (NFD)

Installs a basic `NodeFeatureDiscovery` instance.

## Prerequisites

First, install the [OpenShift NFD Operator](../operator) in your cluster.

Do not use the `base` directory directly, as you will need to patch the `channel` based on the version of OpenShift you are using, or the version of the operator you want to use.

## Overlays

The options for this operator are the following *overlays*:
* [default](overlays/default)

### Default

[default](overlays/default) configures a basic default configuration for a nodeFeatureDiscovery instance. For more details on customizing the NFD workers, refer to the [docs](https://kubernetes-sigs.github.io/node-feature-discovery/v0.10/advanced/worker-configuration-reference.html).
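
As a sketch of such a customization, an overlay of your own could replace the worker configuration with a JSON patch. This assumes the overlay sits alongside `overlays/default` and that the base instance is named `nfd-instance`; the `120s` interval and the narrowed PCI settings are illustrative values, not recommendations.

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base   # assumes this file lives in instance/overlays/<your-overlay>

patches:
  - target:
      group: nfd.openshift.io
      kind: NodeFeatureDiscovery
      name: nfd-instance
    patch: |-
      # Replaces the whole worker config string; values below are examples only
      - op: replace
        path: /spec/workerConfig/configData
        value: |
          core:
            sleepInterval: 120s
          sources:
            pci:
              deviceClassWhitelist:
                - "03"        # PCI class 03: display controllers
              deviceLabelFields:
                - "vendor"
                - "device"
```

Because `configData` is a single string, the patch replaces it as a whole rather than merging individual keys.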

## Usage

If you have cloned the `gitops-catalog` repository, you can install the `NodeFeatureDiscovery` instance by running the following from the root `gitops-catalog` directory:

```
oc apply -k nfd/instance/overlays/default
```

Or, without cloning:

```
oc apply -k https://github.com/redhat-cop/gitops-catalog/nfd/instance/overlays/default
```

As part of a different overlay in your own GitOps repo:

```
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- https://github.com/redhat-cop/gitops-catalog/nfd/instance/overlays/default?ref=main
```
7 changes: 7 additions & 0 deletions components/operators/nfd/instance/base/kustomization.yaml
@@ -0,0 +1,7 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: openshift-nfd

resources:
- node-feature-discovery.yaml
127 changes: 127 additions & 0 deletions components/operators/nfd/instance/base/node-feature-discovery.yaml
@@ -0,0 +1,127 @@
kind: NodeFeatureDiscovery
apiVersion: nfd.openshift.io/v1
metadata:
  name: nfd-instance
spec:
  customConfig:
    configData: |
      #    - name: "more.kernel.features"
      #      matchOn:
      #      - loadedKMod: ["example_kmod3"]
      #    - name: "more.features.by.nodename"
      #      value: customValue
      #      matchOn:
      #      - nodename: ["special-.*-node-.*"]
  operand:
    # bug: an image has to be defined otherwise the deployment fails
    # bug: this behavior recently changed
    image: registry.redhat.io/openshift4/ose-node-feature-discovery:latest
    servicePort: 12000
  workerConfig:
    configData: |
      core:
      #  labelWhiteList:
      #  noPublish: false
        sleepInterval: 60s
      #  sources: [all]
      #  klog:
      #    addDirHeader: false
      #    alsologtostderr: false
      #    logBacktraceAt:
      #    logtostderr: true
      #    skipHeaders: false
      #    stderrthreshold: 2
      #    v: 0
      #    vmodule:
      ##   NOTE: the following options are not dynamically run-time
      ##          configurable and require a nfd-worker restart to take effect
      ##          after being changed
      #    logDir:
      #    logFile:
      #    logFileMaxSize: 1800
      #    skipLogHeaders: false
      sources:
      #  cpu:
      #    cpuid:
      ##     NOTE: whitelist has priority over blacklist
      #      attributeBlacklist:
      #        - "BMI1"
      #        - "BMI2"
      #        - "CLMUL"
      #        - "CMOV"
      #        - "CX16"
      #        - "ERMS"
      #        - "F16C"
      #        - "HTT"
      #        - "LZCNT"
      #        - "MMX"
      #        - "MMXEXT"
      #        - "NX"
      #        - "POPCNT"
      #        - "RDRAND"
      #        - "RDSEED"
      #        - "RDTSCP"
      #        - "SGX"
      #        - "SSE"
      #        - "SSE2"
      #        - "SSE3"
      #        - "SSE4.1"
      #        - "SSE4.2"
      #        - "SSSE3"
      #      attributeWhitelist:
      #  kernel:
      #    kconfigFile: "/path/to/kconfig"
      #    configOpts:
      #      - "NO_HZ"
      #      - "X86"
      #      - "DMI"
        pci:
          deviceClassWhitelist:
            - "0200"
            - "03"
            - "12"
          deviceLabelFields:
      #      - "class"
            - "vendor"
      #      - "device"
      #      - "subsystem_vendor"
      #      - "subsystem_device"
      #  usb:
      #    deviceClassWhitelist:
      #      - "0e"
      #      - "ef"
      #      - "fe"
      #      - "ff"
      #    deviceLabelFields:
      #      - "class"
      #      - "vendor"
      #      - "device"
      #  custom:
      #    - name: "my.kernel.feature"
      #      matchOn:
      #        - loadedKMod: ["example_kmod1", "example_kmod2"]
      #    - name: "my.pci.feature"
      #      matchOn:
      #        - pciId:
      #            class: ["0200"]
      #            vendor: ["15b3"]
      #            device: ["1014", "1017"]
      #        - pciId :
      #            vendor: ["8086"]
      #            device: ["1000", "1100"]
      #    - name: "my.usb.feature"
      #      matchOn:
      #        - usbId:
      #          class: ["ff"]
      #          vendor: ["03e7"]
      #          device: ["2485"]
      #        - usbId:
      #          class: ["fe"]
      #          vendor: ["1a6e"]
      #          device: ["089a"]
      #    - name: "my.combined.feature"
      #      matchOn:
      #        - pciId:
      #            vendor: ["15b3"]
      #            device: ["1014", "1017"]
      #          loadedKMod : ["vendor_kmod1", "vendor_kmod2"]

Check failure on line 127 in components/operators/nfd/instance/base/node-feature-discovery.yaml

GitHub Actions / lint-yaml

127:63 [new-line-at-end-of-file] no new line character at the end of file
@@ -0,0 +1,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
- ../../base
@@ -0,0 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
- ../../base
- nfd.yaml
20 changes: 20 additions & 0 deletions components/operators/nfd/instance/overlays/kata/nfd.yaml
@@ -0,0 +1,20 @@
apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-kata
  namespace: openshift-nfd
spec:
  operand:
    image: quay.io/openshift/origin-node-feature-discovery:4.12
    imagePullPolicy: Always
    servicePort: 12000
  workerConfig:
    configData: |
      sources:
        custom:
          - name: "feature.node.kubernetes.io/runtime.kata"
            matchOn:
              - cpuId: ["SSE4", "VMX"]
                loadedKMod: ["kvm", "kvm_intel"]
              - cpuId: ["SSE4", "SVM"]
                loadedKMod: ["kvm", "kvm_amd"]
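
A workload could then be steered to nodes that match this custom rule. The following is a sketch only: it assumes nfd-worker publishes the feature under the label key given above with the value `"true"`, and the Pod name and image are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kata-capable-example                    # hypothetical name
spec:
  # Schedule only onto nodes where the custom kata rule matched (assumed label value)
  nodeSelector:
    feature.node.kubernetes.io/runtime.kata: "true"
  containers:
    - name: app
      image: registry.example.com/app:latest    # hypothetical image
```

Checking the labels on a node first confirms the exact key and value before relying on the selector.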
@@ -0,0 +1,11 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base

patches:
  - target:
      group: nfd.openshift.io
      kind: NodeFeatureDiscovery
    path: patch-node-feature-discovery.yaml
@@ -0,0 +1,20 @@
- op: add
  path: /spec
  value:
    instance: ''
    operand:
      image: registry.redhat.io/openshift4/ose-node-feature-discovery:latest
      servicePort: 12000
    topologyUpdater: false
    workerConfig:
      configData: |
        core:
          sleepInterval: 60s
        sources:
          pci:
            deviceClassWhitelist:
              - "0200"
              - "03"
              - "12"
            deviceLabelFields:
              - "vendor"
7 changes: 7 additions & 0 deletions components/operators/nfd/operator/base/kustomization.yaml
@@ -0,0 +1,7 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
- namespace.yaml
- operator-group.yaml
- subscription.yaml
8 changes: 8 additions & 0 deletions components/operators/nfd/operator/base/namespace.yaml
@@ -0,0 +1,8 @@
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    openshift.io/display-name: "Node Feature Discovery Operator"
  labels:
    openshift.io/cluster-monitoring: 'true'
  name: openshift-nfd
8 changes: 8 additions & 0 deletions components/operators/nfd/operator/base/operator-group.yaml
@@ -0,0 +1,8 @@
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: nfd
  namespace: openshift-nfd
spec:
  targetNamespaces:
    - openshift-nfd
11 changes: 11 additions & 0 deletions components/operators/nfd/operator/base/subscription.yaml
@@ -0,0 +1,11 @@
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: nfd
  namespace: openshift-nfd
spec:
  channel: patch-me-see-overlays-dir
  installPlanApproval: Automatic
  name: nfd
  source: redhat-operators
  sourceNamespace: openshift-marketplace
@@ -0,0 +1,11 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base

patches:
  - target:
      kind: Subscription
      name: nfd
    path: patch-channel.yaml
@@ -0,0 +1,3 @@
- op: replace
  path: /spec/channel
  value: stable
