Table of Contents
- Network interface to vars table
- Override lab ocpinventory json file
- DU Profile for SNOs
- Post Deployment Tasks
- Updating the OCP version
- Add/delete contents to the bastion registry
- Using other network interfaces
- Configuring NVMe install and etcd disks
Network interface to vars table
Values here reflect the default (Network 1, which maps to controlplane_network_interface_idx: 0). See the Using other network interfaces section below to generate the proper inventory for a different network.
Scale Lab
Hardware | bastion_lab_interface | bastion_controlplane_interface | controlplane_lab_interface |
---|---|---|---|
Dell r650 | eno12399np0 | ens1f0 | eno12399np0 |
Dell r640 | eno1np0 | ens1f0 | eno1np0 |
Dell fc640 | eno1 | eno2 | eno1 |
Supermicro 1029p | eno1 | ens2f0 | eno1 |
Supermicro 5039ms | enp2s0f0 | enp1s0f0 | enp2s0f0 |
Scale lab chart is available here.
Alias Lab
Hardware | bastion_lab_interface | bastion_controlplane_interface | controlplane_lab_interface |
---|---|---|---|
Dell r750 | eno8303 | ens3f0 | eno8303 |
Dell r740xd | eno3 | eno1 | eno1np0 |
Alias lab chart is available here.
Override lab ocpinventory json file
By default, Jetlag selects machines for the bastion, control-plane, and worker roles, in that order, from the ocpinventory.json file. If the automatic selection is incorrect, you can create a new JSON file with the machines ordered to match the desired roles. After creating the new JSON file, host it somewhere the machine running the playbooks can reach, and set the following var so that the modified ocpinventory JSON file is used:
ocp_inventory_override: http://<http-server>/<inventory-file>.json
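As a minimal sketch, assuming you serve the reordered file from the bastion itself (the directory and file names below are placeholders), Python's built-in HTTP server is enough:
# Serve the modified inventory from the directory that contains it
cd /opt/overrides          # hypothetical directory holding my-ocpinventory.json
python3 -m http.server 8000 &
# Then point the override var at it in your extra vars, for example:
# ocp_inventory_override: http://<bastion-ip>:8000/my-ocpinventory.json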
DU Profile for SNOs
Use the var du_profile to apply the DU-specific machine configurations to your SNOs. You must also define reserved_cpus and isolated_cpus when applying the DU profile. Append these vars to the "Extra vars" section of your all.yml or ibmcloud.yml.
Example settings:
du_profile: true
# The reserved and isolated CPU pools must not overlap and together must span all available cores in the worker node.
reserved_cpus: 0-1,40-41
isolated_cpus: 2-39,42-79
As a result, the following machine configuration files will be added to the cluster during SNO install:
- 01-container-mount-ns-and-kubelet-conf-master.yaml
- 03-sctp-machine-config-master.yaml
- 04-accelerated-container-startup-master.yaml
- 05-kdump-config-master (when kdump is enabled)
- 99-crio-disable-wipe-master
- 99-master-workload-partitioning.yml
- enable-crun-master.yaml
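Once the SNO has installed, a quick way to confirm these landed in the cluster is to list the master MachineConfigs (a sketch; the grep pattern simply matches the names listed above):
# List the DU-related MachineConfigs applied to the master pool
oc get machineconfig | grep -E 'container-mount-ns|sctp|accelerated-container-startup|kdump|crio-disable-wipe|workload-partitioning|crun'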
When deploying the DU profile on OCP 4.13 or higher, the composable OpenShift feature is deployed automatically, so unnecessary optional Cluster Operators are not deployed.
In addition, Network Diagnostics is disabled, the monitoring footprint is reduced, and the performance-profile and tunedPerformancePatch are applied post SNO install (based on the input vars defined; see the SNO DU Profile section under Post Deployment Tasks).
Refer to https://github.com/openshift-kni/cnf-features-deploy/tree/master/ztp/source-crs for config details.
About Reserved CPUs
Setting reserved_cpus allows the control plane services to be isolated on a restricted set of CPUs.
You can reserve cores, or threads, for operating system housekeeping tasks from a single NUMA node and put your workloads on another NUMA node. The reason for this is that the housekeeping processes might be using the CPUs in a way that would impact latency sensitive processes running on those same CPUs. Keeping your workloads on a separate NUMA node prevents the processes from interfering with each other. Additionally, each NUMA node has its own memory bus that is not shared.
If you are unsure which CPUs to reserve for housekeeping pods, the general rule is to identify any two processors and their siblings on separate NUMA nodes:
# lscpu -e | head -n1
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ MINMHZ
# lscpu -e | egrep "0:0:0:0|1:1:1:1"
0 0 0 0 0:0:0:0 yes 3900.0000 800.0000
1 1 1 1 1:1:1:1 yes 3900.0000 800.0000
40 0 0 0 0:0:0:0 yes 3900.0000 800.0000
41 1 1 1 1:1:1:1 yes 3900.0000 800.0000
Post Deployment Tasks
Append these vars to the "Extra vars" section of your all.yml or ibmcloud.yml to add a Macvlan Network Attachment Definition. This allows you to add an additional network to pods created in your cluster.
setup_network_attachment_definition: true
net_attach_def_namespace: default
net_attach_def_name: net1
net_attach_def_interface: bond0
net_attach_def_range: 192.168.0.0/16
Modify net_attach_def_interface to the desired host interface on which you want the macvlan network to exist. Modify net_attach_def_range to an IP range that does not conflict with any other test-bed address ranges.
To have a pod attach an interface to the additional network, add the following example metadata annotation:
annotations:
k8s.v1.cni.cncf.io/networks: '[{"name": "net1", "namespace": "default"}]'
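A sketch of how to check the attachment, assuming a pod named test-pod carries the annotation above and the defaults from the vars block (namespace default, network net1):
# Confirm the NetworkAttachmentDefinition exists
oc get network-attachment-definitions -n default
# Inspect the pod's interfaces; an extra interface with an address from
# net_attach_def_range (192.168.0.0/16 above) should be present
oc exec -n default test-pod -- ip -o addr show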
SNO DU Profile
The following vars are relevant to performance profile creation post SNO install:
# Required vars
du_profile: true
# The reserved and isolated CPU pools must not overlap and together must span all available cores in the worker node.
reserved_cpus: 0-1,48-49
isolated_cpus: 2-47,50-95
# Optional vars
# Number of hugepages of size 1G to be allocated on the SNO
hugepages_count: 16
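After the post-install play runs, you can sanity-check the result on the node. A sketch, assuming $NODE holds the SNO node name and hugepages_count was set as above:
# Verify the generated PerformanceProfile exists
oc get performanceprofile
# Check the number of 1G hugepages allocated on the node
oc debug node/$NODE -- chroot /host cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages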
After the performance-profile is applied, the standard TunedPerformancePatch used for SNO DUs is also applied post SNO install if the DU profile is enabled. This patch disables the chronyd service, enables stalld, and changes the FIFO priority of ice-ptp processes to 10. Further changes can be found in the template tunedPerformancePatch.yml.j2 under the sno-post-cluster-install templates.
The Performance Addon Operator must be installed to use performance-profile on versions older than OCP 4.11.
Append these vars to the "Extra vars" section of your all.yml or ibmcloud.yml to install the Performance Addon Operator, which allows for low-latency node performance tunings on your OCP 4.9 or 4.10 SNO.
install_performance_addon_operator: true
Please Note
- The Performance Addon Operator is not available in OCP 4.11 or higher; its code was moved into the Node Tuning Operator in OCP 4.11.
Updating the OCP version
Versions are controlled by the release image. If you want to change images:
Modify the vars file to update the release image path with ocp_release_image and the OpenShift version with openshift_version.
Example:
ocp_release_image: registry.ci.openshift.org/ocp/release:4.10.0-0.nightly-2022-01-18-044014
openshift_version: "4.10"
Ensure that your pull secrets are still valid.
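One way to check both the release image and your pull secret at the same time is to query the release image; a sketch, using the example image above (substitute the one you set):
# Inspect the release image using the repo's pull secret; failures usually mean expired credentials
oc adm release info -a pull_secret.txt registry.ci.openshift.org/ocp/release:4.10.0-0.nightly-2022-01-18-044014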
When working with OCP development builds/nightly releases, it might be required to update your pull secret with fresh registry.ci.openshift.org credentials, as they are bound to expire after a definite period. Follow these steps to update your pull secret:
- Log in to https://console-openshift-console.apps.ci.l2s4.p1.openshiftapps.com/ with your GitHub ID. You must be a member of the OpenShift org to do this.
- Select Copy login command from the drop-down list under your account name
- Copy the oc login command and run it on your terminal
- Execute the command shown below to print out the pull secret:
(.ansible) [root@<bastion> jetlag]# oc registry login --to=-
- Append or update the pull secret retrieved above in pull_secret.txt in the repo base directory (see the sketch below for one way to merge the credentials).
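A sketch of one way to merge the new registry.ci.openshift.org credentials into the existing file with jq (file names follow this repo's convention; back up the original first):
# Capture the fresh pull secret printed by `oc registry login`
oc registry login --to=new-secret.json
# Merge its auths into the existing pull_secret.txt and replace the file
jq -s '{auths: (.[0].auths + .[1].auths)}' pull_secret.txt new-secret.json > pull_secret.merged.json
cp pull_secret.txt pull_secret.txt.bak && mv pull_secret.merged.json pull_secret.txt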
You must stop and remove all assisted-installer containers and pods on the bastion, then rerun the setup-bastion step so that the bastion's assisted-installer is set up with the version you specified, before deploying a fresh cluster with that version.
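A minimal sketch of that cleanup, assuming the assisted-installer components were started as podman pods on the bastion (the pod name below is an assumption; substitute whatever `podman pod ps` shows):
# List any assisted-installer related pods/containers still running on the bastion
podman pod ps
podman ps -a
# Stop and remove them
podman pod stop assisted-installer && podman pod rm assisted-installer
# Then rerun the setup-bastion step, e.g.:
# ansible-playbook -i ansible/inventory/<lab>.local ansible/setup-bastion.yml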
Add/delete contents to the bastion registry
There are use cases where you may want to add images to, or delete images from, the bastion registry. For example, in a single-stack IPv6 disconnected deployment, the deployment cannot reach quay.io to get the images for your containers. In this situation, you may use the ICSP (ImageContentSourcePolicy) mechanism in conjunction with image mirroring. When the deployment requests an image on quay.io, cri-o will intercept the request and redirect it to an image on the bastion/mirror registry. For example, this policy will map images on quay.io/XXX/client-server to the mirror registry on perf176b, the bastion of this IPv6 disconnected cluster.
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: crucible-repo
spec:
  repositoryDigestMirrors:
  - mirrors:
    - perf176b.xxx.com:5000/XXX/client-server
    source: quay.io/XXX/client-server
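To put the policy in place, save it to a file and apply it; a sketch, assuming the manifest above is saved as icsp.yaml (hypothetical file name):
# Apply the ImageContentSourcePolicy and confirm it was created
oc apply -f icsp.yaml
oc get imagecontentsourcepolicy crucible-repo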
For on-demand mirroring, the following command, run on the bastion, mirrors the image from quay.io to perf176b's disconnected registry.
(.ansible) [root@<bastion> jetlag]# oc image mirror -a /opt/registry/pull-secret-bastion.txt quay.io/XXX/client-server:<tag> perf176b.xxx.com:5000/XXX/client-server:<tag> --keep-manifest-list --continue-on-error=true
Once the image has been successfully mirrored to the disconnected registry, your deployment will be able to create the container.
For image deletion, use the Docker V2 REST API to delete the object. Note that the deletion operation's argument has to be the image's digest, not its tag. So if you mirrored your image by tag in the previous step, you have to resolve the tag to its digest before deleting. The following convenience script deletes an image by tag.
### script
#!/bin/bash
# Delete an image from the disconnected registry by tag (resolves the tag to its digest first).
registry='[fc00:1000::1]:5000'   # IPv6 address and port of the perf176b disconnected registry
name='XXX/client-server'
auth='-u username:passwd'

function rm_XXX_tag {
  ltag=$1
  curl $auth -X DELETE -sI -k "https://${registry}/v2/${name}/manifests/$(
    curl $auth -sI -k \
      -H "Accept: application/vnd.oci.image.manifest.v1+json" \
      "https://${registry}/v2/${name}/manifests/${ltag}" \
    | tr -d '\r' | sed -En 's/^Docker-Content-Digest: (.*)/\1/pi'
  )"
}

rm_XXX_tag "$1"
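A quick way to confirm the deletion, assuming the same registry, repository name, and credentials as in the script above, is to list the remaining tags through the Docker V2 API:
# Example invocation, assuming the script above was saved as rm-image.sh (hypothetical name)
bash rm-image.sh <tag>
# List the remaining tags for the repository to confirm the deletion
curl -u username:passwd -k "https://[fc00:1000::1]:5000/v2/XXX/client-server/tags/list"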
Using other network interfaces
If you want to use a NIC other than the default, you need to override the controlplane_network_interface_idx variable in the Extra vars section of ansible/vars/all.yml.
This example shows using NIC ens2f0 in a cluster of r650 nodes.
- Select which NIC you want to use instead of the default; in this example, ens2f0.
- Look for your server model number in your lab's wiki page, then select the network you want configured as your primary network using the following mapping:
Network | YAML variable |
---|---|
Network 1 | controlplane_network_interface_idx: 0 |
Network 2 | controlplane_network_interface_idx: 1 |
Network 3 | controlplane_network_interface_idx: 2 |
Network 4 | controlplane_network_interface_idx: 3 |
Network 5 | controlplane_network_interface_idx: 4 |
- Since the desired NIC in this example, ens2f0, is listed under the column "Network 3", the value 2 is correct.
- Set 2 as the value of the variable controlplane_network_interface_idx in ansible/vars/all.yml.
################################################################################
# Extra vars
################################################################################
# Append override vars below
controlplane_network_interface_idx: 2
In case you are bringing your own lab, set controlplane_network_interface to the desired name, e.g. controlplane_network_interface: ens2f0.
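If you are unsure of the exact interface name on your own hardware, listing the links on one of the cluster machines is a quick check (a sketch):
# Print the NIC names known to the kernel on this host
ip -o link show | awk -F': ' '{print $2}'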
Configuring NVMe install and etcd disks
If you require the install disk or etcd disk to be on a specific drive, they can be specified directly through the vars file all.yml. To ensure the drive is correctly mapped at each boot, locate the /dev/disk/by-path link to each drive.
# Locate names of the drives identified on your system
$ lsblk | grep nvme
nvme3n1 259:0 0 1.5T 0 disk
nvme2n1 259:1 0 1.5T 0 disk
nvme1n1 259:2 0 1.5T 0 disk
nvme0n1 259:3 0 1.5T 0 disk
# Find the corresponding disk/by-path link
$ ls -l /dev/disk/by-path/ | grep nvme
lrwxrwxrwx. 1 root root 13 Aug 21 17:34 pci-0000:b1:00.0-nvme-1 -> ../../nvme0n1
lrwxrwxrwx. 1 root root 13 Aug 21 17:34 pci-0000:b2:00.0-nvme-1 -> ../../nvme1n1
lrwxrwxrwx. 1 root root 13 Aug 21 17:34 pci-0000:b3:00.0-nvme-1 -> ../../nvme2n1
lrwxrwxrwx. 1 root root 13 Aug 21 17:34 pci-0000:b4:00.0-nvme-1 -> ../../nvme3n1
Add these values to the "Extra vars" section of the all.yml file. In this case we are installing on all NVMe drives and have configured our hosts for UEFI boot.
ansible/vars/all.yml
################################################################################
# Extra vars
################################################################################
# Install disks
# sno_install_disk: /dev/disk/by-path/...
control_plane_install_disk: /dev/disk/by-path/pci-0000:b1:00.0-nvme-1
worker_install_disk: /dev/disk/by-path/pci-0000:b1:00.0-nvme-1
# Control plane etcd deployed on NVMe
controlplane_nvme_device: /dev/disk/by-path/pci-0000:b2:00.0-nvme-1
controlplane_etcd_on_nvme: true
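After the cluster is up, you can sanity-check that etcd landed on the NVMe device; a sketch, assuming the etcd-on-NVMe setup mounts the device at /var/lib/etcd and $NODE holds a control-plane node name:
# Show which device backs the etcd data directory
oc debug node/$NODE -- chroot /host findmnt /var/lib/etcd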
Note: The values seen in /dev/disk/by-path may differ between RHEL8 and RHEL9. If your OpenShift version is based on RHEL9 (4.13+), you should install RHEL9 on the nodes first to ensure the paths are correct. For example, /dev/sda as seen on a Supermicro 1029U:
RHEL8:
lrwxrwxrwx. 1 root root 9 Feb 5 19:22 pci-0000:00:11.5-ata-1 -> ../../sda
RHEL9:
lrwxrwxrwx. 1 root root 9 Feb 5 19:22 pci-0000:00:11.5-ata-1 -> ../../sda
lrwxrwxrwx. 1 root root 9 Feb 5 19:22 pci-0000:00:11.5-ata-1.0 -> ../../sda <---- Use this one
Note: For bare-metal deployments of OCP 4.13 or greater, it is advisable to set the extra vars to by-path references for the installation disks. Below are the extra vars along with the hardware used.
Hardware | control_plane_install_disk | worker_install_disk |
---|---|---|
Dell r650 | /dev/disk/by-path/pci-0000:67:00.0-scsi-0:2:0:0 | /dev/disk/by-path/pci-0000:67:00.0-scsi-0:2:0:0 |
Dell r640 | /dev/disk/by-path/pci-0000:18:00.0-scsi-0:2:0:0 | /dev/disk/by-path/pci-0000:18:00.0-scsi-0:2:0:0 |
To find your machine's by-path reference, use the following command and choose the install disk. (Note: this assumes the bastion's hardware configuration is identical to the cluster nodes'; in a heterogeneous cluster you may need to execute this command on each host in your deployment, setting the control_plane_install_disk and worker_install_disk paths manually for each host in the inventory file.)
(.ansible) [root@<bastion> jetlag]# ls -la /dev/disk/by-path/
total 0
drwxr-xr-x. 2 root root 160 Apr 11 19:40 .
drwxr-xr-x. 6 root root 120 Apr 11 19:40 ..
lrwxrwxrwx. 1 root root 9 Apr 11 19:40 pci-0000:18:00.0-scsi-0:2:0:0 -> ../../sda
lrwxrwxrwx. 1 root root 10 Apr 11 19:40 pci-0000:18:00.0-scsi-0:2:0:0-part1 -> ../../sda1
lrwxrwxrwx. 1 root root 10 Apr 11 19:40 pci-0000:18:00.0-scsi-0:2:0:0-part2 -> ../../sda2
lrwxrwxrwx. 1 root root 9 Apr 11 19:40 pci-0000:18:00.0-scsi-0:2:1:0 -> ../../sdb
lrwxrwxrwx. 1 root root 9 Apr 11 19:40 pci-0000:18:00.0-scsi-0:2:2:0 -> ../../sdc
lrwxrwxrwx. 1 root root 13 Apr 11 19:40 pci-0000:d8:00.0-nvme-1 -> ../../nvme0n1