e2e: add irdma to module_blacklist kernel args
This is a workaround for CI failures on hardware with an Intel E810 network card.
When UserLevelNetworking is set to true, tuned tries to set the NIC's combined channel count to the number of reserved CPUs, but fails with the following error:
tuned.utils.commands: Executing 'ethtool -L ens2f0 combined 1' error: netlink error: Device or resource busy
The ice driver rejects the change with the message: ens2f0: Cannot change channels when RDMA is active.
As a result, the tuned profile becomes degraded.
As a temporary workaround, add 'module_blacklist=irdma' to the kernel arguments so that the irdma module is not loaded and RDMA stays inactive, which avoids these errors.
Reference: OCPBUGS-46426

Signed-off-by: Ronny Baturov <[email protected]>
rbaturov committed Dec 15, 2024
1 parent da3ed75 commit 905b130
Showing 1 changed file with 13 additions and 0 deletions.
13 changes: 13 additions & 0 deletions test/e2e/performanceprofile/functests/0_config/config.go
@@ -154,6 +154,18 @@ func testProfile() (*performancev2.PerformanceProfile, error) {

	hugePagesSize := performancev2.HugePageSize("1G")

	// Workaround for CI failures on hardware with an Intel E810 network card.
	// When UserLevelNetworking is set to true, tuned tries to set the NIC's combined channel count to the number of reserved CPUs, but fails with the following error:
	// tuned.utils.commands: Executing 'ethtool -L ens2f0 combined 1' error: netlink error: Device or resource busy
	// The ice driver rejects the change with the message: ens2f0: Cannot change channels when RDMA is active.
	// As a result, the tuned profile becomes degraded.
	// As a temporary workaround, 'module_blacklist=irdma' is added to the kernel arguments so that the irdma module is not loaded and RDMA stays inactive, which avoids these errors.
	// Reference: OCPBUGS-46426

	additionalKernelArgs := []string{
		"module_blacklist=irdma",
	}

	profile := &performancev2.PerformanceProfile{
		TypeMeta: metav1.TypeMeta{
			Kind: "PerformanceProfile",
@@ -193,6 +205,7 @@ func testProfile() (*performancev2.PerformanceProfile, error) {
				HighPowerConsumption: pointer.Bool(false),
				PerPodPowerManagement: pointer.Bool(false),
			},
			AdditionalKernelArgs: additionalKernelArgs,
		},
	}
	// If the machineConfigPool is master, the automatic selector from PAO won't work
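
The workaround depends on 'module_blacklist=irdma' actually reaching the node's kernel command line. A minimal sketch of how that could be checked is shown below. The helper isModuleBlacklisted and the sample command line are hypothetical and are not part of this commit; the sketch assumes only that module_blacklist takes a comma-separated list of module names, and on a real node the string would be read from /proc/cmdline.

```go
package main

import (
	"fmt"
	"strings"
)

// isModuleBlacklisted reports whether the given kernel command line contains a
// module_blacklist= argument that lists the given module.
// Hypothetical helper for illustration only; it is not part of the commit.
func isModuleBlacklisted(cmdline, module string) bool {
	const prefix = "module_blacklist="
	for _, arg := range strings.Fields(cmdline) {
		if !strings.HasPrefix(arg, prefix) {
			continue
		}
		// module_blacklist accepts a comma-separated list of module names.
		for _, name := range strings.Split(strings.TrimPrefix(arg, prefix), ",") {
			if name == module {
				return true
			}
		}
	}
	return false
}

func main() {
	// Sample command line (an assumption for this sketch).
	cmdline := "BOOT_IMAGE=/vmlinuz root=/dev/sda1 module_blacklist=irdma"
	fmt.Println(isModuleBlacklisted(cmdline, "irdma")) // true
	fmt.Println(isModuleBlacklisted(cmdline, "ice"))   // false
}
```

Matching whole names after splitting on commas avoids false positives from substring matches, for example a longer module name that merely contains "irdma".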