[StatefulSet/Parallel] Virtual Functions Issues When Creating/Destroying Pods in Parallel. Switching devices #668

midnattsol · 2024-03-27T11:28:41Z

Environment

I'm trying to create a statefulset assigning 2 VF associated with 2 PF Infiniband interfaces per pod.
I'm creating 8 pods per server.
Every server has 16 infiniband interfaces, and I create 1VF per interface.

Problem Description
When I create the statefulset in parallel, or when it terminates (all the pods terminates at the same time), randomly some interfaces switch the PCI where it points.

So when I check the the device associated in the host they are totally messed

# cat /sys/class/infiniband_verbs/uverbs{19,20,21,22,23,24}/ibdev
mlx5_24
mlx5_23
mlx5_21
mlx5_22
mlx5_20
mlx5_19

The uverbs19 is pointing to the interface 24
The uverbs20 is pointing to the interface 23
The uverbs23 is pointing to the interface 20
The uverbs24 is pointing to the interface 19

So the pods cannot recognize the mlx interfaces to use them with UCX.

Workaround so far
Creating the cluster sequentially and scaling to 0 before terminating the statefulset helps, because the race condition is not triggered, but I guess is not the expected behaviour.

The text was updated successfully, but these errors were encountered:

midnattsol · 2024-03-27T11:29:32Z

I've seen this PR that modifies the behaviour for the switchdev
#643

Could this potentially help with the problem?

SchSeba · 2024-03-31T14:23:52Z

so there is a global lock in the ib-sriov-cni that should prevent this one

@e0ne @ykulazhenkov is this something you will be able to take a look?

adrianchiris · 2024-05-01T11:11:54Z

Hi,

can you provide the SriovIbNetwork you defined as well as the SriovPolicy ?

is the problem only that RDMA device changes (i.e mlx5_19 gets recreated/renamed to mlx5_24 after pod was deleted) ?
when the new pod starts does it have the correct mounts ? and UCX is unable to cope with RDMA device mlx5_24 having ULPs with different index (e.g uverbs19) ?

SchSeba · 2024-08-19T13:53:28Z

Hi @midnattsol any update on this issue or we can close it?

midnattsol · 2024-08-19T15:55:32Z

Hello @SchSeba I've been reviewing it, and I think the issue can come for another side, because there're no rules for the VF in the device manager, so in some situation, this is triggered. Does it make sense? If this is the case I think it could be closed.

SchSeba · 2024-08-20T20:48:45Z

I am sorry but I didn't follow your last comment can you please elaborate?
do you still see the issue? there is something we can help?

midnattsol · 2024-09-23T15:11:50Z

Hello @SchSeba ,
Let me try to provide more context.

We are using the network-operator from NVIDIA, which leverages the sriov-network-device-plugin from the k8snetworkplumbingwg project.

What we are observing is that when using SR-IOV, the Linux udev service removes the virtual functions from the host and exposes them in the pods. Once these pods are destroyed, the host's device manager adds the virtual functions back.

The issue seems to occur during these resource movements. What I meant in my previous comment is that I don’t see any kind of udev rule generated for the virtual functions on the host to maintain consistent naming. I assume that, for some reason, the kernel might assign a new name to those resources if the udev rules are not created beforehand, leading to inconsistent behavior. However, I'm not entirely sure and it's something I still need to test.

Does that make sense?

midnattsol changed the title ~~[StatefulSet/Parallel] Virtual Functions Compatibility Issues When Creating/Destroying Pods in Parallel~~ [StatefulSet/Parallel] Virtual Functions Issues When Creating/Destroying Pods in Parallel. Switching devices Mar 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[StatefulSet/Parallel] Virtual Functions Issues When Creating/Destroying Pods in Parallel. Switching devices #668

[StatefulSet/Parallel] Virtual Functions Issues When Creating/Destroying Pods in Parallel. Switching devices #668

midnattsol commented Mar 27, 2024

midnattsol commented Mar 27, 2024

SchSeba commented Mar 31, 2024

adrianchiris commented May 1, 2024

SchSeba commented Aug 19, 2024

midnattsol commented Aug 19, 2024

SchSeba commented Aug 20, 2024

midnattsol commented Sep 23, 2024 •

edited

Loading

[StatefulSet/Parallel] Virtual Functions Issues When Creating/Destroying Pods in Parallel. Switching devices #668

[StatefulSet/Parallel] Virtual Functions Issues When Creating/Destroying Pods in Parallel. Switching devices #668

Comments

midnattsol commented Mar 27, 2024

midnattsol commented Mar 27, 2024

SchSeba commented Mar 31, 2024

adrianchiris commented May 1, 2024

SchSeba commented Aug 19, 2024

midnattsol commented Aug 19, 2024

SchSeba commented Aug 20, 2024

midnattsol commented Sep 23, 2024 • edited Loading

midnattsol commented Sep 23, 2024 •

edited

Loading