Do you support the ConnectX 6 network interface card? Why are the resource Capacity and Allocatable values 0 in my k8s cluster? #97

Open
sober-wang opened this issue Jan 19, 2024 · 3 comments

@sober-wang

Do you support the ConnectX 6 network interface card?

@e0ne
Collaborator

e0ne commented Jan 19, 2024

> Do you support the ConnectX 6 network interface card?

Hi @sober-wang. ConnectX 6 is a supported NIC.

@sober-wang
Author

sober-wang commented Jan 22, 2024

> > Do you support the ConnectX 6 network interface card?
>
> Hi @sober-wang. ConnectX 6 is a supported NIC.

But in my k8s node description, the resource's Capacity and Allocatable values are both 0.

image
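
For reference, the same symptom can be checked from the node object rather than a screenshot. The following is a minimal sketch, assuming the output of `kubectl get node <node-name> -o json` has been loaded into a dict. The function name `zero_resources`, the `rdma/` prefix, and the resource name `rdma/hca_shared_devices_a` are illustrative assumptions, since the actual plugin configuration is only available as an image.

```python
def zero_resources(node: dict, prefix: str = "rdma/") -> dict:
    """Return {resource: (capacity, allocatable)} for resources whose name
    starts with `prefix` and whose capacity or allocatable quantity is "0".

    `node` is the parsed JSON of a Kubernetes Node object; quantities in
    status.capacity and status.allocatable are strings.
    """
    status = node.get("status", {})
    capacity = status.get("capacity", {})
    allocatable = status.get("allocatable", {})

    result = {}
    for name in sorted(set(capacity) | set(allocatable)):
        if not name.startswith(prefix):
            continue  # skip cpu, memory, and unrelated extended resources
        cap = capacity.get(name, "0")
        alloc = allocatable.get(name, "0")
        if cap == "0" or alloc == "0":
            result[name] = (cap, alloc)
    return result


# Hypothetical node status resembling the reported symptom.
sample_node = {
    "status": {
        "capacity": {"cpu": "64", "rdma/hca_shared_devices_a": "0"},
        "allocatable": {"cpu": "64", "rdma/hca_shared_devices_a": "0"},
    }
}
print(zero_resources(sample_node))
```

An empty result would mean that every resource under the prefix reports a non-zero count.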


OS: Ubuntu 20.04
Kubernetes version: 1.23
kubelet --root-dir: /data/kubelet

The plugin configuration:
image
And the workload:
image

root@gpu-11:~# ibdev2netdev 
mlx5_0 port 1 ==> ens12f0np0 (Down)
mlx5_1 port 1 ==> ens12f1np1 (Down)
mlx5_2 port 1 ==> ens24np0 (Up)
mlx5_3 port 1 ==> ens25np0 (Up)
mlx5_4 port 1 ==> bondYW (Up)
mlx5_5 port 1 ==> ens17f1np1 (Down)
mlx5_6 port 1 ==> bondYW (Up)
mlx5_7 port 1 ==> ens18f1np1 (Down)
mlx5_8 port 1 ==> ens30np0 (Up)
mlx5_9 port 1 ==> ens31np0 (Up)
root@gpu-11:~# mst status -v 
MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE             MST                           PCI       RDMA            NET                       NUMA  
ConnectX6(rev:0)        /dev/mst/mt4123_pciconf3      df:00.0   mlx5_9          net-ens31np0              1     

ConnectX6(rev:0)        /dev/mst/mt4123_pciconf2      a0:00.0   mlx5_8          net-ens30np0              1     

ConnectX6(rev:0)        /dev/mst/mt4123_pciconf1      72:00.0   mlx5_3          net-ens25np0              0     

ConnectX6(rev:0)        /dev/mst/mt4123_pciconf0      58:00.0   mlx5_2          net-ens24np0              0     

ConnectX4LX(rev:0)      /dev/mst/mt4117_pciconf2.1    83:00.1   mlx5_7          net-ens18f1np1            1     

ConnectX4LX(rev:0)      /dev/mst/mt4117_pciconf2      83:00.0   mlx5_6          net-bondYW                1     

ConnectX4LX(rev:0)      /dev/mst/mt4117_pciconf1.1    82:00.1   mlx5_5          net-ens17f1np1            1     

ConnectX4LX(rev:0)      /dev/mst/mt4117_pciconf1      82:00.0   mlx5_4          net-bondYW                1     

ConnectX4LX(rev:0)      /dev/mst/mt4117_pciconf0.1    18:00.1   mlx5_1          net-ens12f1np1            0     

ConnectX4LX(rev:0)      /dev/mst/mt4117_pciconf0      18:00.0   mlx5_0          net-ens12f0np0            0  

sober-wang changed the title from "Are you support ConnectX 6 network interface card?" to "Are you support ConnectX 6 network interface card?Why the resources Capacity and Allocatabel values is 0 at k8s cluster?" on Jan 22, 2024
@tdg5

tdg5 commented May 15, 2024

@sober-wang, I think this may be related to your custom kubelet root-dir (/data/kubelet). If you're using the NVIDIA/Mellanox network-operator, it hardcodes the volume mounts for the pod that runs this service to the standard kubelet root path (/var/lib/kubelet), so the pod would not see anything under your custom directory.

The Kubernetes manifest in this repository hardcodes the same path.

Not really related, but same idea: kubernetes/kubernetes#120626
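
To illustrate the idea, here is a minimal Python sketch of the path adjustment, assuming the manifest mounts hostPath directories under /var/lib/kubelet. The helper name `rebase_host_path` and the example path `/var/lib/kubelet/device-plugins` are hypothetical and not taken from this thread. Whether a given directory really moves with `--root-dir` can depend on the Kubernetes version, so it is worth checking the node before editing any manifest.

```python
from pathlib import PurePosixPath

# Default kubelet root directory when --root-dir is not set.
DEFAULT_KUBELET_ROOT = PurePosixPath("/var/lib/kubelet")


def rebase_host_path(host_path: str, kubelet_root: str) -> str:
    """Move a hostPath from the default kubelet root onto a custom root.

    Paths that are not under /var/lib/kubelet are returned unchanged.
    """
    path = PurePosixPath(host_path)
    try:
        relative = path.relative_to(DEFAULT_KUBELET_ROOT)
    except ValueError:
        # Not under the default kubelet root, so nothing to rewrite.
        return host_path
    return str(PurePosixPath(kubelet_root) / relative)


# Hypothetical hostPath entries, as they might appear in a DaemonSet manifest.
for host_path in ["/var/lib/kubelet/device-plugins", "/etc/cni/net.d"]:
    print(host_path, "->", rebase_host_path(host_path, "/data/kubelet"))
```

With a root-dir of /data/kubelet, only the first entry would be rewritten. The second is outside the kubelet root and stays as it is.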
