Unable to find the Infiniband/RoCE device when using the Ubuntu 18.04 image #106

Open
thincal opened this issue Apr 25, 2024 · 6 comments

Comments

@thincal

thincal commented Apr 25, 2024

I created a container from the Ubuntu 18.04 image, using the rdma-shared device plugin. Inside the container, running ib_write_bw reports the error below, but with Ubuntu 20.04/22.04 images it works well. What is the reason behind this issue? Any information would be appreciated.

Did not detect devices 
If device exists, check if driver is up
Unable to find the Infiniband/RoCE device
@thincal
Author

thincal commented Apr 30, 2024

@adrianchiris Hi, do you have any information on this issue? Thanks.

@adrianchiris
Collaborator

adrianchiris commented May 1, 2024

I believe it's related to the perftest version used in the workload container versus the RDMA API exposed by the kernel running on the node.

What is the OS of the k8s worker node?

I don't think it's related to the RDMA shared device plugin.
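
One way to separate the two possibilities is to check what the container actually sees before running perftest. Below is a minimal Python sketch, assuming the standard Linux locations `/sys/class/infiniband` and `/dev/infiniband`; `list_rdma_devices` is a hypothetical helper written for illustration and is not part of perftest or the device plugin.

```python
import os


def list_rdma_devices(sysfs_dir="/sys/class/infiniband",
                      dev_dir="/dev/infiniband"):
    """Return (RDMA device names from sysfs, uverbs nodes in dev_dir).

    Both default paths are assumptions about a typical Linux setup;
    pass other directories to inspect a different root.
    """
    def entries(path):
        # A missing or unreadable directory is reported as "no entries".
        try:
            return sorted(os.listdir(path))
        except OSError:
            return []

    devices = entries(sysfs_dir)
    # Keep only the verbs character devices (e.g. uverbs0), not rdma_cm.
    uverbs = [name for name in entries(dev_dir) if name.startswith("uverbs")]
    return devices, uverbs


if __name__ == "__main__":
    devices, uverbs = list_rdma_devices()
    print("sysfs devices:", devices or "none")
    print("uverbs nodes:", uverbs or "none")
```

If both lists are non-empty inside the Ubuntu 18.04 container, the device is being exposed, which would point toward the userspace packages in the image rather than the plugin. If either list is empty, the device is probably not reaching the container at all.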

@thincal
Author

thincal commented May 2, 2024

What is the OS of the k8s worker node?

Ubuntu 22.04.2 LTS

@adrianchiris
Collaborator

I believe it's related to the perftest version used in the workload container versus the RDMA API exposed by the kernel running on the node.

So that is the reason.

@adrianchiris
Collaborator

Is this issue still relevant?

@Hugh-yw

Hugh-yw commented Dec 11, 2024

Is this issue still relevant?

Has the poster above managed to resolve it?
