Understanding of how OpenGL libraries are linked to an application within a container #689

Open
s-bonnet opened this issue Sep 10, 2024 · 0 comments
s-bonnet commented Sep 10, 2024

Hello,

This is not a bug, just a question to understand what I am doing.

I use the GPU Operator to deploy OpenGL applications in a Kubernetes cluster. It works fine: my pod (a basic glxgears) starts, and when I run nvidia-smi from the nvidia-driver pod I can see that glxgears is using the GPU (the metrics are consistent with hardware acceleration being used).

However, some things are unclear when I look inside the container.
When I inspect the contents of the container on my PC, i.e. without the GPU Operator, I can see that installing the glx-utils package brings in some default OpenGL libraries (the Mesa implementation):

[root@9417bac3962a lib64]# ls -al /usr/lib64/ |grep libGL
lrwxrwxrwx  1 root root      14 Nov 11  2022 libGL.so.1 -> libGL.so.1.7.0
-rwxr-xr-x  1 root root  558944 Nov 11  2022 libGL.so.1.7.0
lrwxrwxrwx  1 root root      15 Nov 11  2022 libGLX.so.0 -> libGLX.so.0.0.0
-rwxr-xr-x  1 root root  141256 Nov 11  2022 libGLX.so.0.0.0
lrwxrwxrwx  1 root root      20 Nov 11  2022 libGLX_mesa.so.0 -> libGLX_mesa.so.0.0.0
-rwxr-xr-x  1 root root  502032 Nov 11  2022 libGLX_mesa.so.0.0.0
lrwxrwxrwx  1 root root      27 Nov 11  2022 libGLX_system.so.0 -> /usr/lib64/libGLX_mesa.so.0
lrwxrwxrwx  1 root root      22 Nov 11  2022 libGLdispatch.so.0 -> libGLdispatch.so.0.0.0
-rwxr-xr-x  1 root root  769048 Nov 11  2022 libGLdispatch.so.0.0.0

When I run the same command inside the container deployed in my cluster with the GPU Operator, I get the following result:

[root@glxgears-glxgears-deployment-694bc49445-87kzr lib64]# ls -al /usr/lib64/ | grep libGL
lrwxrwxrwx.  1 root root       14 Nov 11  2022 libGL.so.1 -> libGL.so.1.7.0
-rwxr-xr-x.  1 root root   558944 Nov 11  2022 libGL.so.1.7.0
lrwxrwxrwx.  1 root root       33 Sep  9 15:40 libGLESv1_CM_nvidia.so.1 -> libGLESv1_CM_nvidia.so.550.107.02
-rwxr-xr-x.  1 root root    68000 Sep  6 13:36 libGLESv1_CM_nvidia.so.550.107.02
lrwxrwxrwx.  1 root root       30 Sep  9 15:40 libGLESv2_nvidia.so.2 -> libGLESv2_nvidia.so.550.107.02
-rwxr-xr-x.  1 root root   117144 Sep  6 13:36 libGLESv2_nvidia.so.550.107.02
lrwxrwxrwx.  1 root root       15 Nov 11  2022 libGLX.so.0 -> libGLX.so.0.0.0
-rwxr-xr-x.  1 root root   141256 Nov 11  2022 libGLX.so.0.0.0
lrwxrwxrwx.  1 root root       27 Sep  9 15:40 libGLX_indirect.so.0 -> libGLX_nvidia.so.550.107.02
lrwxrwxrwx.  1 root root       20 Nov 11  2022 libGLX_mesa.so.0 -> libGLX_mesa.so.0.0.0
-rwxr-xr-x.  1 root root   502032 Nov 11  2022 libGLX_mesa.so.0.0.0
lrwxrwxrwx.  1 root root       27 Sep  9 15:40 libGLX_nvidia.so.0 -> libGLX_nvidia.so.550.107.02
-rwxr-xr-x.  1 root root  1203776 Sep  6 13:36 libGLX_nvidia.so.550.107.02
lrwxrwxrwx.  1 root root       27 Nov 11  2022 libGLX_system.so.0 -> /usr/lib64/libGLX_mesa.so.0
lrwxrwxrwx.  1 root root       22 Nov 11  2022 libGLdispatch.so.0 -> libGLdispatch.so.0.0.0
-rwxr-xr-x.  1 root root   769048 Nov 11  2022 libGLdispatch.so.0.0.0

First observations:

  • some new libraries are present (mounted by the nvidia-container-toolkit, I guess)
  • the libraries that were already present are unchanged (their sizes are identical)
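
The two observations above can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the two `ls -al` outputs have been saved as text; the helper names `parse_ls` and `compare_listings` are made up for illustration, and the sample data is a subset of the listings above.

```python
def parse_ls(listing: str) -> dict[str, int]:
    """Map each entry name in `ls -al` output to its size in bytes."""
    entries = {}
    for line in listing.strip().splitlines():
        # Fields: perms, links, owner, group, size, month, day, time/year, name, ...
        fields = line.split()
        if len(fields) < 9:
            continue  # skip "total N" and other short lines
        entries[fields[8]] = int(fields[4])
    return entries


def compare_listings(before: dict[str, int], after: dict[str, int]):
    """Return (names only present in `after`, names whose size changed)."""
    added = sorted(set(after) - set(before))
    changed = sorted(n for n in before if n in after and before[n] != after[n])
    return added, changed


# Subset of the listing taken on the PC (no GPU Operator).
LOCAL = """
lrwxrwxrwx  1 root root      14 Nov 11  2022 libGL.so.1 -> libGL.so.1.7.0
-rwxr-xr-x  1 root root  558944 Nov 11  2022 libGL.so.1.7.0
-rwxr-xr-x  1 root root  141256 Nov 11  2022 libGLX.so.0.0.0
"""

# Subset of the listing taken in the cluster (with GPU Operator).
CLUSTER = """
lrwxrwxrwx.  1 root root       14 Nov 11  2022 libGL.so.1 -> libGL.so.1.7.0
-rwxr-xr-x.  1 root root   558944 Nov 11  2022 libGL.so.1.7.0
-rwxr-xr-x.  1 root root   141256 Nov 11  2022 libGLX.so.0.0.0
-rwxr-xr-x.  1 root root  1203776 Sep  6 13:36 libGLX_nvidia.so.550.107.02
"""

added, changed = compare_listings(parse_ls(LOCAL), parse_ls(CLUSTER))
print("added:", added)
print("size changed:", changed)
```

An identical size is only a hint that a file is unchanged; comparing checksums would be a stronger check.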

If I now look, in the same container, at the volume shared with the host where the driver libraries are installed:

[root@glxgears-glxgears-deployment-694bc49445-87kzr lib64]# ls -al /run/driver/lib/x86_64-linux-gnu/  | grep libGL
lrwxrwxrwx. 1 root root       10 Sep  6 13:36 libGL.so -> libGL.so.1
lrwxrwxrwx. 1 root root       14 Sep  6 13:36 libGL.so.1 -> libGL.so.1.7.0
-rwxr-xr-x. 1 root root   649416 Sep  6 13:36 libGL.so.1.7.0
lrwxrwxrwx. 1 root root       17 Sep  6 13:36 libGLESv1_CM.so -> libGLESv1_CM.so.1
lrwxrwxrwx. 1 root root       21 Sep  6 13:36 libGLESv1_CM.so.1 -> libGLESv1_CM.so.1.2.0
-rwxr-xr-x. 1 root root    43208 Sep  6 13:36 libGLESv1_CM.so.1.2.0
lrwxrwxrwx. 1 root root       33 Sep  6 13:36 libGLESv1_CM_nvidia.so.1 -> libGLESv1_CM_nvidia.so.550.107.02
-rwxr-xr-x. 1 root root    68000 Sep  6 13:36 libGLESv1_CM_nvidia.so.550.107.02
lrwxrwxrwx. 1 root root       14 Sep  6 13:36 libGLESv2.so -> libGLESv2.so.2
lrwxrwxrwx. 1 root root       18 Sep  6 13:36 libGLESv2.so.2 -> libGLESv2.so.2.1.0
-rwxr-xr-x. 1 root root    80064 Sep  6 13:36 libGLESv2.so.2.1.0
lrwxrwxrwx. 1 root root       30 Sep  6 13:36 libGLESv2_nvidia.so.2 -> libGLESv2_nvidia.so.550.107.02
-rwxr-xr-x. 1 root root   117144 Sep  6 13:36 libGLESv2_nvidia.so.550.107.02
lrwxrwxrwx. 1 root root       11 Sep  6 13:36 libGLX.so -> libGLX.so.0
-rwxr-xr-x. 1 root root   137616 Sep  6 13:36 libGLX.so.0
lrwxrwxrwx. 1 root root       27 Sep  6 13:36 libGLX_nvidia.so.0 -> libGLX_nvidia.so.550.107.02
-rwxr-xr-x. 1 root root  1203776 Sep  6 13:36 libGLX_nvidia.so.550.107.02
-rwxr-xr-x. 1 root root   952576 Sep  6 13:36 libGLdispatch.so.0

We can see that, for the following libraries:

  • libGL
  • libGLX
  • libGLdispatch

the version present in /usr/lib64 is the one initially installed in the container, not the one provided by the NVIDIA stack (the file sizes differ from those under /run/driver/lib/x86_64-linux-gnu).
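
To confirm this on a live container without relying on file sizes, the files behind each soname can be compared by content. This is a minimal sketch, assuming it runs inside the container; `file_digest` and `compare_sonames` are hypothetical helper names. Symlinks are resolved first, because libGLX.so.0 is a symlink in /usr/lib64 but a regular file in the driver directory.

```python
import hashlib
import os


def file_digest(path: str) -> str:
    """SHA-256 of the file that `path` ultimately points to."""
    real = os.path.realpath(path)  # follow symlinks such as libGL.so.1 -> libGL.so.1.7.0
    digest = hashlib.sha256()
    with open(real, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_sonames(dir_a: str, dir_b: str, sonames: list[str]) -> dict:
    """For each soname: True if identical, False if different, None if missing."""
    result = {}
    for name in sonames:
        a, b = os.path.join(dir_a, name), os.path.join(dir_b, name)
        if not (os.path.exists(a) and os.path.exists(b)):
            result[name] = None
        else:
            result[name] = file_digest(a) == file_digest(b)
    return result


if __name__ == "__main__":
    # Paths taken from the listings above.
    report = compare_sonames(
        "/usr/lib64",
        "/run/driver/lib/x86_64-linux-gnu",
        ["libGL.so.1", "libGLX.so.0", "libGLdispatch.so.0"],
    )
    for name, same in report.items():
        print(name, same)
```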

Looking at an excerpt of the libraries the application is linked against:

[root@glxgears-glxgears-deployment-694bc49445-87kzr lib64]# ldd /usr/bin/glxgears
        libGL.so.1 => /usr/lib64/libGL.so.1 (0x00007fa1142e7000)
        libX11.so.6 => /usr/lib64/libX11.so.6 (0x00007fa113c22000)
        libGLX.so.0 => /usr/lib64/libGLX.so.0 (0x00007fa11362b000)
        libGLdispatch.so.0 => /usr/lib64/libGLdispatch.so.0 (0x00007fa113162000)

We can also see that the application is not linked against the libraries that come with my version of the driver.
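
Note that ldd only reports libraries resolved at link and load time; anything opened later with dlopen() does not appear in its output. One way to see what a running glxgears has actually mapped is to read /proc/<pid>/maps. The sketch below assumes a Linux /proc filesystem; the function names and the sample maps text in the usage comment are illustrative, not captured from this cluster.

```python
import os
import re

# File-name fragments of interest; adjust as needed (this choice is an assumption).
GL_PATTERN = re.compile(r"libGL|libEGL|libnvidia")


def loaded_libraries(maps_text: str, pattern: re.Pattern = GL_PATTERN) -> list[str]:
    """Return the sorted, de-duplicated library paths in a /proc/<pid>/maps dump."""
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address, perms, offset, dev, inode, pathname (pathname may be absent).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        path = fields[5].strip()
        if path.startswith("/") and pattern.search(os.path.basename(path)):
            paths.add(path)
    return sorted(paths)


def loaded_libraries_for_pid(pid: int) -> list[str]:
    """Read the maps of a live process (needs permission to read /proc/<pid>/maps)."""
    with open(f"/proc/{pid}/maps") as f:
        return loaded_libraries(f.read())


# Usage inside the container, with the PID of glxgears:
#   for path in loaded_libraries_for_pid(1234):   # 1234 is a placeholder PID
#       print(path)
```

If an NVIDIA library such as libGLX_nvidia.so.0 shows up there while being absent from the ldd output, it was loaded at runtime rather than linked directly.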

The configuration of my X server is much the same: it is started in another pod, and I noticed exactly the same thing there.

So my question is pretty basic: how can this work? It seems that my application loads the Mesa versions of libGL, libGLX and libGLdispatch, yet the display is correctly rendered by the GPU. Am I missing something? It would be great if someone could point me to in-depth documentation of these mechanisms.
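
One possible lead, stated as an assumption drawn from the file names rather than something the listings prove: libGLX_mesa.so.0, libGLX_system.so.0 and libGLdispatch.so.0 look like a libglvnd installation, in which libGL.so.1 and libGLX.so.0 are vendor-neutral front ends. Under libglvnd, libGLX.so.0 selects a vendor library named libGLX_<vendor>.so.0 at runtime, and the __GLX_VENDOR_LIBRARY_NAME environment variable can override that choice. The sketch below lists the vendor names available in a directory; `glx_vendor_names` is a hypothetical helper, and the sample names come from the cluster listing above.

```python
import re

# libglvnd naming convention for GLX vendor libraries: libGLX_<vendor>.so.0
VENDOR_RE = re.compile(r"^libGLX_(\w+)\.so\.0$")


def glx_vendor_names(filenames) -> list[str]:
    """Return the sorted vendor names found among `filenames`."""
    vendors = set()
    for name in filenames:
        match = VENDOR_RE.match(name)
        if match:
            vendors.add(match.group(1))
    return sorted(vendors)


# File names from /usr/lib64 in the container deployed with the GPU Operator.
NAMES = [
    "libGL.so.1",
    "libGLX.so.0",
    "libGLX_indirect.so.0",
    "libGLX_mesa.so.0",
    "libGLX_mesa.so.0.0.0",
    "libGLX_nvidia.so.0",
    "libGLX_nvidia.so.550.107.02",
    "libGLX_system.so.0",
    "libGLdispatch.so.0",
]

print(glx_vendor_names(NAMES))
```

On a live container the same function can be fed `os.listdir("/usr/lib64")`. If the libglvnd assumption holds, running glxgears with `__GLX_VENDOR_LIBRARY_NAME=mesa` and then with `nvidia` should change which vendor library is loaded.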

In case it is relevant, I'm using the following versions:

  • GPU Operator: 23.6.1
  • Container toolkit: 1.13.4-ubuntu20.04

Thanks!

Regards,

@elezar self-assigned this Sep 23, 2024
@elezar added the "question" label (Categorizes issue or PR as a support question.) Sep 23, 2024