- Overview
- Supported Software and Hardware Configuration
- General Architecture and Components
- Best Practices
## Overview

JARVICE supports running 3D applications with GPU acceleration on servers or cloud instances. This allows users to run complex rendering applications, such as pre- and post-processing tools, with good performance and compatibility despite being rendered remotely. No advanced hardware capabilities are required on the clients themselves; as with 2D applications, only a web browser or VNC client is needed for access. The user experience is also transparent and requires no additional setup once the initial administrator-level configuration is done.
## Supported Software and Hardware Configuration

At the time of this writing, the following configurations support 3D graphics acceleration in JARVICE:
- NVIDIA GPUs with EGL capabilities, capable of using the NVIDIA driver 390 or higher; 450 or higher is recommended.
    - Non-NVIDIA GPU types are not currently supported but may be in the future.
- x86 Linux host
    - NVIDIA GPU driver version 390 or higher; 450 or higher is recommended
    - NVIDIA container runtime, latest version recommended; configured as the default container runtime
    - Working `nvidia-smi` command on the host, with at least 1 GPU available
    - Non-x86 platforms are not currently supported but may be in the future.
- AWS instance types capable of running the `amazon-eks-gpu-node-*` AMI for the Kubernetes version used for an EKS deployment are supported.
- To use this feature, pods must be allowed to mount (read-only) host paths into containers in the "jobs" namespace. JARVICE manages this securely and automatically, and these paths are made available only to initContainers rather than to the application itself, but this capability must not be disabled if this feature is to be used.
- The JARVICE `jarvice-dri-optional-device-plugin` DaemonSet must be deployed, which is the default in `values.yaml` (controlled via `daemonsets.dri_optional.enabled`, which defaults to `true`; see the excerpt after this list).
- The NVIDIA DaemonSet is recommended but not required; if the only GPU capabilities are for graphics rather than compute, it is not needed. See Best Practices for more details on node selection.
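For reference, the relevant portion of a deployment's `values.yaml` might look like the following; the key names are as described above, but the surrounding structure is abbreviated here and should be checked against the chart's shipped `values.yaml`:

```yaml
# Abbreviated excerpt: keep the DRI device plugin DaemonSet enabled
# (this is the shipped default) so jobs can discover GPU render devices.
daemonsets:
  dri_optional:
    enabled: true
```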
All graphical applications provided in the default JARVICE service catalog support this capability if enabled. Custom applications must use image-common and must inherit from a CentOS 7 (e.g. `FROM centos:7` in `Dockerfile`) or Ubuntu 18.04 (e.g. `FROM ubuntu:bionic` in `Dockerfile`) base image, which support the vendor-neutral GL libraries (`libglvnd`); a sketch of such a `Dockerfile` follows.
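The following is a minimal sketch assuming an Ubuntu 18.04 base; the image-common integration steps are summarized as a comment rather than reproduced, and `mesa-utils` is included only because it provides `glxinfo` for validation:

```dockerfile
# Sketch of a custom application image eligible for 3D acceleration.
# ubuntu:bionic (like centos:7) ships vendor-neutral GL (libglvnd) support.
FROM ubuntu:bionic

# mesa-utils provides glxinfo, handy for validating hardware offload.
RUN apt-get update && apt-get install -y --no-install-recommends \
        mesa-utils && \
    rm -rf /var/lib/apt/lists/*

# image-common integration steps go here, per that project's documentation
# (not reproduced in this sketch).
```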
## General Architecture and Components

This feature uses the EGL capabilities of a compute node's GPU to transparently offload OpenGL 3D rendering, converting it to 2D images in a framebuffer. This framebuffer is then remotely displayed to a browser or VNC client.
JARVICE automatically detects EGL capabilities on a compute node's GPU and configures the graphical desktop for hardware offload. Assuming the conditions and requirements described above are met, a user can validate this by opening a Terminal window in an application desktop and typing:
```shell
glxinfo -B
```
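If hardware offload is active, the output should identify the NVIDIA GPU as the OpenGL renderer rather than a software rasterizer such as llvmpipe. Exact strings vary by driver and GPU model; an abbreviated, illustrative example:

```
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: Tesla V100-SXM2-16GB/PCIe/SSE2
OpenGL core profile version string: 4.6.0 NVIDIA 450.51.06
```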
JARVICE leverages VirtualGL to enable this offload. No explicit configuration is needed, as `${LD_PRELOAD}` and `${VGL_DISPLAY}` are set automatically before the platform hands control to applications. JARVICE also plumbs the VirtualGL libraries and binaries themselves into the application container as part of initialization. These are unmodified binaries provided by the VirtualGL project itself for x86 platforms.
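This plumbing can also be confirmed from a Terminal window in the application desktop; the values shown in the comments below are illustrative assumptions, not guaranteed:

```shell
# Both variables are set by JARVICE before the application starts.
echo "${LD_PRELOAD}"    # typically a VirtualGL interposer library, e.g. libvglfaker.so
echo "${VGL_DISPLAY}"   # the discovered render device, e.g. /dev/dri/card0
```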
2D rendering (such as GUI widgets and controls) flows directly to the framebuffer via the X11 protocol and is not offloaded.
Please note that while offload to a local GPU can produce very high rendering framerates (hundreds of frames per second or more), actual performance will vary based on network latency and bandwidth between the client and the remote application. Because the framebuffer is rendered asynchronously per the VNC protocol, frames may appear to "skip" when performing 3D-intensive tasks such as rotating models.
Applications supporting EGL natively can simply be fed the discovered device from the environment variable `${VGL_DISPLAY}`. 2D rendering can simply target `${DISPLAY}` for non-3D items such as GUI prompts, etc. For automating the launch of existing EGL-based apps, a "wrapper script" is recommended to properly parameterize the application binary with the value of `${VGL_DISPLAY}`, as in the sketch below.
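A minimal wrapper might look like the following; the application binary `my3dapp` and its `--egl-device` flag are hypothetical placeholders for whatever the real EGL-capable application expects:

```shell
#!/bin/sh
# Hypothetical wrapper: launch an EGL-capable application against the
# device JARVICE discovered and exported in ${VGL_DISPLAY}.
# Replace "my3dapp" and "--egl-device" with the real binary and its
# documented device-selection option.
exec /usr/local/bin/my3dapp --egl-device="${VGL_DISPLAY}" "$@"
```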
## Best Practices

- To facilitate future functionality allowing jobs to mix GPU and non-GPU parallel workers, it is recommended that the NVIDIA Container Runtime not be deployed on non-GPU capable nodes.
- To maximize compatibility with HyperHub applications from the catalog, clusters should define `nc*` machine types for graphical rendering and `ng*` machine types for full GPU capabilities (graphics rendering + compute). For example, most simulation apps for x86 in the service catalog target `nc3` for pre- and post-processing GUIs.
- The `egl` pseudo-device adds no explicit node selection when defined in the machine definition; JARVICE automatically requests GPU 0 from the container runtime, as rendering only occurs on this GPU. If rendering targets are single GPU, and the NVIDIA DaemonSet is deployed, use a value of `1` for GPU count in the machine definition. Otherwise, use a label selector to target nodes that have GPU capabilities (see the sketch after this list).
- It is possible to have multiple jobs rendering on GPU 0 using just the `egl` pseudo-device and setting the GPU count to `0` in the machine definition; note however that this requires the `nvidia-docker` runtime be the default container runtime, and may not work on all platforms.
- If targeting machines with more than 1 GPU, the best practice is to set the GPU count to `1` in the machine definition. Note however that this will limit the number of concurrent visualization jobs on a given system to the number of GPUs available.
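For the label-selector approach mentioned above, GPU-capable nodes can be labeled out of band and then matched from the machine definition's node selection settings; the label key and value here are hypothetical, and any consistent pair will work:

```shell
# Hypothetical label marking nodes that have GPUs suitable for rendering;
# reference the same key/value from the machine definition's node selector.
kubectl label node gpu-node-01 jarvice.com/gpu-viz=true
```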