Accelerated 3D Remote Display Capabilities

Overview

JARVICE supports running 3D applications using GPU acceleration on servers or cloud instances. This allows users to run complex rendering applications, such as pre- and post-processing tools, with good performance and compatibility despite being rendered remotely. No advanced hardware capabilities are required on clients themselves; a web browser or VNC client is all that is needed for access, just as with 2D applications. The user experience is also transparent and requires no additional setup once the initial administrator-level configuration is done.

Supported Software and Hardware Configuration

At the time of this writing, the following configurations support 3D graphics acceleration in JARVICE:

Hardware

  • NVIDIA GPUs with EGL capabilities, capable of using the NVIDIA driver 390 or higher; 450 or higher is recommended.

Notes

  1. Non-NVIDIA GPU types are not currently supported but may be in the future.

Compute node/instance

  • x86 Linux host
  • NVIDIA GPU driver version 390 or higher; 450 or higher is recommended
  • NVIDIA container runtime, latest version recommended; configured as default container runtime
  • Working nvidia-smi command on the host, with at least 1 GPU available (see the verification sketch after the notes below)

Notes

  1. Non-x86 platforms are not currently supported but may be in the future.
  2. AWS instance types capable of running the amazon-eks-gpu-node-* AMI for the Kubernetes version used for an EKS deployment are supported.
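
A minimal host-side check for the requirements above, assuming nvidia-smi is on the PATH:

```bash
# Report driver version and GPU model for each GPU visible on the host; the driver
# version should be 390 or higher (450+ recommended) and at least one GPU should be listed.
nvidia-smi --query-gpu=driver_version,name --format=csv

# If Docker is the container engine, the default runtime can be checked as well
# (containerd-based setups configure this differently):
docker info --format '{{.DefaultRuntime}}'
```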

Kubernetes

  • To use this feature, pods in the "jobs" namespace must be allowed to mount host paths (read-only) into containers; JARVICE manages this securely and automatically, and the paths are exposed only to initContainers rather than to the application itself, but this capability must not be disabled if 3D acceleration is to be used
  • The JARVICE jarvice-dri-optional-device-plugin DaemonSet must be deployed, which is the default in values.yaml (controlled via daemonsets.dri_optional.enabled, which defaults to true); a quick check is sketched after this list
  • The NVIDIA DaemonSet is recommended but not required; if the only GPU capabilities are for graphics rather than compute, it is not needed. See Best Practices for more details on node selection.
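
A minimal sketch for verifying the DaemonSet requirement above; no namespace is assumed, so the search spans all namespaces:

```bash
# Confirm the jarvice-dri-optional-device-plugin DaemonSet is deployed and scheduled;
# if it was disabled, re-enable it via the Helm value daemonsets.dri_optional.enabled=true.
kubectl get daemonsets --all-namespaces | grep jarvice-dri-optional-device-plugin
```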

Applications

All graphical applications provided in the default JARVICE service catalog support this capability if enabled. Custom applications must use image-common and must inherit from a CentOS 7 (e.g. FROM centos:7 in the Dockerfile) or Ubuntu 18.04 (e.g. FROM ubuntu:bionic in the Dockerfile) base image; both of these bases support the vendor-neutral GL libraries (libglvnd).
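
As an illustration only, a custom application image might be scaffolded as follows; "my-3d-app" is a placeholder package, and any additional steps required by image-common still apply:

```bash
# Illustrative sketch: generate a Dockerfile that starts from a supported base image
cat > Dockerfile <<'EOF'
FROM ubuntu:bionic
# install the application itself (placeholder package name)
RUN apt-get update && apt-get install -y my-3d-app
EOF
```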

General Architecture and Components

OpenGL Applications

This feature uses the EGL capabilities of a compute node's GPU to transparently offload OpenGL 3D rendering, converting the result into a 2D framebuffer. This framebuffer is then remote-displayed to a browser or VNC client.

Hardware-accelerated EGL-based transparent 3D offload

JARVICE automatically detects EGL capabilities on a compute node's GPU and configures the graphical desktop for hardware offload. Assuming the conditions and requirements described above are met, a user can validate this by opening a Terminal window in an application desktop and typing:

glxinfo -B
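
With hardware offload active, the renderer reported by glxinfo should name the NVIDIA GPU rather than a software rasterizer such as llvmpipe; a quick filter:

```bash
# Show the vendor and renderer lines; an NVIDIA device name here indicates that
# 3D rendering is being offloaded to the GPU.
glxinfo -B | grep -E "OpenGL (vendor|renderer) string"
```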

JARVICE leverages VirtualGL to enable this offload. No explicit configuration is needed, as ${LD_PRELOAD} and ${VGL_DISPLAY} are set automatically before the platform hands control to apps. JARVICE also plumbs the VirtualGL libraries and binaries themselves into the application container as part of initialization. These are unmodified binaries provided by the VirtualGL project itself for x86 platforms.
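
These settings can be inspected from the same terminal; a minimal sketch:

```bash
# Show the VirtualGL-related variables the platform exports before launching the application
env | grep -E '^(LD_PRELOAD|VGL_DISPLAY)='
```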

2D rendering (such as GUI widgets and controls) flows directly to the framebuffer via the X11 protocol and is not offloaded.

Please note that while offload to a local GPU can produce very high rendering framerates (hundreds of frames per second or more), actual performance will vary based on network latency and bandwidth between the client and the remote application. Because the framebuffer is rendered asynchronously per the VNC protocol, this may result in frames "skipping" when performing 3D-intensive tasks such as rotating models.

Native EGL applications

Applications that support EGL natively can be fed the discovered device from the environment variable ${VGL_DISPLAY}. 2D rendering, such as GUI prompts and other non-3D items, can target ${DISPLAY} directly. To automate the launch of an existing EGL-based application, a "wrapper script" is recommended to parameterize the application binary with the value of ${VGL_DISPLAY}, as in the sketch below.
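
A minimal wrapper-script sketch; the application path and the flag used to pass the EGL device are placeholders, since the exact option is application-specific:

```bash
#!/bin/bash
# Hypothetical launcher for an EGL-native application; /opt/myapp/bin/myapp and
# --egl-device are placeholders for the real binary and its device option.
# 3D rendering targets the device discovered by JARVICE; 2D output targets the X display.
exec /opt/myapp/bin/myapp --egl-device="${VGL_DISPLAY}" --display="${DISPLAY}" "$@"
```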

Best Practices

  • To facilitate future functionality allowing jobs to mix GPU and non-GPU parallel workers, it is recommended that the NVIDIA Container Runtime not be deployed on non-GPU capable nodes.
  • To maximize compatibility with HyperHub applications from the catalog, clusters should define nc* machine types for graphical rendering and ng* machine types for full GPU capabilities (graphics rendering + compute). For example, most simulation apps for x86 in the service catalog target nc3 for pre- and post-processing GUIs.
  • The egl pseudo-device adds no explicit node selection when defined in the machine definition; JARVICE automatically requests GPU 0 from the container runtime, as rendering only occurs on this GPU. If the rendering targets have a single GPU and the NVIDIA DaemonSet is deployed, use a value of 1 for the GPU count in the machine definition. Otherwise, use a label selector to target nodes that have GPU capabilities (see the example after this list).
  • It is possible to have multiple jobs rendering on GPU 0 using just the egl pseudo-device and setting the GPU count to 0 in the machine definition; note however that this requires that the nvidia-docker runtime be the default container runtime, and may not work on all platforms.
  • If targeting machines with more than 1 GPU, the best practice is to set the GPU count to 1 in the machine definition. Note however that this will limit the number of concurrent visualization jobs on a given system to the number of GPUs available.
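
Where a label selector is used to target GPU-capable nodes (as noted above), nodes can be labeled with kubectl; the node name and label key/value below are placeholders rather than JARVICE-defined names:

```bash
# Hypothetical example: mark a GPU-capable node so a machine definition's node
# selector can target it; adjust the label to whatever convention the site uses.
kubectl label node gpu-node-01 accelerator=nvidia-gpu
```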