Implement zero copy pipeline #156

Draft
wants to merge 8 commits into
base: stable

ABeltramo
Member

@ABeltramo
Member Author

With the latest commit it should be possible to test out the zero copy pipeline ON NVIDIA ONLY by using the following Docker image:

ghcr.io/games-on-whales/wolf:dev-zero-copy

and by setting the following env variable:

WOLF_USE_ZERO_COPY=1

@alibell

alibell commented Jan 14, 2025

Hi there,

While testing it on my computer, I got the following errors in the Steam Docker container:

00:00:01.456 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 18:0, code 3, sequence 158, value 4194306
00:00:01.456 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 19:0, code 3, sequence 159, value 4194306
00:00:01.456 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 19:0, code 3, sequence 161, value 4194306
00:00:05.030 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 12:0, code 2, sequence 356, value 0
00:00:15.242 [ERROR] [wlr] [xwayland/xwm.c:1163] Failed to get window property
00:00:15.242 [ERROR] [wlr] [xwayland/xwm.c:1163] Failed to get window property
00:00:15.242 [ERROR] [wlr] [xwayland/xwm.c:1163] Failed to get window property
00:00:15.242 [ERROR] [wlr] [xwayland/xwm.c:1163] Failed to get window property
00:00:15.242 [ERROR] [wlr] [xwayland/xwm.c:1163] Failed to get window property
00:00:15.242 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 18:0, code 3, sequence 420, value 6291459
00:00:15.242 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 19:0, code 3, sequence 421, value 6291459
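
For anyone decoding the wlroots output above: the `op` and `code` fields are X11 core-protocol request opcodes and error codes. Below is a minimal Python sketch that tallies them, assuming the log format shown above. It is not part of Wolf, and the helper name `summarize_xcb_errors` is made up for illustration. Only the opcodes and codes that appear in this thread are mapped; anything else is printed raw.

```python
import re
from collections import Counter

# Core X11 protocol names, limited to the values seen in this thread.
X11_REQUESTS = {12: "ConfigureWindow", 18: "ChangeProperty", 19: "DeleteProperty"}
X11_ERRORS = {2: "BadValue", 3: "BadWindow"}

XCB_ERROR_RE = re.compile(
    r"xcb error: op (\d+):(\d+), code (\d+), sequence (\d+), value (\d+)"
)

# The xcb error lines quoted in the comment above.
SAMPLE_LOG = """\
00:00:01.456 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 18:0, code 3, sequence 158, value 4194306
00:00:01.456 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 19:0, code 3, sequence 159, value 4194306
00:00:01.456 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 19:0, code 3, sequence 161, value 4194306
00:00:05.030 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 12:0, code 2, sequence 356, value 0
00:00:15.242 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 18:0, code 3, sequence 420, value 6291459
00:00:15.242 [ERROR] [wlr] [xwayland/xwm.c:1627] xcb error: op 19:0, code 3, sequence 421, value 6291459
"""


def summarize_xcb_errors(log_text):
    """Count (request name, error name) pairs found in xwm log lines."""
    counts = Counter()
    for match in XCB_ERROR_RE.finditer(log_text):
        major, code = int(match.group(1)), int(match.group(3))
        request = X11_REQUESTS.get(major, f"opcode {major}")
        error = X11_ERRORS.get(code, f"code {code}")
        counts[(request, error)] += 1
    return counts


counts = summarize_xcb_errors(SAMPLE_LOG)
for (request, error), n in sorted(counts.items()):
    print(f"{n}x {error} on {request}")
```

On the lines quoted above, this reports BadWindow on ChangeProperty and DeleteProperty, plus a single BadValue on ConfigureWindow.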

My configuration: Arch, NVIDIA 3090, Ryzen 9700X

@ABeltramo
Member Author

Hi, thanks for testing it out! Can you please post the full Wolf logs from the beginning?

@alibell

alibell commented Jan 18, 2025

> Hi, thanks for testing it out! Can you please post the full Wolf logs from the beginning?

I performed the following steps:

  • Run Wolf
  • Connect with Moonlight desktop
  • Run Steam

On the stable image

docker-compose.yaml

services:
  wolf:
    image: ghcr.io/games-on-whales/wolf:stable
    environment:
      - XDG_RUNTIME_DIR=/tmp/sockets
      - HOST_APPS_STATE_FOLDER=/home/.wolf
      - NVIDIA_DRIVER_CAPABILITIES=all
      - NVIDIA_VISIBLE_DEVICES=all
      - WOLF_RENDER_NODE=/dev/dri/renderD128
      - WOLF_ENCODER_NODE=/dev/dri/renderD128
    volumes:
      - /etc/wolf/:/etc/wolf
      - /home/ali/.wolf/:/home/.wolf
      - /tmp/sockets:/tmp/sockets:rw
      - /var/run/docker.sock:/var/run/docker.sock:rw
      - /dev/:/dev/:rw
      - /run/udev:/run/udev:rw
    device_cgroup_rules:
      - 'c 13:* rmw'
    devices:
      - /dev/dri
      - /dev/uinput
      - /dev/uhid
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
    network_mode: host
    restart: unless-stopped

On the zero copy image

docker-compose.yaml

services:
  wolf:
    image: ghcr.io/games-on-whales/wolf:dev-zero-copy
    environment:
      - XDG_RUNTIME_DIR=/tmp/sockets
      - HOST_APPS_STATE_FOLDER=/home/.wolf
      - NVIDIA_DRIVER_CAPABILITIES=all
      - NVIDIA_VISIBLE_DEVICES=all
      - WOLF_RENDER_NODE=/dev/dri/renderD128
      - WOLF_ENCODER_NODE=/dev/dri/renderD128
      - WOLF_USE_ZERO_COPY=1
    volumes:
      - /etc/wolf/:/etc/wolf
      - /home/ali/.wolf/:/home/.wolf
      - /tmp/sockets:/tmp/sockets:rw
      - /var/run/docker.sock:/var/run/docker.sock:rw
      - /dev/:/dev/:rw
      - /run/udev:/run/udev:rw
    device_cgroup_rules:
      - 'c 13:* rmw'
    devices:
      - /dev/dri
      - /dev/uinput
      - /dev/uhid
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
    network_mode: host
    restart: unless-stopped
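
The two compose files above differ only in the image tag and the environment list. As a quick sanity check, here is a small Python sketch that diffs the two `environment` blocks. The lists are copied by hand from the files above, and the helper names `env_to_dict` and `diff_env` are hypothetical, not anything Wolf or Docker provides.

```python
# Environment entries copied from the stable compose file above.
stable_env = [
    "XDG_RUNTIME_DIR=/tmp/sockets",
    "HOST_APPS_STATE_FOLDER=/home/.wolf",
    "NVIDIA_DRIVER_CAPABILITIES=all",
    "NVIDIA_VISIBLE_DEVICES=all",
    "WOLF_RENDER_NODE=/dev/dri/renderD128",
    "WOLF_ENCODER_NODE=/dev/dri/renderD128",
]

# Environment entries copied from the zero copy compose file above.
zero_copy_env = stable_env + ["WOLF_USE_ZERO_COPY=1"]


def env_to_dict(entries):
    """Turn ['KEY=value', ...] into {'KEY': 'value', ...}."""
    return dict(entry.split("=", 1) for entry in entries)


def diff_env(old_entries, new_entries):
    """Return (added, removed, changed) between two environment lists."""
    old, new = env_to_dict(old_entries), env_to_dict(new_entries)
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]
    }
    return added, removed, changed


added, removed, changed = diff_env(stable_env, zero_copy_env)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

The only addition is `WOLF_USE_ZERO_COPY`, so any behaviour difference between the two runs comes from the image or from that flag.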

In both cases (WOLF_USE_ZERO_COPY=1 and WOLF_USE_ZERO_COPY=0), Steam doesn't launch.
In dmesg I see the following error:

[  435.535759] docker0: port 2(veth334535a) entered blocking state
[  435.535765] docker0: port 2(veth334535a) entered forwarding state
[  560.792973] sway[8644]: segfault at 582ae153a ip 0000582ab6926aac sp 00007ffee4ac69e0 error 4 in sway[23aac,582ab6914000+65000] likely on CPU 11 (core 3, socket 0)
[  560.792982] Code: 06 00 89 3d a6 fe 06 00 48 63 50 04 85 d2 0f 8e 8c 00 00 00 48 8b 40 08 48 8d 0c d0 eb 09 48 83 c0 08 48 39 c8 74 79 48 8b 10 <f6> 42 1c 40 74 ee 4c 8d 05 d1 37 05 00 b9 81 01 00 00 31 c0 bf 03
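
To make the segfault line above easier to read, here is a hedged Python sketch that splits it into its fields and decodes the x86 page-fault error code (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The regex assumes the x86-64 message layout shown above. The function name `parse_segfault` and the field names are made up for illustration, and treating the first bracketed number as the kernel's reported offset into the binary is an assumption.

```python
import re

SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<off>[0-9a-f]+),(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

# The segfault line quoted above, without the trailing CPU hint.
DMESG_LINE = (
    "[  560.792973] sway[8644]: segfault at 582ae153a ip 0000582ab6926aac "
    "sp 00007ffee4ac69e0 error 4 in sway[23aac,582ab6914000+65000]"
)


def parse_segfault(line):
    """Parse one kernel segfault line; return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)
    ip, base = int(m.group("ip"), 16), int(m.group("base"), 16)
    return {
        "process": m.group("comm"),
        "pid": int(m.group("pid")),
        "object": m.group("obj"),
        "fault_addr": int(m.group("addr"), 16),
        "ip": ip,
        # Assumption: the first bracketed value is the offset the kernel reports.
        "reported_offset": int(m.group("off"), 16),
        # Instruction pointer relative to the start of the mapping.
        "ip_minus_base": ip - base,
        "cause": "protection violation" if err & 1 else "page not present",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


info = parse_segfault(DMESG_LINE)
for key, value in info.items():
    print(key, hex(value) if key in ("fault_addr", "ip", "reported_offset", "ip_minus_base") else value)
```

For the line above, this decodes to a user-mode read of an unmapped address inside the sway process, meaning the compositor itself crashed rather than Steam.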

@ABeltramo
Member Author

Very interesting, thanks for sharing!
I just need a bit more info about your hardware: which GPU do you have, and do you have more than one plugged in? Which Nvidia driver version?

Lastly, if you could give it an extra go but using the nvidia driver volume (the Nvidia manual installation method from the quickstart guide), that would be very helpful.

Thanks for testing!

@alibell

alibell commented Jan 18, 2025

My graphics card: NVIDIA GeForce RTX 3090
I have only one graphics card. I also have an integrated GPU in my AMD CPU.

nvidia-smi

Sat Jan 18 21:01:02 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.77                 Driver Version: 565.77         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:01:00.0  On |                  N/A |
| 30%   32C    P0            112W /  350W |     737MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1621      G   /usr/bin/gnome-shell                          407MiB |
|    0   N/A  N/A      2369      G   /usr/lib/firefox/firefox                      243MiB |
|    0   N/A  N/A      3256    C+G   /usr/bin/kgx                                   24MiB |
|    0   N/A  N/A      3274      G   /usr/bin/Xwayland                               6MiB |

@alibell

alibell commented Jan 18, 2025

Tried with manual NVIDIA install, it works like a charm.

I noticed some differences from the stable branch:

  • Performance is still below native. I could compare against the stable branch, but it seems much better.
    • My benchmark is gow 2018
      • On my computer without Wolf: a stable 120 FPS
      • On Wolf: around 90-95 FPS, still playable though.
    • If you are interested, I could compare different scenarios: stable with manual NVIDIA install, stable with container toolkit, zero-copy with manual NVIDIA install
  • On stable with the container toolkit I couldn't use DLSS in game. It now works; I guess this is due to using the manual driver install.
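
To put the numbers above in perspective, here is a small Python sketch of the arithmetic. It uses only the FPS figures quoted in this comment; the function names are made up for illustration.

```python
def relative_drop(native_fps, streamed_fps):
    """Fraction of the native frame rate lost when streaming."""
    return (native_fps - streamed_fps) / native_fps


def frame_time_ms(fps):
    """Average time spent per frame, in milliseconds."""
    return 1000.0 / fps


native = 120
for streamed in (90, 95):
    drop = relative_drop(native, streamed)
    extra_ms = frame_time_ms(streamed) - frame_time_ms(native)
    print(
        f"{streamed} FPS vs {native} FPS native: "
        f"{drop:.1%} lower, {extra_ms:.2f} ms more per frame"
    )
```

So 90-95 FPS against 120 FPS native is a drop of roughly 21% to 25%, or about 2 to 3 ms of extra time per frame.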

My logs:

docker-compose

services:
  wolf:
    image: ghcr.io/games-on-whales/wolf:dev-zero-copy
    environment:
      - XDG_RUNTIME_DIR=/tmp/sockets
      - HOST_APPS_STATE_FOLDER=/home/.wolf
      - NVIDIA_DRIVER_VOLUME_NAME=nvidia-driver-vol
      - WOLF_RENDER_NODE=/dev/dri/renderD128
      - WOLF_ENCODER_NODE=/dev/dri/renderD128
      - WOLF_USE_ZERO_COPY=0
    volumes:
      - /etc/wolf/:/etc/wolf
      - /home/ali/.wolf/:/home/.wolf
      - /tmp/sockets:/tmp/sockets:rw
      - /var/run/docker.sock:/var/run/docker.sock:rw
      - /dev/:/dev/:rw
      - /run/udev:/run/udev:rw
      - nvidia-driver-vol:/usr/nvidia:rw
    device_cgroup_rules:
      - 'c 13:* rmw'
    devices:
      - /dev/dri
      - /dev/uinput
      - /dev/uhid
      - /dev/nvidia-uvm
      - /dev/nvidia-uvm-tools
      - /dev/nvidia-caps/nvidia-cap1
      - /dev/nvidia-caps/nvidia-cap2
      - /dev/nvidiactl
      - /dev/nvidia0
      - /dev/nvidia-modeset
    network_mode: host
    restart: unless-stopped
volumes:
  nvidia-driver-vol:
    external: true

docker volume ls | grep nvidia-driver

local     nvidia-driver-vol

Wolf log: https://pastebin.com/iH7Zjws0
Steam container log: https://pastebin.com/2xmH0K0q

@alibell

alibell commented Jan 18, 2025

Just tried on stable with the manual NVIDIA driver installation:

  • I get the same performance as with zero copy on gow 2018
  • I don't see any difference in CPU load

But the deployment with the container toolkit was actually really buggy: no DLSS and poor performance compared with the manual NVIDIA install.

@ABeltramo
Member Author

Thanks for all the feedback, that's really appreciated!

It's still very early days, so it's hard to tell why you don't see an improvement. At the very least, you should see lower GPU usage by Wolf (not sure that will automatically translate into more in-game FPS). There are still things that could be optimised, and I'll definitely try to address those before merging this.

First, I have to port the pipeline to Intel and AMD, though.


2 participants