breakwater is a very fast Pixelflut server written in Rust. It is heavily inspired by Shoreline.
It claims to be the fastest Pixelflut server in existence - at least at the time of writing 02/2022.
- Accepts Pixelflut commands
- It can start a native display window in your graphical environment
- Simultaneously it can provide a VNC server so that everybody can watch
- As an alternative it can stream to an RTMP sink, so that you can e.g. directly live-stream to Twitch or YouTube
- Exposes Prometheus metrics
- IPv6 and legacy IP support
Commands must be sent newline-separated. For more details, see Pixelflut.
- `HELP`: Prints a help text with the available commands.
- `PX x y rrggbb`: Color the pixel (x,y) with the given hexadecimal color rrggbb, e.g. `PX 10 10 ff0000`
- `PX x y rrggbbaa`: Color the pixel (x,y) with the given hexadecimal color rrggbb (alpha channel is ignored for now), e.g. `PX 10 10 ff0000ff`
- `PX x y gg`: Color the pixel (x,y) with the hexadecimal color gggggg. Basically this is the same as the other commands, but is a more efficient way of filling white, black or gray areas, e.g. `PX 10 10 00` to paint black
- `PX x y`: Get the color value of the pixel (x,y), e.g. `PX 10 10`
- `PBxxyyrgba`: Binary version of the `PX` command. `x` and `y` are little-endian 16 bit coordinates, `r`, `g`, `b` and `a` are a byte each. There is no newline after the command. Tip: For most use-cases this is the most efficient format with 10 bytes per pixel ;)
- `PXMULTI<startX:16><startY:16><len:32><rgba 1 of (startX, startY)><rgba 2 of (startX + 1, startY)><rgba 3 of (startX + 2, startY)>...<rgba len>`: EXPERIMENTAL binary syncing of whole pixel areas. Please note that for performance reasons this will be copied 1:1 to the server's framebuffer. The server will just take the following bytes and memcpy them into the framebuffer, so the alpha channel doesn't matter and you might mess up the screen. This is intended for export-use, especially when syncing or combining multiple Pixelflut screens across multiple servers. Note: This command needs to be enabled using the `binary-sync-pixels` feature
- `SIZE`: Get the size of the drawing surface, e.g. `SIZE 1920 1080`
- `OFFSET x y`: Apply offset (x,y) to all further pixel draws on this connection. This can e.g. be used to pre-calculate an image/animation and simply use the OFFSET command to move it around the screen without the need to re-calculate it, e.g. `OFFSET 100 100`
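As a rough illustration of the commands listed above, the following Rust sketch builds the text form and the binary form of a pixel command, and shows how `OFFSET` lets a client reuse pre-computed commands. The helper names `px_text`, `pb_binary` and `with_offset` are made up for this example and are not part of breakwater; the byte layout of `PB` follows the description above (little-endian 16 bit coordinates followed by one byte each for r, g, b and a).

```rust
/// Builds the ASCII command `PX x y rrggbb\n`.
/// Hypothetical helper for illustration, not part of breakwater.
fn px_text(x: u16, y: u16, rgb: [u8; 3]) -> String {
    format!("PX {x} {y} {:02x}{:02x}{:02x}\n", rgb[0], rgb[1], rgb[2])
}

/// Builds the 10-byte binary command `PBxxyyrgba` (no trailing newline).
/// Coordinates are encoded as little-endian 16 bit values.
fn pb_binary(x: u16, y: u16, rgba: [u8; 4]) -> [u8; 10] {
    let mut cmd = [0u8; 10];
    cmd[0..2].copy_from_slice(b"PB");
    cmd[2..4].copy_from_slice(&x.to_le_bytes());
    cmd[4..6].copy_from_slice(&y.to_le_bytes());
    cmd[6..10].copy_from_slice(&rgba);
    cmd
}

/// Prefixes already generated `PX` lines with an `OFFSET` command, so the
/// same lines can be replayed at another position without re-calculating them.
fn with_offset(dx: u16, dy: u16, precomputed: &str) -> String {
    format!("OFFSET {dx} {dy}\n{precomputed}")
}
```

For example, `px_text(10, 10, [0xff, 0, 0])` yields the line `PX 10 10 ff0000`, while `pb_binary` always produces exactly 10 bytes per pixel regardless of the coordinate values.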
The easiest way is to continue with the provided Ready to use Docker setup below.
If you prefer the manual way (which gives the best performance, as the build can e.g. use native SIMD instructions), you need to have Rust installed.
You may need to install some additional packages with sudo apt install pkg-config libvncserver-dev
Then you can directly run the server with
cargo run --release -- --vnc
The default settings should provide you with a ready-to-use server.
Port | Description |
---|---|
1234 | Pixelflut server |
5900 | VNC server |
9100 | Prometheus metrics exporter |
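To try out the Pixelflut port from the table above, a client only needs a TCP connection and a few lines of text. The sketch below is a minimal example under assumptions: `fill_rect` and `parse_size` are hypothetical helpers written for this README, not part of breakwater. `fill_rect` writes to any `std::io::Write`, so it can be pointed at a `TcpStream` or at an in-memory buffer for testing.

```rust
use std::io::{self, Write};

/// Fills a rectangle by writing one `PX x y rrggbb` line per pixel.
/// `out` can be any writer, e.g. a `TcpStream` connected to port 1234.
fn fill_rect<W: Write>(out: &mut W, x0: u32, y0: u32, w: u32, h: u32, rgb: &str) -> io::Result<()> {
    for y in y0..y0 + h {
        for x in x0..x0 + w {
            writeln!(out, "PX {x} {y} {rgb}")?;
        }
    }
    out.flush()
}

/// Parses a reply of the form `SIZE <width> <height>` as returned for the `SIZE` command.
/// Returns `None` if the line does not have that shape.
fn parse_size(line: &str) -> Option<(u32, u32)> {
    let mut parts = line.split_whitespace();
    if parts.next()? != "SIZE" {
        return None;
    }
    let width = parts.next()?.parse().ok()?;
    let height = parts.next()?.parse().ok()?;
    Some((width, height))
}
```

Against a locally running server, one would typically open the connection with `std::net::TcpStream::connect("127.0.0.1:1234")`, wrap it in a `std::io::BufWriter` to avoid one syscall per pixel, and pass it to `fill_rect`.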
To get a list of options, try
cargo run --release -- --help
Output
cargo run --release -- --help
Finished release [optimized] target(s) in 0.04s
Running `target/release/breakwater --help`
Usage: breakwater [OPTIONS]
Options:
-l, --listen-address <LISTEN_ADDRESS>
Listen address to bind to. The default value will listen on all interfaces for IPv4 and IPv6 packets [default: [::]:1234]
--width <WIDTH>
Width of the drawing surface [default: 1280]
--height <HEIGHT>
Height of the drawing surface [default: 720]
-f, --fps <FPS>
Frames per second the server should aim for [default: 30]
--network-buffer-size <NETWORK_BUFFER_SIZE>
The size in bytes of the network buffer used for each open TCP connection. Please use at least 64 KB (64_000 bytes) [default: 262144]
-t, --text <TEXT>
Text to display on the screen [default: "Pixelflut server (breakwater)"]
--font <FONT>
The font used to render the text on the screen. Should be a ttf file. If you use the default value a copy that ships with breakwater will be used - no need to download and provide the font [default: Arial.ttf]
-p, --prometheus-listen-address <PROMETHEUS_LISTEN_ADDRESS>
Listen address the prometheus exporter should listen on [default: [::]:9100]
--statistics-save-file <STATISTICS_SAVE_FILE>
Save file where statistics are periodically saved. The save file will be read during startup and statistics are restored. To reset the statistics simply remove the file [default: statistics.json]
--statistics-save-interval-s <STATISTICS_SAVE_INTERVAL_S>
Interval (in seconds) in which the statistics save file should be updated [default: 10]
--disable-statistics-save-file
Disable periodical saving of statistics into save file
--rtmp-address <RTMP_ADDRESS>
Enable rtmp streaming to configured address, e.g. `rtmp://127.0.0.1:1935/live/test`
--video-save-folder <VIDEO_SAVE_FOLDER>
Enable dump of video stream into file. File location will be `<VIDEO_SAVE_FOLDER>/pixelflut_dump_{timestamp}.mp4
-c, --connections-per-ip <CONNECTIONS_PER_IP>
Allow only a certain number of connections per ip address
--vnc
Enabled a VNC server
-v, --vnc-port <VNC_PORT>
Port of the VNC server [default: 5900]
--native-display
Enable native display output. This requires some form of graphical system (so will probably not work on your server)
-h, --help
Print help
-V, --version
Print version
You can also build the binary with `cargo build --release`. The binary will be placed at `target/release/breakwater`.
Breakwater also has some compile-time features for dependency or performance reasons. You can get the list of available features by looking at the Cargo.toml. As of writing the following features are supported:
- `native-display` (enabled by default): Starts a graphical window on your local system. Please note that this requires a graphical environment.
- `vnc` (enabled by default): Starts a VNC server that users can connect to. Needs `libvncserver-dev` to be installed. Please note that the VNC server offers basically no latency, but consumes quite some CPU.
- `alpha` (disabled by default): Respect alpha values during `PX` commands. Disabled by default as this can cause performance degradation.
- `binary-set-pixel` (enabled by default): Allows use of the `PB` command.
- `binary-sync-pixels` (disabled by default): Allows use of the `PXMULTI` command.
To e.g. turn the VNC server off, build with
cargo run --release --no-default-features # --features alpha,vnc to explicitly enable
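To give an idea of why respecting alpha costs extra work per pixel, here is a minimal sketch of the common "source over destination" blend with integer arithmetic. This is an assumption about what alpha handling typically involves, not breakwater's actual implementation; the function names are hypothetical. Note that blending needs to read the existing pixel before writing the new one, whereas a plain `PX` only writes.

```rust
/// Blends one color channel of `src` over `dst`.
/// `alpha` = 0 keeps `dst`, `alpha` = 255 takes `src`; results are rounded to nearest.
fn blend_channel(src: u8, dst: u8, alpha: u8) -> u8 {
    let a = alpha as u32;
    ((src as u32 * a + dst as u32 * (255 - a) + 127) / 255) as u8
}

/// Blends an incoming `[r, g, b, a]` pixel over the `[r, g, b]` pixel
/// currently stored in the framebuffer and returns the new pixel.
fn blend_rgba_over(src: [u8; 4], dst: [u8; 3]) -> [u8; 3] {
    [
        blend_channel(src[0], dst[0], src[3]),
        blend_channel(src[1], dst[1], src[3]),
        blend_channel(src[2], dst[2], src[3]),
    ]
}
```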
Fabian Wunsch has introduced initial support for SIMD when parsing the hexadecimal color values in #5. Thanks! We might be able to extend this support: parsing the decimal coordinates or blending colors with alpha using SIMD would be awesome as well. PRs welcome!
As `portable_simd` is an unstable feature, we configured this project to use nightly Rust in `rust-toolchain.toml`. Once the feature is stable we can switch back to stable Rust.
If the SIMD or nightly part causes any problems on your setup please reach out by creating an Issue!
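For readers who want to see what is being parsed, here is a plain scalar sketch of turning the color argument of a `PX` command into bytes. It is explicitly not the SIMD code from #5 and does not use `portable_simd`; the function names are hypothetical, and defaulting alpha to 255 when it is absent is an assumption made for this example.

```rust
/// Converts one ASCII hex digit to its value, or `None` if it is not a hex digit.
fn hex_val(c: u8) -> Option<u8> {
    match c {
        b'0'..=b'9' => Some(c - b'0'),
        b'a'..=b'f' => Some(c - b'a' + 10),
        b'A'..=b'F' => Some(c - b'A' + 10),
        _ => None,
    }
}

/// Parses `rrggbb`, `rrggbbaa` or the gray shorthand `gg` into `[r, g, b, a]`.
/// Alpha defaults to 255 when it is not given (assumption for this sketch).
fn parse_color(hex: &[u8]) -> Option<[u8; 4]> {
    // Combines the two hex digits starting at index `i` into one byte.
    let byte = |i: usize| -> Option<u8> {
        Some(hex_val(hex[i])? << 4 | hex_val(hex[i + 1])?)
    };
    match hex.len() {
        2 => {
            let g = byte(0)?;
            Some([g, g, g, 255])
        }
        6 => Some([byte(0)?, byte(2)?, byte(4)?, 255]),
        8 => Some([byte(0)?, byte(2)?, byte(4)?, byte(6)?]),
        _ => None,
    }
}
```

A SIMD version would process all digits of a color at once instead of branching per character, which is where the speed-up comes from.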
This command will start the Pixelflut server in a Docker container:
docker run --rm --init -t -p 1234:1234 -p 5900:5900 -p 9100:9100 sbernauer/breakwater # --help
If you want to permanently save statistics (to keep them between restarts) you can use the following command:
mkdir -p pixelflut && docker run --rm -u 1000:1000 --init -t -p 1234:1234 -p 5900:5900 -p 9100:9100 -v "$(pwd)/pixelflut:/pixelflut" sbernauer/breakwater --statistics-save-file /pixelflut/statistics.json
The ready to use Docker compose setup contains the Pixelflut server, a Prometheus server and a Grafana instance for monitoring. Use the following command to start the whole setup:
cd docker && docker-compose up
You should now have access to the following services
Port | Description |
---|---|
1234 | Pixelflut server |
5900 | VNC server |
9100 | Prometheus metrics exporter |
9090 | Prometheus server |
3000 | Grafana |
If you visit the Grafana server (user=admin, password=admin) you will have access to dashboards like the dashboard below.
The docker-compose setup also contains an owncast server, which breakwater pushes an RTMP stream into. owncast then exposes a web UI where people can watch the game in a web browser. Please note that the stream has a much higher delay compared to VNC and that the ffmpeg command used internally consumes a lot of CPU. Because of this the components are commented out by default, so you need to uncomment them!
Streaming to other RTMP sinks such as Twitch or YouTube should work the same way as streaming to owncast.
Simply uncomment the breakwater command and adapt `--rtmp-address` accordingly.
I have never used this for a longer period of time, so I am happy about feedback!
During high traffic at GPN22 a few warnings were raised as breakwater was not able to memadvise the created buffer to the kernel.
This should not happen, as we allocate the memory area directly before, but who knows. The memadvise call is only made on a best-effort basis; client connections will still be served even if the call fails. The worst thing that should happen is a minimal performance degradation. Have a look at the issue report for details.
My laptop has an Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz (6 cores/12 threads) and 2 DDR4 RAM modules with 16 GB each and 2667 MT/s.
The Pixelflut server and the Pixelflut client Sturmflut both run on my laptop using 24 connections.
These are the results of different Pixelflut servers:
Server | Language | Traffic during first 30s | When thermal throttling |
---|---|---|---|
Pixelnuke | C | 1.1 Gbit/s | 1 Gbit/s |
Pixelwar | Java | 2.1 Gbit/s | 1.6 Gbit/s |
pixelpwnr-server | Rust | 6.3 Gbit/s | 4.6 Gbit/s |
Shoreline | C | 15 Gbit/s | 12 Gbit/s |
Breakwater | Rust | 30 Gbit/s | 22 Gbit/s |
As I don't have access to a dedicated server any more, I ran the following benchmark on a Hetzner server.
Server type: CCX62 with 48 dedicated AMD EPYC cores and 192 GB RAM.
Sturmflut was used as a client using the loopback interface.
The whole test setup cost less than 1€ for one hour, so please feel free to submit performance numbers for other Pixelflut servers or validate the results!
Server | Language | Sustainable traffic |
---|---|---|
Shoreline@05a2bbfb4559090727c51673e1fb47d20eac5672 | C | 55 Gbit/s |
Breakwater@4cc8e2a4c7fd03886ede3061d6359c8063665755 | Rust | 110 Gbit/s |
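To put a traffic figure like the ones in the table above into perspective, it can be converted into pixel commands per second. The sketch below does this under two stated assumptions: 1 Gbit is taken as 10^9 bits, and every pixel is assumed to occupy a fixed number of bytes on the wire. The benchmark description does not say which command format Sturmflut sent, so the 10 bytes of the `PB` command are used purely as an illustration. Both function names are hypothetical.

```rust
/// Converts a traffic figure in Gbit/s into pixel commands per second,
/// given how many bytes one pixel command occupies on the wire.
fn pixels_per_second(gbit_per_s: f64, bytes_per_pixel: f64) -> f64 {
    gbit_per_s * 1e9 / 8.0 / bytes_per_pixel
}

/// Number of full repaints per second of a `width` x `height` canvas
/// that a given pixel rate would allow.
fn repaints_per_second(pixels_per_s: f64, width: u32, height: u32) -> f64 {
    pixels_per_s / (width as f64 * height as f64)
}
```

With these assumptions, 110 Gbit/s at 10 bytes per pixel corresponds to about 1.375 billion pixels per second, or roughly 1490 full repaints per second of the default 1280x720 drawing surface.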
lscpu command output
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 48
On-line CPU(s) list: 0-47
Vendor ID: AuthenticAMD
Model name: AMD EPYC Processor
CPU family: 25
Model: 1
Thread(s) per core: 2
Core(s) per socket: 24
Socket(s): 1
Stepping: 1
BogoMIPS: 4792.80
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core invpcid_single ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr wbnoinvd arat umip pku ospke rdpid fsrm
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 768 KiB (24 instances)
L1i: 768 KiB (24 instances)
L2: 12 MiB (24 instances)
L3: 32 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-47
Vulnerabilities:
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Srbds: Not affected
Tsx async abort: Not affected
As I got access to a server with a single Ampere(R) Altra(R) Processor with 80 cores, here are the results:
Sturmflut was used as a client using the loopback interface.
Server | Language | Sustainable traffic |
---|---|---|
Shoreline@05a2bbfb4559090727c51673e1fb47d20eac5672 | C | 185 Gbit/s |
Breakwater@135d2b795858a896a73470fd152407c22b0a0d26 | Rust | 420 Gbit/s |
lscpu command output
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 80
On-line CPU(s) list: 0-79
Vendor ID: ARM
Model name: Neoverse-N1
Model: 1
Thread(s) per core: 1
Core(s) per socket: 80
Socket(s): 1
Stepping: r3p1
Frequency boost: disabled
CPU max MHz: 3000.0000
CPU min MHz: 1000.0000
BogoMIPS: 50.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
Caches (sum of all):
L1d: 5 MiB (80 instances)
L1i: 5 MiB (80 instances)
L2: 80 MiB (80 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-79
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Not affected
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; __user pointer sanitization
Spectre v2: Mitigation; CSV2, BHB
Srbds: Not affected
Tsx async abort: Not affected