Welcome! vSwarm is a collection of ready-to-run serverless benchmarks, each typically consisting of a number of interconnected serverless functions, and with a general focus on realistic data-intensive workloads.
This suite is part of the vHive Ecosystem. It's a turnkey and fully tested solution meant to be used in conjunction with vHive, and is compatible with all technologies that it supports, namely containers, Firecracker, and gVisor microVMs. The majority of benchmarks support distributed tracing with Zipkin, which traces both the infra components and the user functions.
In addition to the multi-function benchmarks, the vSwarm suite contains a set of standalone functions, which support both x86 and arm64 architectures. Most of the standalone functions are compatible with vSwarm-u, which allows running them in gem5, the state-of-the-art research platform for system and microarchitecture, to study the microarchitectural implications of serverless computing. The standalone functions can therefore be used as microbenchmarks: first to pinpoint microarchitectural bottlenecks in the execution of serverless workloads using Top-Down analysis (tool) on real hardware, and then to explore and optimize these bottlenecks further using the gem5 cycle-accurate full-system CPU simulator.
- benchmarks: contains all of the available benchmark source code and manifests.
- utils: contains utilities for use within serverless functions, e.g. the tracing module.
- tools: command-line tools and services useful outside of serverless functions, such as deployment or invocation.
- runner: for setting up self-hosted GitHub Actions runners.
- docs: contains additional documentation on a number of relevant topics.
- 2 microbenchmarks for benchmarking chained-function performance and data transfer performance in various patterns (pipeline, scatter, gather) and over different communication media (AWS S3 and inline transfers)
- 8 real-world benchmarks
- MapReduce: Corral (golang), and an aws-reference python implementation of Aggregation Query from the representative AMPLab Big Data Benchmark 1node dataset.
- Real-time video analytics (Python and Golang): recognizes objects in a video fragment
- ML models training: stacking ensemble training and iterative hyperparameter tuning
- ExCamera video decoding (gg): decoding of a video in parallel
- distributed compilation (gg): compiles LLVM in parallel
- fibonacci (gg): classic recursive implementation to find the nth number in the sequence by calculating n-1 and n-2 in parallel
- 25 standalone functions
- AES, Auth, Fibonacci: Same functionality implemented in the three different runtimes: Python, NodeJS, Golang.
- Online shop: 9 functions implemented in various runtimes, ported from Google's Online Boutique
- Hotel reservation: 7 microservices from DeathStarBench's Hotel Reservation Application ported as standalone serverless microbenchmarks.
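The recursive structure of the fibonacci benchmark listed above can be illustrated with a minimal Python sketch. This is not the gg implementation from the repository: it only shows the idea of evaluating the n-1 and n-2 branches concurrently, using local threads in place of serverless function invocations. The names `fib_parallel`, `fib_sequential`, and the `parallel_depth` parameter are hypothetical and chosen for illustration. Because of Python's global interpreter lock, the threads show the fan-out structure rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def fib_sequential(n: int) -> int:
    """Plain recursive Fibonacci, with fib(0) = 0 and fib(1) = 1."""
    if n < 2:
        return n
    return fib_sequential(n - 1) + fib_sequential(n - 2)


def fib_parallel(n: int, parallel_depth: int = 2) -> int:
    """Compute fib(n) by evaluating the n-1 and n-2 branches concurrently.

    parallel_depth (a hypothetical knob, not part of vSwarm) limits how many
    recursion levels fan out into threads before falling back to the
    sequential version.
    """
    if n < 2:
        return n
    if parallel_depth <= 0:
        return fib_sequential(n)
    # A fresh two-worker pool per level avoids the deadlock that a single
    # bounded pool could hit when parent tasks block on their children.
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(fib_parallel, n - 1, parallel_depth - 1)
        right = pool.submit(fib_parallel, n - 2, parallel_depth - 1)
        return left.result() + right.result()


if __name__ == "__main__":
    print(fib_parallel(20))
```

In a serverless setting, each submitted branch would correspond to a separate function invocation, so the depth limit here stands in for whatever bound on fan-out the platform or framework applies.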
Refer to this document for more detail on the differences and supported features of each benchmark.
Details on each specific benchmark can be found in their respective subfolders. Every benchmark can be run on a knative cluster, and most can also be run locally with docker-compose. Please see the running benchmarks document for detailed instructions on how to run a benchmark locally or on a cluster.
We have a detailed outline on the benchmarking methodology used, which you can find here.
We openly welcome any contributions, so please get in touch if you're interested!
Bringing up a benchmark typically consists of dockerizing the benchmark functions to deploy and test them with docker-compose, then integrating the functions with knative, and including the benchmark in the CI/CD pipeline. Please refer to our documentation on bringing up new benchmarks for more guidance.
We also have some basic requirements for contributions to the repository, which are described in detail in our Contributing to vHive document.
vSwarm is free. We publish the code under the terms of the MIT License that allows distribution, modification, and commercial use. This software, however, comes without any warranty or liability.
The software is maintained by the vHive Ecosystem, the EASE lab at the University of Edinburgh, and Stanford Systems and Networking Research.
- Invoker, timeseriesdb, runners - Dmitrii Ustiugov: GitHub, twitter, web page
- Corral Benchmark - Dmitrii Ustiugov: GitHub, twitter, web page
- Map-Reduce, Stacking-training and Tuning-halving - Alan Nair GitHub
- Chained Functions and Video analytics - Shyam Jesalpura GitHub
- GG benchmarks - Shyam Jesalpura GitHub
- Standalone functions - David Schall GitHub, web page
- Multi cloud containers (lambda) - Alan Nair GitHub