This is the open-source repository for our paper λ-IO: A Unified IO Stack for Computational Storage on FAST'23.
@inproceedings {fast23-lambda-io,
author = {Zhe Yang and Youyou Lu and Xiaojian Liao and Youmin Chen and Junru Li and Siyu He and Jiwu Shu},
title = {{λ-IO}: A Unified {IO} Stack for Computational Storage},
booktitle = {21st USENIX Conference on File and Storage Technologies (FAST 23)},
year = {2023},
isbn = {978-1-939133-32-8},
address = {Santa Clara, CA},
pages = {347--362},
url = {https://www.usenix.org/conference/fast23/presentation/yang-zhe},
publisher = {USENIX Association},
month = feb,
}λ-IO includes a large number of components.
- Host userspace: Spark SQL, HDFS, libbpf, ...
- Host kernel: VFS, eBPF verifier & JITer, ...
- Device: FPGA block design, firmware (NVMe + λ extension), ...
We are constructing the repository step by step to present it in a better way. We plan to open source some propotype code in the future. Please focus on the repository for updates.
Here is a brief introduction to uploaded components.
device-hardware(device). PL(FPGA) and PS(ARM) code and binary files for Daisy/DaisyPlus OpenSSD.ebpf(host/device): eBPF/sBPF computational functions.host-eval(host): Evaluation code. It shows how to use λ-IO in an application, including the programming model and APIs.kmod(host): the kernel module of λ-IO dispatcher and helper syscalls.libbpf(host/device): The library to load an eBPF/sBPF program.linux-lio(host): Modified files in the Linux source code tree. We tried to implement components as a kernel module (kmod), but some still need to be implemented in the Linux source tree code and compiled into the kernel image.nvme-ctl(device): The firmware of the NVMe controller on the ARM CPUs. Before running the controller, you need to configure the PL(FPGA) and the Linux on PS(ARM) usingdevice-hardware.uebpf-rt(device): Userspace eBPF/sBPF runtime for x86-64 and aarch64 (ARM). λ-IO only usesuebpf-rton the device. We also port the runtime to the x86-64 userspace, for the research purpose.
The repository serves as more a reference for experts, rather than an out-of-the-box codebase for beginners. It is not straightforward to build and run the code directly, because the code is for specific software and hardware environment.
For example, the NVMe controller (nvme-ctl) is the userspace code for PetaLinux on Daisy/DaisyPlus OpenSSD. To run the code, one must configure the PetaLinux along with an FPGA image on Daisy/DaisyPlus OpenSSD.
If you have further questions and interests, please feel free to propose an issue or contact me via email yangzhe.ac AT outlook.com) for detailed steps. You can find my github at yzim.
Running λ-IO kernel does not need a specific hardware platform. You can do it in the following steps.
- Compile the kernel source code and boot with the kernel.
Download the Linux source code and modify it using
linux-lio.linux-liois based on Ubuntu 20.04 LTS, Linux 5.10.21 (official apt package by Ubuntu). We recommend you to try the same OS and kernel. If your OS and kernel differ from our settings, it needs extra efforts to fix the compatibility. - Compile the kernel module
lio-helperinkmodand insmod it. To run λ-IO in the kernel mode only, you can delete all the code of functionsdo_write_auto,do_write_dev,do_read_auto,do_auto, anddo_devinproc.c, along with the variables. - Compile
libbpfto a library. - Write and compile your computational logic with files in
ebpf/src/kernas examples. Therefore, you can get an eBPF program*.oto load. - Write and compile your application with files in
host-eval/src/eval-one.ccas an example. You can search the code for the mentioned functions below.- load the eBPF program using
kebpf_load_file(...). - open the
lio-helperbylioh_fd = open("/proc/lio-helper"). - set the loaded eBPF program using
ioctl(lioh_fd, LIOH_IOCTL_SET_KBPF, ...). - set the scheduling mode to kernel-only using
ioctl(lioh_fd, LIOH_IOCTL_SET_SCHED, K). - set the backend file using
set_back_file(lioh_fd, ...). - trigger λ-read/λ-write using
pread(lioh_fd, ...)orpwrite(lioh_fd, ...).
- load the eBPF program using
We provide the compiling Makefile or CMakeLists.txt in kmod, libbpf, ebpf, and host-eval.
We also provide a root CMakelists.txt in the root directory.
I recommend you to use the root CMakelists.txt because it compiles all the components and fixes issues such as linking the compiled library.
To compile them for the host, set option ONHOST to ON in all the CMakeLists.txt files (OFF to cross-compile for the device-side ARM).
If what you need is just an NVMe controller without λ-IO components, you can try device-hardware and nvme-ctl.
We provide a usable NVMe controller, which is ported and refactored from OpenExpress for Daisy/DaisyPlus OpenSSD.
It is much more efficient than the official one according to our experiments.