Workload Interference Detector uses a combination of hardware events and eBPF to capture a holistic signature of a workload's performance at very low overhead.
- instruction efficiency
  - cycles
  - instructions
  - cycles per instruction
- disk IO
  - local bandwidth (MB/s)
  - remote bandwidth (MB/s)
  - disk reads (MB/s)
  - disk writes (MB/s)
- network IO
  - network transmitted (MB/s)
  - network received (MB/s)
- cache
  - L1 instruction misses per instruction
  - L1 data hit ratio
  - L1 data miss ratio
  - L2 miss ratio
  - L3 miss ratio
- scheduling
  - scheduled count
  - average queue length
  - average queue latency (ms)
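
Several of the metrics above are ratios or rates derived from raw counters. The sketch below shows how such values are typically computed. The function names are hypothetical and not taken from this repository, and it assumes 1 MB = 10^6 bytes, which may differ from what the tool reports.

```python
def cycles_per_instruction(cycles: int, instructions: int) -> float:
    """CPI: average number of cycles spent per retired instruction."""
    if instructions == 0:
        return 0.0  # no instructions retired in this interval
    return cycles / instructions


def miss_ratio(misses: int, references: int) -> float:
    """Fraction of cache references that missed (0.0 to 1.0)."""
    if references == 0:
        return 0.0  # no cache activity in this interval
    return misses / references


def throughput_mb_per_s(num_bytes: int, interval_s: float) -> float:
    """Convert a byte count over an interval into MB/s (assumes 1 MB = 1e6 bytes)."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return num_bytes / 1e6 / interval_s


if __name__ == "__main__":
    print(cycles_per_instruction(2000, 1000))
    print(miss_ratio(25, 100))
    print(throughput_mb_per_s(5_000_000, 2.0))
```

A rising CPI or miss ratio for the same workload is the kind of shift that a signature comparison can pick up.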
- Linux Perf
- BCC compiled from source
- Python dependencies: pip install -r requirements.txt
- Access to PMU
  - Bare-metal
  - VM with vPMU exposed (uncore metrics like disk IO will be zero)
- Intel Xeon chip
  - Skylake
  - Cascade Lake
  - Ice Lake
  - Sapphire Rapids
- Python
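
A quick preflight check can confirm that some of the requirements above are in place before running the monitors. This is a minimal sketch, not part of the repository: `check_requirements` and the `min_python` default are hypothetical (this README does not state a minimum Python version), and the check does not verify PMU access or the CPU model.

```python
import importlib.util
import shutil
import sys


def check_requirements(min_python=(3, 6)):
    """Return a dict mapping each requirement name to True or False.

    min_python is an assumed placeholder, not a documented minimum.
    """
    return {
        # perf binary is on PATH
        "perf": shutil.which("perf") is not None,
        # BCC Python bindings are importable
        "bcc": importlib.util.find_spec("bcc") is not None,
        # interpreter is at least the assumed minimum version
        "python": sys.version_info >= min_python,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```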
- Monitor processes
sudo python3 procmon.py
- Monitor containers (can also export to CloudWatch)
sudo python3 cmon.py
- Detect process or container interference. The output lists the workloads that likely caused the performance degradation.
# process
sudo python3 NN_detect.py --pid <process-pid> --ref_signature <process's reference signature> --distance_ratio 0.15
# container
sudo python3 NN_detect.py --cid <container id> --ref_signature <container's reference signature> --distance_ratio 0.15
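
This README does not describe how NN_detect.py uses `--distance_ratio`. As an illustration only, the sketch below shows one plausible reading: compare the current signature with the reference signature and flag degradation when their relative Euclidean distance exceeds the ratio. The function names, the metric keys, and the distance formula are all assumptions, not the tool's actual implementation.

```python
import math


def relative_distance(current, reference):
    """Euclidean distance between two signatures, divided by the reference norm.

    Both arguments are dicts mapping a metric name to a number.
    This formula is an assumption made for illustration.
    """
    keys = sorted(reference)
    diff = math.sqrt(sum((current[k] - reference[k]) ** 2 for k in keys))
    norm = math.sqrt(sum(reference[k] ** 2 for k in keys))
    if norm == 0:
        return 0.0 if diff == 0 else math.inf
    return diff / norm


def is_degraded(current, reference, distance_ratio=0.15):
    """True when the current signature drifts beyond the allowed ratio."""
    return relative_distance(current, reference) > distance_ratio


if __name__ == "__main__":
    reference = {"cpi": 3.0, "l2_miss_ratio": 4.0}  # made-up values
    current = {"cpi": 6.0, "l2_miss_ratio": 8.0}    # made-up values
    print(is_degraded(current, reference))
```

In a sketch like this, metrics with large magnitudes dominate the distance, so values measured in different units would need to be scaled to a common range before comparison.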
Interference Detector was developed using the following as references:
- github.com/iovisor/bcc/blob/master/tools/llcstat.py (Apache 2.0)
- github.com/iovisor/bcc/blob/master/tools/tcptop.py (Apache 2.0)
- github.com/iovisor/bcc/blob/master/examples/tracing/disksnoop.py (Apache 2.0)
- github.com/iovisor/bcc/blob/master/tools/runqlen.py (Apache 2.0)
- github.com/iovisor/bcc/blob/master/tools/runqlat.py (Apache 2.0)