The `cpuPressure` field generates CPU load on the targeted pod.
Containers achieve resource limitation (CPU, disk, memory) through cgroups. cgroups have the directory format `/sys/fs/cgroup/<kind>/<name>/`, and we can add a process to a cgroup by appending its PID to the `cgroup.procs` or the `tasks` file (depending on the use case). Docker containers get their own cgroup, as illustrated by PID 1873 below.
📖 More information on how cgroups work here.
The `/sys/fs/cgroup` directory of the host must be mounted in the injector pod at the `/mnt/cgroup` path for it to work.
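As a rough illustration, moving a process into a cgroup boils down to writing its PID into the cgroup's `tasks` (or `cgroup.procs`) file. The snippet below is a minimal Go sketch of that operation, assuming a cgroup v1 tree mounted at `/mnt/cgroup`; the `joinCgroup` helper and the example cgroup path are hypothetical, not the injector's actual code.

```go
// Minimal sketch of joining a target cgroup (cgroup v1), assuming the host
// cgroup tree is mounted at /mnt/cgroup. The cgroup name used in main() is
// a hypothetical example.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// joinCgroup appends the given PID to the tasks file of the target cgroup,
// e.g. /mnt/cgroup/cpu/<name>/tasks.
func joinCgroup(kind, name string, pid int) error {
	tasksFile := filepath.Join("/mnt/cgroup", kind, name, "tasks")

	f, err := os.OpenFile(tasksFile, os.O_WRONLY|os.O_APPEND, 0)
	if err != nil {
		return fmt.Errorf("can't open %s: %w", tasksFile, err)
	}
	defer f.Close()

	// writing the PID into the file moves the task into the cgroup
	if _, err := fmt.Fprintf(f, "%d", pid); err != nil {
		return fmt.Errorf("can't write PID to %s: %w", tasksFile, err)
	}

	return nil
}

func main() {
	// join the current process to a (hypothetical) target cpu cgroup
	if err := joinCgroup("cpu", "kubepods/pod123/abc", os.Getpid()); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```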
When the injector pod starts:
- It parses the `cpuset.cpus` file (located in the target `cpuset` cgroup) to retrieve the cores allocated to the target processes.
- It creates one goroutine per allocated core. Each goroutine is locked on the thread it is running on. By doing so, it forces the Go runtime scheduler to create one thread per locked goroutine.
- Each goroutine joins the target `cpu` and `cpuset` cgroups.
  - Joining the `cpuset` cgroup is important to have both the same number of allocated cores as the target and the same allocated cores, so we ensure that the goroutine threads will be scheduled on the same cores as the target processes.
  - Joining the `cpu` cgroup ensures the goroutine threads are accounted against the target's CPU cgroup, so they are subject to the same CPU shares and limits as the target processes.
- Each goroutine renices itself to the highest priority (`-20`) so the Linux scheduler will always give it priority to consume CPU time over other running processes.
- Each goroutine starts an infinite loop to consume as much CPU time as possible (see the sketch after this list).
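The following Go sketch illustrates the per-core busy-loop logic described above under a few assumptions: a cgroup v1 layout mounted at `/mnt/cgroup`, a hypothetical target cgroup path, and root privileges (renicing to `-20` requires `CAP_SYS_NICE`). Joining the target `cpu` and `cpuset` cgroups is omitted for brevity, and this is not the injector's actual implementation.

```go
// Rough sketch of the per-core CPU pressure loop: parse cpuset.cpus, spawn one
// locked goroutine per core, renice each thread to -20, and spin forever.
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
	"syscall"
)

// parseCPUSet expands a cpuset.cpus value such as "0-2,4" into [0 1 2 4].
func parseCPUSet(s string) ([]int, error) {
	var cores []int
	for _, part := range strings.Split(strings.TrimSpace(s), ",") {
		if bounds := strings.SplitN(part, "-", 2); len(bounds) == 2 {
			lo, err := strconv.Atoi(bounds[0])
			if err != nil {
				return nil, err
			}
			hi, err := strconv.Atoi(bounds[1])
			if err != nil {
				return nil, err
			}
			for c := lo; c <= hi; c++ {
				cores = append(cores, c)
			}
		} else {
			c, err := strconv.Atoi(part)
			if err != nil {
				return nil, err
			}
			cores = append(cores, c)
		}
	}
	return cores, nil
}

func main() {
	// read the allocated cores from the target cpuset cgroup (hypothetical path)
	raw, err := os.ReadFile("/mnt/cgroup/cpuset/kubepods/pod123/abc/cpuset.cpus")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	cores, err := parseCPUSet(string(raw))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// one goroutine per allocated core
	for range cores {
		go func() {
			// lock the goroutine on its thread so the Go runtime scheduler
			// creates one OS thread per locked goroutine
			runtime.LockOSThread()

			// renice the thread to the highest priority (requires CAP_SYS_NICE)
			if err := syscall.Setpriority(syscall.PRIO_PROCESS, syscall.Gettid(), -20); err != nil {
				fmt.Fprintln(os.Stderr, err)
			}

			// infinite loop to consume as much CPU time as possible
			for {
			}
		}()
	}

	// block forever while the goroutines generate load
	select {}
}
```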
In a CPU disruption, the injector PID is moved to the target's CPU cgroup, but the injector container keeps its own PID namespace. Commands like `top` or `htop` won't show the injector process running simply because they can't see its PID, even though it is consuming the cgroup's CPU. This can be confirmed using the `top` Linux command:
- If you run `top` from within the targeted pod, you won't see the CPU usage increasing nor the injector process running.
- If you run `top` from within the injector pod, you'll see the CPU usage increasing even though it is not consuming this container's CPU resources.
- If you run `top` from the node where the pod is running, you will also see the injector process eating CPU.
This is because those tools mostly rely on the processes they can see to display resource usage. This can also be confirmed with benchmarking tools such as `sysbench` running in the different containers.
Example `sysbench` run without the CPU pressure applied:
```
root@demo-curl-8589cffd98-ccjqg:/# sysbench --test=cpu run
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:  1177.67

General statistics:
    total time:                          10.0004s
    total number of events:              11780

Latency (ms):
         min:                                  0.70
         avg:                                  0.85
         max:                                 18.00
         95th percentile:                      1.10
         sum:                               9975.80

Threads fairness:
    events (avg/stddev):           11780.0000/0.00
    execution time (avg/stddev):   9.9758/0.00
```
Example `sysbench` run with the CPU pressure applied:
```
root@demo-curl-8589cffd98-ccjqg:/# sysbench --test=cpu run
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:   115.48

General statistics:
    total time:                          10.5973s
    total number of events:              1224

Latency (ms):
         min:                                  0.72
         avg:                                  8.65
         max:                                906.92
         95th percentile:                     74.46
         sum:                              10592.69

Threads fairness:
    events (avg/stddev):           1224.0000/0.00
    execution time (avg/stddev):   10.5927/0.00
```