-
Notifications
You must be signed in to change notification settings - Fork 148
Add overwrite mode for bpf ring buffer #9404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Upstream branch: f3af62b |
ccc6c10
to
052554f
Compare
Upstream branch: 911c035 |
3f516bc
to
8a5a030
Compare
052554f
to
9008a3e
Compare
Upstream branch: 15a3b79 |
8a5a030
to
8a6abc9
Compare
9008a3e
to
8270ae7
Compare
Upstream branch: fa47913 |
8a6abc9
to
3d3a508
Compare
8270ae7
to
2a0e5cd
Compare
Upstream branch: fa47913 |
3d3a508
to
90f9108
Compare
2a0e5cd
to
2562e0e
Compare
Upstream branch: 9e293d4 |
90f9108
to
0593543
Compare
2562e0e
to
5583eb6
Compare
Upstream branch: 3e2b799 |
0593543
to
8b309e1
Compare
5583eb6
to
601ea2d
Compare
Upstream branch: c93c59b |
8b309e1
to
cbb13c3
Compare
601ea2d
to
7390c2c
Compare
Upstream branch: bf0c2a8 |
cbb13c3
to
aa3494f
Compare
7390c2c
to
3a736b5
Compare
Upstream branch: 2caa6b8 |
aa3494f
to
c492ebe
Compare
3a736b5
to
c210f22
Compare
Upstream branch: 0786654 |
c492ebe
to
03ebc7b
Compare
c210f22
to
2530e45
Compare
When the bpf ring buffer is full, new events can not be recorded util the consumer consumes some events to free space. This may cause critical events to be discarded, such as in fault diagnostic, where recent events are more critical than older ones. So add ovewrite mode for bpf ring buffer. In this mode, the new event overwrites the oldest event when the buffer is full. The scheme is as follows: 1. producer_pos tracks the next position to write new data. When there is enough free space, producer simply moves producer_pos forward to make space for the new event. 2. To avoid waiting for consumer to free space when the buffer is full, a new variable overwrite_pos is introduced for producer. overwrite_pos tracks the next event to be overwritten (the oldest event committed) in the buffer. producer moves it forward to discard the oldest events when the buffer is full. 3. pending_pos tracks the oldest event under committing. producer ensures producers_pos never passes pending_pos when making space for new events. So multiple producers never write to the same position at the same time. 4. producer wakes up consumer every half a round ahead to give it a chance to retrieve data. However, for an overwrite-mode ring buffer, users typically only cares about the ring buffer snapshot before a fault occurs. In this case, the producer should commit data with BPF_RB_NO_WAKEUP flag to avoid unnecessary wakeups. The performance data for overwrite mode will be provided in a follow-up patch that adds overwrite mode benchs. A sample of performance data for non-overwrite mode on an x86_64 and arm64 CPU, before and after this patch, is shown below. As we can see, no obvious performance regression occurs. - x86_64 (AMD EPYC 9654) Before: Ringbuf, multi-producer contention ================================== rb-libbpf nr_prod 1 13.218 ± 0.039M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 2 15.684 ± 0.015M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 3 7.771 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 4 6.281 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 8 2.842 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 12 2.001 ± 0.004M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 16 1.833 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 20 1.508 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 24 1.421 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 28 1.309 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 32 1.265 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 36 1.198 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 40 1.174 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 44 1.113 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 48 1.097 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 52 1.070 ± 0.002M/s (drops 0.000 ± 0.000M/s) After: Ringbuf, multi-producer contention ================================== rb-libbpf nr_prod 1 13.751 ± 0.673M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 2 15.592 ± 0.008M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 3 7.776 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 4 6.463 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 8 2.883 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 12 2.017 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 16 1.816 ± 0.004M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 20 1.512 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 24 1.396 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 28 1.303 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 32 1.267 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 36 1.210 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 40 1.181 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 44 1.136 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 48 1.090 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 52 1.091 ± 0.002M/s (drops 0.000 ± 0.000M/s) - arm64 (HiSilicon Kunpeng 920) Before: Ringbuf, multi-producer contention ================================== rb-libbpf nr_prod 1 11.602 ± 0.423M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 2 9.599 ± 0.007M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 3 6.669 ± 0.008M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 4 4.806 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 8 3.856 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 12 3.368 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 16 3.210 ± 0.007M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 20 3.003 ± 0.007M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 24 2.944 ± 0.007M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 28 2.863 ± 0.008M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 32 2.819 ± 0.007M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 36 2.887 ± 0.008M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 40 2.837 ± 0.008M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 44 2.787 ± 0.012M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 48 2.738 ± 0.010M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 52 2.700 ± 0.007M/s (drops 0.000 ± 0.000M/s) After: Ringbuf, multi-producer contention ================================== rb-libbpf nr_prod 1 11.614 ± 0.268M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 2 9.917 ± 0.007M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 3 6.920 ± 0.008M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 4 4.803 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 8 3.898 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 12 3.426 ± 0.008M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 16 3.320 ± 0.008M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 20 3.029 ± 0.013M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 24 3.068 ± 0.012M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 28 2.890 ± 0.009M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 32 2.950 ± 0.012M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 36 2.812 ± 0.006M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 40 2.834 ± 0.009M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 44 2.803 ± 0.010M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 48 2.766 ± 0.010M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 52 2.754 ± 0.009M/s (drops 0.000 ± 0.000M/s) Signed-off-by: Xu Kuohai <[email protected]>
In overwrite mode, the producer does not wait for the consumer, so the consumer is responsible for handling conflicts. An optimistic method is used to resolve the conflicts: the consumer first reads consumer_pos, producer_pos and overwrite_pos, then calculates a read window and copies data in the window from the ring buffer. After copying, it checks the positions to decide if the data in the copy window have been overwritten by be the producer. If so, it discards the copy and tries again. Once success, the consumer processes the events in the copy. Signed-off-by: Xu Kuohai <[email protected]>
Add test for overwiret mode ring buffer. Signed-off-by: Xu Kuohai <[email protected]>
Add overwrite mode bench for ring buffer. For reference, below are bench numbers collected from x86_64 and arm64. - x86_64 (AMD EPYC 9654) Ringbuf, multi-producer contention, overwrite mode ================================================== rb-libbpf nr_prod 1 14.970 ± 0.012M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 2 14.064 ± 0.007M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 3 7.493 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 4 6.575 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 8 3.696 ± 0.011M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 12 2.612 ± 0.012M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 16 2.335 ± 0.005M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 20 2.079 ± 0.005M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 24 1.965 ± 0.004M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 28 1.846 ± 0.004M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 32 1.790 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 36 1.735 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 40 1.701 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 44 1.669 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 48 1.749 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 52 1.709 ± 0.001M/s (drops 0.000 ± 0.000M/s) - arm64 (HiSilicon Kunpeng 920) Ringbuf, multi-producer contention, overwrite mode ================================================== rb-libbpf nr_prod 1 10.319 ± 0.231M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 2 9.219 ± 0.006M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 3 6.699 ± 0.013M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 4 4.608 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 8 3.905 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 12 3.282 ± 0.004M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 16 3.182 ± 0.008M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 20 3.029 ± 0.006M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 24 3.116 ± 0.004M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 28 2.869 ± 0.005M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 32 3.075 ± 0.010M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 36 2.795 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 40 2.947 ± 0.005M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 44 2.748 ± 0.006M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 48 2.767 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-libbpf nr_prod 52 2.858 ± 0.002M/s (drops 0.000 ± 0.000M/s) Signed-off-by: Xu Kuohai <[email protected]>
Upstream branch: dc0fe95 |
03ebc7b
to
cf02723
Compare
At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=988002 expired. Closing PR. |
Pull request for series with
subject: Add overwrite mode for bpf ring buffer
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=988002