-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tcp_bpf: Fix the sk_mem_uncharge logic in tcp_bpf_sendmsg #7903
base: bpf_base
Are you sure you want to change the base?
Conversation
Upstream branch: 2aa587f |
086ca61
to
0e95e1e
Compare
Upstream branch: 19039f2 |
e482815
to
b6190a9
Compare
0e95e1e
to
2a4bf0e
Compare
Upstream branch: 98cd619 |
b6190a9
to
35642bb
Compare
2a4bf0e
to
b3e861b
Compare
Upstream branch: db123e4 |
35642bb
to
1ef903d
Compare
b3e861b
to
5dede3a
Compare
Upstream branch: 5ac9b4e |
1ef903d
to
1c695bb
Compare
5dede3a
to
6d81e01
Compare
Upstream branch: 5ac9b4e |
1c695bb
to
cce4d4e
Compare
6d81e01
to
fb2cb10
Compare
Upstream branch: 42f7652 |
cce4d4e
to
b672aa4
Compare
fb2cb10
to
fb92faf
Compare
Upstream branch: a552e2e |
b672aa4
to
e5aff05
Compare
aa7ea9e
to
36504f8
Compare
e17c729
to
914ffab
Compare
Upstream branch: 740be3b |
36504f8
to
b7dba07
Compare
914ffab
to
5b50b7b
Compare
Upstream branch: 2ce9abd |
b7dba07
to
ba5492c
Compare
5b50b7b
to
3f232da
Compare
Upstream branch: 1fb3158 |
ba5492c
to
12cb737
Compare
3f232da
to
2d2d8c7
Compare
Upstream branch: 1fb3158 |
12cb737
to
41d606d
Compare
2d2d8c7
to
3a0bb56
Compare
Upstream branch: d7f214a |
41d606d
to
6a2bc45
Compare
3a0bb56
to
e95f8ee
Compare
Upstream branch: d0b98f6 |
6a2bc45
to
6a981c0
Compare
e95f8ee
to
ba80bc5
Compare
Upstream branch: 053b212 |
6a981c0
to
0f6219c
Compare
ba80bc5
to
7098520
Compare
Upstream branch: c40dd8c |
0f6219c
to
da8e63f
Compare
7098520
to
7efdac4
Compare
…n test_sockmap Add this to more comprehensively test the socket memory accounting logic in the __SK_REDIRECT and __SK_DROP cases of tcp_bpf_sendmsg. We don't have test when apply_bytes are not zero in test_txmsg_redir_wait_sndmem. test_send_large has opt->rate=2, it will invoke sendmsg two times. Specifically, the first sendmsg will trigger the case where the ret value of tcp_bpf_sendmsg_redir is less than 0; while the second sendmsg happens after the 3 seconds timeout, and it will trigger __SK_DROP because socket c2 has been removed from the sockmap/hash. Signed-off-by: Zijian Zhang <[email protected]>
The current sk memory accounting logic in __SK_REDIRECT is pre-uncharging tosend bytes, which is either msg->sg.size or a smaller value apply_bytes. Potential problems with this strategy are as follows: - If the actual sent bytes are smaller than tosend, we need to charge some bytes back, as in line 487, which is okay but seems not clean. - When tosend is set to apply_bytes, as in line 417, and (ret < 0), we may miss uncharging (msg->sg.size - apply_bytes) bytes. 415 tosend = msg->sg.size; 416 if (psock->apply_bytes && psock->apply_bytes < tosend) 417 tosend = psock->apply_bytes; ... 443 sk_msg_return(sk, msg, tosend); 444 release_sock(sk); 446 origsize = msg->sg.size; 447 ret = tcp_bpf_sendmsg_redir(sk_redir, redir_ingress, 448 msg, tosend, flags); 449 sent = origsize - msg->sg.size; ... 454 lock_sock(sk); 455 if (unlikely(ret < 0)) { 456 int free = sk_msg_free_nocharge(sk, msg); 458 if (!cork) 459 *copied -= free; 460 } ... 487 if (eval == __SK_REDIRECT) 488 sk_mem_charge(sk, tosend - sent); When running the selftest test_txmsg_redir_wait_sndmem with txmsg_apply, the following warning will be reported, ------------[ cut here ]------------ WARNING: CPU: 6 PID: 57 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x190/0x1a0 Modules linked in: CPU: 6 UID: 0 PID: 57 Comm: kworker/6:0 Not tainted 6.12.0-rc1.bm.1-amd64+ #43 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 Workqueue: events sk_psock_destroy RIP: 0010:inet_sock_destruct+0x190/0x1a0 RSP: 0018:ffffad0a8021fe08 EFLAGS: 00010206 RAX: 0000000000000011 RBX: ffff9aab4475b900 RCX: ffff9aab481a0800 RDX: 0000000000000303 RSI: 0000000000000011 RDI: ffff9aab4475b900 RBP: ffff9aab4475b990 R08: 0000000000000000 R09: ffff9aab40050ec0 R10: 0000000000000000 R11: ffff9aae6fdb1d01 R12: ffff9aab49c60400 R13: ffff9aab49c60598 R14: ffff9aab49c60598 R15: dead000000000100 FS: 0000000000000000(0000) GS:ffff9aae6fd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffec7e47bd8 CR3: 00000001a1a1c004 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __warn+0x89/0x130 ? inet_sock_destruct+0x190/0x1a0 ? report_bug+0xfc/0x1e0 ? handle_bug+0x5c/0xa0 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? inet_sock_destruct+0x190/0x1a0 __sk_destruct+0x25/0x220 sk_psock_destroy+0x2b2/0x310 process_scheduled_works+0xa3/0x3e0 worker_thread+0x117/0x240 ? __pfx_worker_thread+0x10/0x10 kthread+0xcf/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> ---[ end trace 0000000000000000 ]--- In __SK_REDIRECT, a more concise way is delaying the uncharging after sent bytes are finalized, and uncharge this value. When (ret < 0), we shall invoke sk_msg_free. Same thing happens in case __SK_DROP, when tosend is set to apply_bytes, we may miss uncharging (msg->sg.size - apply_bytes) bytes. The same warning will be reported in selftest. 468 case __SK_DROP: 469 default: 470 sk_msg_free_partial(sk, msg, tosend); 471 sk_msg_apply_bytes(psock, tosend); 472 *copied -= (tosend + delta); 473 return -EACCES; So instead of sk_msg_free_partial we can do sk_msg_free here. Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Fixes: 8ec95b9 ("bpf, sockmap: Fix the sk->sk_forward_alloc warning of sk_stream_kill_queues") Signed-off-by: Zijian Zhang <[email protected]>
Upstream branch: 6c52d4d |
da8e63f
to
0dace27
Compare
Pull request for series with
subject: tcp_bpf: Fix the sk_mem_uncharge logic in tcp_bpf_sendmsg
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984