Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tcp_bpf: Fix the sk_mem_uncharge logic in tcp_bpf_sendmsg #7903

Open
wants to merge 2 commits into
base: bpf_base
Choose a base branch
from

Conversation

kernel-patches-daemon-bpf[bot]
Copy link

Pull request for series with
subject: tcp_bpf: Fix the sk_mem_uncharge logic in tcp_bpf_sendmsg
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 2aa587f
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 19039f2
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 98cd619
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: db123e4
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 5ac9b4e
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 5ac9b4e
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 42f7652
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: a552e2e
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 740be3b
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 2ce9abd
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 1fb3158
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 1fb3158
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: d7f214a
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: d0b98f6
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 053b212
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: c40dd8c
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

Zijian Zhang added 2 commits November 1, 2024 00:28
…n test_sockmap

Add this to more comprehensively test the socket memory accounting logic
in the __SK_REDIRECT and __SK_DROP cases of tcp_bpf_sendmsg. We don't have
test when apply_bytes are not zero in test_txmsg_redir_wait_sndmem.
test_send_large has opt->rate=2, it will invoke sendmsg two times.
Specifically, the first sendmsg will trigger the case where the ret value
of tcp_bpf_sendmsg_redir is less than 0; while the second sendmsg happens
after the 3 seconds timeout, and it will trigger __SK_DROP because socket
c2 has been removed from the sockmap/hash.

Signed-off-by: Zijian Zhang <[email protected]>
The current sk memory accounting logic in __SK_REDIRECT is pre-uncharging
tosend bytes, which is either msg->sg.size or a smaller value apply_bytes.
Potential problems with this strategy are as follows:
- If the actual sent bytes are smaller than tosend, we need to charge some
bytes back, as in line 487, which is okay but seems not clean.
- When tosend is set to apply_bytes, as in line 417, and (ret < 0), we may
miss uncharging (msg->sg.size - apply_bytes) bytes.

415 tosend = msg->sg.size;
416 if (psock->apply_bytes && psock->apply_bytes < tosend)
417   tosend = psock->apply_bytes;
...
443 sk_msg_return(sk, msg, tosend);
444 release_sock(sk);
446 origsize = msg->sg.size;
447 ret = tcp_bpf_sendmsg_redir(sk_redir, redir_ingress,
448                             msg, tosend, flags);
449 sent = origsize - msg->sg.size;
...
454 lock_sock(sk);
455 if (unlikely(ret < 0)) {
456   int free = sk_msg_free_nocharge(sk, msg);
458   if (!cork)
459     *copied -= free;
460 }
...
487 if (eval == __SK_REDIRECT)
488   sk_mem_charge(sk, tosend - sent);

When running the selftest test_txmsg_redir_wait_sndmem with txmsg_apply,
the following warning will be reported,
------------[ cut here ]------------
WARNING: CPU: 6 PID: 57 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x190/0x1a0
Modules linked in:
CPU: 6 UID: 0 PID: 57 Comm: kworker/6:0 Not tainted 6.12.0-rc1.bm.1-amd64+ #43
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Workqueue: events sk_psock_destroy
RIP: 0010:inet_sock_destruct+0x190/0x1a0
RSP: 0018:ffffad0a8021fe08 EFLAGS: 00010206
RAX: 0000000000000011 RBX: ffff9aab4475b900 RCX: ffff9aab481a0800
RDX: 0000000000000303 RSI: 0000000000000011 RDI: ffff9aab4475b900
RBP: ffff9aab4475b990 R08: 0000000000000000 R09: ffff9aab40050ec0
R10: 0000000000000000 R11: ffff9aae6fdb1d01 R12: ffff9aab49c60400
R13: ffff9aab49c60598 R14: ffff9aab49c60598 R15: dead000000000100
FS:  0000000000000000(0000) GS:ffff9aae6fd80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffec7e47bd8 CR3: 00000001a1a1c004 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
? __warn+0x89/0x130
? inet_sock_destruct+0x190/0x1a0
? report_bug+0xfc/0x1e0
? handle_bug+0x5c/0xa0
? exc_invalid_op+0x17/0x70
? asm_exc_invalid_op+0x1a/0x20
? inet_sock_destruct+0x190/0x1a0
__sk_destruct+0x25/0x220
sk_psock_destroy+0x2b2/0x310
process_scheduled_works+0xa3/0x3e0
worker_thread+0x117/0x240
? __pfx_worker_thread+0x10/0x10
kthread+0xcf/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x31/0x40
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
---[ end trace 0000000000000000 ]---

In __SK_REDIRECT, a more concise way is delaying the uncharging after sent
bytes are finalized, and uncharge this value. When (ret < 0), we shall
invoke sk_msg_free.

Same thing happens in case __SK_DROP, when tosend is set to apply_bytes,
we may miss uncharging (msg->sg.size - apply_bytes) bytes. The same
warning will be reported in selftest.

468 case __SK_DROP:
469 default:
470 sk_msg_free_partial(sk, msg, tosend);
471 sk_msg_apply_bytes(psock, tosend);
472 *copied -= (tosend + delta);
473 return -EACCES;

So instead of sk_msg_free_partial we can do sk_msg_free here.

Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface")
Fixes: 8ec95b9 ("bpf, sockmap: Fix the sk->sk_forward_alloc warning of sk_stream_kill_queues")
Signed-off-by: Zijian Zhang <[email protected]>
@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 6c52d4d
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=899984
version: 1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0 participants