net-timestamp: bpf extension to equip applications transparently #4926

Open
wants to merge 12 commits into
base: bpf-next_base

kernel-patches-daemon-bpf-rc[bot]

Pull request for series with
subject: net-timestamp: bpf extension to equip applications transparently
version: 7
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=928755

@kernel-patches-daemon-bpf-rc

Upstream branch: 0fc5ddd
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=928755
version: 7

@kernel-patches-daemon-bpf-rc

Upstream branch: c03320a
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=928755
version: 7

@kernel-patches-daemon-bpf-rc

Upstream branch: 57e71f8
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=928755
version: 7

@kernel-patches-daemon-bpf-rc

Upstream branch: 9af5c78
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=928755
version: 7

@kernel-patches-daemon-bpf-rc

Upstream branch: 5b67071
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=928755
version: 7

@kernel-patches-daemon-bpf-rc

Upstream branch: 03f3aa4
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=928755
version: 7

@kernel-patches-daemon-bpf-rc

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=928755 expired. Closing PR.

@kernel-patches-daemon-bpf-rc

Upstream branch: 0abff46
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=930537
version: 8

@kernel-patches-daemon-bpf-rc

Upstream branch: 003be25
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=930537
version: 8

@kernel-patches-daemon-bpf-rc

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=930537 expired. Closing PR.

Users can write the following code to enable the bpf extension:
int flags = SK_BPF_CB_TX_TIMESTAMPING;
int opts = SK_BPF_CB_FLAGS;
bpf_setsockopt(skops, SOL_SOCKET, opts, &flags, sizeof(flags));

Signed-off-by: Jason Xing <[email protected]>
Four callback points that report information to user space will be
introduced later, building on this patch.

Regarding the skb initialization here, users can follow the three
steps below to fetch the shared info from the exported skb in the bpf
prog:
1. skops_kern = bpf_cast_to_kern_ctx(skops);
2. skb = skops_kern->skb;
3. shinfo = bpf_core_cast(skb->head + skb->end, struct skb_shared_info);

More details can be seen in the last selftest patch of the series.
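
The pointer arithmetic in step 3 can be illustrated outside the kernel. The following is a minimal userspace sketch, not the selftest code: the `mock_` structs are hypothetical stand-ins for `struct sk_buff` and `struct skb_shared_info`, and it assumes `end` is an offset from `head`, as the expression `skb->head + skb->end` above implies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct skb_shared_info; fields are illustrative. */
struct mock_shared_info {
	uint8_t  tx_flags;
	uint32_t tskey;
};

/* Hypothetical stand-in for struct sk_buff, reduced to the two fields used. */
struct mock_skb {
	unsigned char *head; /* start of the data buffer */
	unsigned int   end;  /* assumed: offset of the buffer end from head */
};

/* Mirrors step 3: the shared info is located at head + end. */
static struct mock_shared_info *mock_shinfo(const struct mock_skb *skb)
{
	return (struct mock_shared_info *)(skb->head + skb->end);
}
```

In a real bpf prog the cast would go through bpf_core_cast() as shown in step 3; the plain C cast here only demonstrates where the shared info sits relative to the buffer.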

Signed-off-by: Jason Xing <[email protected]>
The "is_locked_tcp_sock" flag is added to indicate that the callback
site has a tcp_sock locked.

Applying the new member is_locked_tcp_sock in the existing callbacks
where is_fullsock is set to 1 stops a UDP socket from accessing struct
tcp_sock, and stops a TCP socket that is not protected by the sk lock
from doing the same. Otherwise such an access could be catastrophic
and lead to a panic.

To keep it simple, instead of distinguishing between read and write
access, users are denied all read/write access to the tcp_sock
through the older bpf_sock_ops ctx. The new timestamping callbacks
can use newer helpers to read everything from a sk (e.g. bpf_core_cast),
so nothing is lost.

Signed-off-by: Jason Xing <[email protected]>
Because of the potential invalid access issues, calling
bpf_sock_ops_setsockopt/getsockopt, bpf_sock_ops_cb_flags_set,
and bpf_sock_ops_load_hdr_opt in the new timestamping
callbacks returns -EOPNOTSUPP.

This also prevents a UDP socket from accessing TCP fields in
the bpf extension for SO_TIMESTAMPING, for the same reason.
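
The gating described above can be sketched in plain C. This is a hedged illustration rather than the kernel implementation: `mock_sock_ops_kern` and `mock_sock_ops_helper` are hypothetical names, and the sketch only assumes that a helper refuses to run when the callback site does not hold a locked tcp_sock.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced view of the sock_ops kernel context. */
struct mock_sock_ops_kern {
	bool is_fullsock;
	bool is_locked_tcp_sock; /* set only where a tcp_sock is locked */
};

/*
 * Hypothetical helper body: refuse to touch TCP state unless the
 * callback site guarantees a locked tcp_sock, as the text describes.
 */
static int mock_sock_ops_helper(const struct mock_sock_ops_kern *ctx)
{
	if (!ctx->is_locked_tcp_sock)
		return -EOPNOTSUPP;

	/* The TCP-specific work would happen here. */
	return 0;
}
```

A timestamping callback, including one fired for a UDP socket, would leave is_locked_tcp_sock unset and therefore receive -EOPNOTSUPP from such a helper.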

Signed-off-by: Jason Xing <[email protected]>
No functional changes here. Only add a test to check whether orig_skb
matches the application's use of SO_TIMESTAMPING. This prepares for
supporting the two modes in parallel later in this series.

Signed-off-by: Jason Xing <[email protected]>
Support the SCM_TSTAMP_SCHED case. Introduce SKBTX_BPF as an
indicator of whether the skb should be traced by the bpf prog.

Signed-off-by: Jason Xing <[email protected]>
Support the sw SCM_TSTAMP_SND case. Users then get the software
timestamp when the driver is about to send the skb. The hardware
timestamp will be supported later.

Signed-off-by: Jason Xing <[email protected]>
Support the hw SCM_TSTAMP_SND case. The bpf program can then fetch
the hwstamp from the skb directly.

To avoid changing so many callers using SKBTX_HW_TSTAMP from drivers,
replace SKBTX_HW_TSTAMP with SKBTX_HW_TSTAMP_NOBPF.

Signed-off-by: Jason Xing <[email protected]>
Support the ACK timestamp case. Extend txstamp_ack to two bits:
1 stands for SO_TIMESTAMPING mode, 2 for the bpf extension. The
latter will be used later.
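
The two-bit encoding can be sketched as a bitfield with one bit per mode. This is a minimal illustration under assumptions: the `MOCK_` constants, `mock_tcp_skb_cb`, and the two helpers are hypothetical names; only the values 1 and 2 and the field name txstamp_ack come from the description above.

```c
/* Bit values as described: 1 = SO_TIMESTAMPING mode, 2 = bpf extension. */
enum {
	MOCK_TSTAMP_ACK_SK  = 1,
	MOCK_TSTAMP_ACK_BPF = 2,
};

/* Hypothetical, reduced control block holding only the two-bit field. */
struct mock_tcp_skb_cb {
	unsigned int txstamp_ack : 2;
};

/* Request an ACK timestamp for one mode without disturbing the other. */
static void mock_request_ack(struct mock_tcp_skb_cb *cb, unsigned int mode)
{
	cb->txstamp_ack |= (mode & 3u);
}

/* Report whether the given mode asked for an ACK timestamp. */
static int mock_ack_wanted(const struct mock_tcp_skb_cb *cb, unsigned int mode)
{
	return (cb->txstamp_ack & mode) != 0;
}
```

Keeping one bit per mode lets the application's SO_TIMESTAMPING request and the bpf extension's request coexist on the same skb.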

Signed-off-by: Jason Xing <[email protected]>
Introduce the callback to correlate the tcp_sendmsg timestamp with
other points, like SND/SW/ACK. For instance, a bpf prog can trace the
beginning of tcp_sendmsg_locked() and store the sendmsg timestamp in
bpf_sk_storage, so that in tcp_tx_timestamp() the timestamp can be
correlated with the tskey, which can be found at the other sending
points. More details can be found in the selftest.

Signed-off-by: Jason Xing <[email protected]>
Use the __bpf_kfunc feature to let the bpf prog dynamically and
selectively sample/track the skb. For example, the bpf prog can limit
tracking to X packets and then stop, instead of tracing every sendmsg
of the matched flow indefinitely.
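
The sampling decision itself is simple enough to sketch in plain C. The kfunc that actually enables tracking on an skb is not named in this message, so the sketch below covers only the budget logic; `mock_sampler` and `mock_should_track` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-flow state: how many more packets may be tracked. */
struct mock_sampler {
	uint32_t budget;
};

/*
 * Returns true while budget remains and consumes one unit per call.
 * A bpf prog would enable tracking on the skb only when this is true.
 */
static bool mock_should_track(struct mock_sampler *s)
{
	if (s->budget == 0)
		return false;

	s->budget--;
	return true;
}
```

In a bpf prog this state would more likely live in a map or in bpf_sk_storage than in a local struct; that placement is an assumption and not something the message specifies.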

Signed-off-by: Jason Xing <[email protected]>
…eature

The bpf prog calculates a few latency deltas between the tx points
that the SO_TIMESTAMPING feature already implements. It can be used
in the real world to diagnose behaviour in the tx path.

Also, check for safety issues by invoking a few bpf calls in
bpf_test_access_bpf_calls().

There remain a few realistic caveats[1][2] to highlight:
1. In general a packet may pass through multiple qdiscs, for instance
with bonding or tunnel virtual devices in the egress path.
2. Packets may be resent, in which case an ACK might precede a repeat
SCHED and SND.
3. Erroneous or malicious peers may also just never send an ACK.
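
A delta calculation that tolerates the three cases above could look like the following userspace sketch. It is not the selftest code: all `mock_`/`MOCK_` names are hypothetical, and keeping only the first timestamp seen per point is one possible policy chosen here for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

enum mock_tx_point {
	MOCK_SENDMSG,
	MOCK_SCHED,
	MOCK_SND,
	MOCK_ACK,
	MOCK_POINTS
};

/* Hypothetical per-tskey record of when each tx point was reached. */
struct mock_tx_record {
	uint64_t ts_ns[MOCK_POINTS];
	bool     seen[MOCK_POINTS];
};

/* Keep the first timestamp per point; repeats (extra qdiscs, resends) are ignored. */
static void mock_record(struct mock_tx_record *r, enum mock_tx_point p,
			uint64_t now_ns)
{
	if (r->seen[p])
		return;
	r->seen[p] = true;
	r->ts_ns[p] = now_ns;
}

/*
 * Compute to - from. Fails when either point is missing (for example the
 * peer never ACKs) or when the order is inverted (for example an ACK that
 * precedes a repeated SCHED/SND), so no bogus latency is reported.
 */
static bool mock_delta(const struct mock_tx_record *r, enum mock_tx_point from,
		       enum mock_tx_point to, uint64_t *delta_ns)
{
	if (!r->seen[from] || !r->seen[to])
		return false;
	if (r->ts_ns[to] < r->ts_ns[from])
		return false;
	*delta_ns = r->ts_ns[to] - r->ts_ns[from];
	return true;
}
```

Whether the first or the last occurrence of a repeated point is the more useful one depends on what is being diagnosed; the sketch only shows that the choice has to be made explicitly.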

[1]: https://lore.kernel.org/all/[email protected]/
[2]: https://lore.kernel.org/all/[email protected]/

Signed-off-by: Jason Xing <[email protected]>
@kernel-patches-daemon-bpf-rc

Upstream branch: 9b6cdaf
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=931879
version: 9
