Skip to content

netdev CI testing #6666

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1,661 commits into
base: bpf-next_base
Choose a base branch
from
Open

Conversation

kuba-moo
Copy link
Contributor

Reusable PR for hooking netdev CI to BPF testing.

@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 4f22ee0 to 8a9a8e0 Compare March 28, 2024 04:46
@kuba-moo kuba-moo force-pushed the to-test branch 11 times, most recently from 64c403f to 8da1f58 Compare March 29, 2024 00:01
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 78ebb17 to 9325308 Compare March 29, 2024 02:14
@kuba-moo kuba-moo force-pushed the to-test branch 6 times, most recently from c8c7b2f to a71aae6 Compare March 29, 2024 18:01
@kuba-moo kuba-moo force-pushed the to-test branch 2 times, most recently from d8feb00 to b16a6b9 Compare March 30, 2024 00:01
@kuba-moo kuba-moo force-pushed the to-test branch 2 times, most recently from 4164329 to c5cecb3 Compare March 30, 2024 06:00
Feng Liu and others added 29 commits July 22, 2025 20:00
Underneath "TIS Config" tag expose TIS diagnostic information.
Expose the tisn of each TC under each lag port.

$ sudo devlink health diagnose auxiliary/mlx5_core.eth.2/131072 reporter tx
......
  TIS Config:
      lag port: 0 tc: 0 tisn: 0
      lag port: 1 tc: 0 tisn: 8
......

Signed-off-by: Feng Liu <[email protected]>
Reviewed-by: Aya Levin <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
TCA_MQPRIO_TC_ENTRY_INDEX is validated using
NLA_POLICY_MAX(NLA_U32, TC_QOPT_MAX_QUEUE), which allows the value
TC_QOPT_MAX_QUEUE (16). This leads to a 4-byte out-of-bounds stack write in
the fp[] array, which only has room for 16 elements (0–15).

Fix this by changing the policy to allow only up to TC_QOPT_MAX_QUEUE - 1.

Fixes: f62af20 ("net/sched: mqprio: allow per-TC user input of FP adminStatus")
Reported-by: Maher Azzouzi <[email protected]>
Signed-off-by: Maher Azzouzi <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
The s390x ISM device data sheet clearly states that only one
request-response sequence is allowable per ISM function at any point in
time.  Unfortunately as of today the s390/ism driver in Linux does not
honor that requirement. This patch aims to rectify that.

This problem was discovered based on Aliaksei's bug report which states
that for certain workloads the ISM functions end up entering error state
(with PEC 2 as seen from the logs) after a while and as a consequence
connections handled by the respective function break, and for future
connection requests the ISM device is not considered -- given it is in a
dysfunctional state. During further debugging PEC 3A was observed as
well.

A kernel message like
[ 1211.244319] zpci: 061a:00:00.0: Event 0x2 reports an error for PCI function 0x61a
is a reliable indicator of the stated function entering error state
with PEC 2. Let me also point out that a kernel message like
[ 1211.244325] zpci: 061a:00:00.0: The ism driver bound to the device does not support error recovery
is a reliable indicator that the ISM function won't be auto-recovered
because the ISM driver currently lacks support for it.

On a technical level, without this synchronization, commands (inputs to
the FW) may be partially or fully overwritten (corrupted) by another CPU
trying to issue commands on the same function. There is hard evidence that
this can lead to DMB token values being used as DMB IOVAs, leading to
PEC 2 PCI events indicating invalid DMA. But this is only one of the
failure modes imaginable. In theory even completely losing one command
and executing another one twice and then trying to interpret the outputs
as if the command we intended to execute was actually executed and not
the other one is also possible.  Frankly, I don't feel confident about
providing an exhaustive list of possible consequences.

Fixes: 684b89b ("s390/ism: add device driver for internal shared memory")
Reported-by: Aliaksei Makarau <[email protected]>
Tested-by: Mahanta Jambigi <[email protected]>
Tested-by: Aliaksei Makarau <[email protected]>
Signed-off-by: Halil Pasic <[email protected]>
Reviewed-by: Alexandra Winter <[email protected]>
Signed-off-by: Alexandra Winter <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Don't add _req to helper names for pure types. We don't currently
print those so it makes no difference to existing codegen.

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Just to avoid making the main function even more enormous,
before adding more things to print move the free printing
to a helper which already prints the type.

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
In general YNL provides allocation and free helpers for types.
For pure nested structs which are used as multi-attr (and therefore
have to be allocated dynamically) we already print a free helper
as it's needed by free of the containing struct.

Add printing of the alloc helper for consistency. The helper
takes the number of entries to allocate as an argument, e.g.:

  static inline struct netdev_queue_id *netdev_queue_id_alloc(unsigned int n)
  {
	return calloc(n, sizeof(struct netdev_queue_id));
  }

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
For basic types we "flatten" setters. If a request "a" has a simple
nest "b" with value "val" we print helpers like:

 req_set_a_b(struct a *req, int val)
 {
   req->_present.a = 1;
   req->b._present.val = 1;
   req->b.val = ...
 }

This is not possible for multi-attr because they have to be allocated
dynamically by the user. Print "object level" setters so that user
preparing the object doesn't have to futz with the presence bits
and other YNL internals.

Add the ability to pass in the variable name to generated setters.
Using "req" here doesn't feel right, while the attr is part of a request
it's not the request itself, so it seems cleaner to call it "obj".

Example:

 static inline void
 netdev_queue_id_set_id(struct netdev_queue_id *obj, __u32 id)
 {
	obj->_present.id = 1;
	obj->id = id;
 }

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Use the just-added YNL helpers instead of manually setting
"_present" bits in the queue attrs. Compile tested only.

Signed-off-by: Jakub Kicinski <[email protected]>
Acked-by: Mina Almasry <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
My CC-adding automation returned nothing on a future patch to the
include/linux/in6.h file, and I went looking for why. Add the missed
in6.h to MAINTAINERS.

Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Add documentation clarifying that ARP and routing UAPI structures are
constrained to IPv4-only usage, making them safe for the coming fixed-size
sockaddr conversion (with the 14-byte struct sockaddr::sa_data). These
are fine as-is, but their use was non-obvious to me, so I figured they
could use a little more documentation:

- struct arpreq: ARP protocol is IPv4-only by design
- struct rtentry: Legacy IPv4 routing API, IPv6 uses different structures

Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
There are cases in networking (e.g. wireguard, sctp) where a union is
used to provide coverage for either IPv4 or IPv6 network addresses,
and they include an embedded "struct sockaddr" as well (for "sa_family"
and raw "sa_data" access). The current struct sockaddr contains a
flexible array, which means these unions should not be further embedded
in other structs because they do not technically have a fixed size (and
are generating warnings for the coming -Wflexible-array-not-at-end flag
addition). But the future changes to make struct sockaddr a fixed size
(i.e. with a 14 byte sa_data member) make the "sa_data" uses with an IPv6
address a potential place for the compiler to get upset about object size
mismatches. Therefore, we need a sockaddr that cleanly provides both an
sa_family member and an appropriately fixed-sized sa_data member that does
not bloat member usage via the potential alternative of sockaddr_storage
to cover both IPv4 and IPv6, to avoid unseemly churn in the affected code
bases.

Introduce sockaddr_inet as a unified structure for holding both IPv4 and
IPv6 addresses (i.e. large enough to accommodate sockaddr_in6).

The structure is defined in linux/in6.h since its max size is sized
based on sockaddr_in6 and provides a more specific alternative to the
generic sockaddr_storage for IPv4 with IPv6 address family handling.

The "sa_family" member doesn't use the sa_family_t type to avoid needing
layer violating header inclusions.

Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
As part of the removal of the variably-sized sockaddr for kernel
internals, replace struct sockaddr with sockaddr_inet in the endpoint
union.

No binary changes; the union size remains unchanged due to sockaddr_inet
matching the size of sockaddr_in6.

Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
As part of the removal of the variably-sized sockaddr for kernel
internals, replace struct sockaddr with sockaddr_inet in the sctp_addr
union.

No binary changes; the union size remains unchanged due to sockaddr_inet
matching the size of sockaddr_in6.

Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
When building, the following warnings will appear.
"
pci_irq.c: In function ‘mlx5_ctrl_irq_request’:
pci_irq.c:494:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]

pci_irq.c: In function ‘mlx5_irq_request_vector’:
pci_irq.c:561:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]

eq.c: In function ‘comp_irq_request_sf’:
eq.c:897:1: warning: the frame size of 1080 bytes is larger than 1024 bytes [-Wframe-larger-than=]

irq_affinity.c: In function ‘irq_pool_request_irq’:
irq_affinity.c:74:1: warning: the frame size of 1048 bytes is larger than 1024 bytes [-Wframe-larger-than=]
"

These warnings indicate that the stack frame size exceeds 1024 bytes in
these functions.

To resolve this, instead of allocating large memory buffers on the stack,
it is better to use kvzalloc to allocate memory dynamically on the heap.
This approach reduces stack usage and eliminates these frame size warnings.

Acked-by: Junxian Huang <[email protected]>
Signed-off-by: Zhu Yanjun <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
The xdp_convert_buff_to_frame() function can return NULL when there is
insufficient headroom in the buffer to store the xdp_frame structure
or when the driver didn't reserve enough tailroom for skb_shared_info.

Currently, the sfc driver does not check for this NULL return value
in the XDP_TX case within efx_do_xdp(). While the efx_xdp_tx_buffers()
function has some defensive checks, passing a NULL xdpf can still lead
to undefined behavior when the function tries to access xdpf->len and
xdpf->data.

Fix by adding a proper NULL check in the XDP_TX case. If conversion
fails, free the RX buffer and increment the bad drops counter, following
the same pattern used for other XDP error conditions in this driver.

Signed-off-by: Chenyuan Yang <[email protected]>
Fixes: 1b698fa ("xdp: Rename convert_to_xdp_frame in xdp_convert_buff_to_frame")
Signed-off-by: NipaLocal <nipa@local>
The xdp_convert_buff_to_frame() function can return NULL when there is
insufficient headroom in the buffer to store the xdp_frame structure
or when the driver didn't reserve enough tailroom for skb_shared_info.

Currently, the sfc siena driver does not check for this NULL return value
in the XDP_TX case within efx_do_xdp().

Fix by adding a proper NULL check in the XDP_TX case. If conversion
fails, free the RX buffer and increment the bad drops counter, following
the same pattern used for other XDP error conditions in this driver.

Signed-off-by: Chenyuan Yang <[email protected]>
Fixes: d48523c ("sfc: Copy shared files needed for Siena (part 2)")
Signed-off-by: NipaLocal <nipa@local>
The xdp_convert_buff_to_frame() function can return NULL when there is
insufficient headroom in the buffer to store the xdp_frame structure
or when the driver didn't reserve enough tailroom for skb_shared_info.

Currently, the otx2 driver does not check for this NULL return value
in two critical paths within otx2_xdp_rcv_pkt_handler():

1. XDP_TX case: Passes potentially NULL xdpf to otx2_xdp_sq_append_pkt()
2. XDP_REDIRECT error path: Calls xdp_return_frame() with potentially NULL

This can lead to kernel crashes due to NULL pointer dereference.

Fix by adding proper NULL checks in both paths. For XDP_TX, return false
to indicate packet should be dropped. For XDP_REDIRECT error path, only
call xdp_return_frame() if conversion succeeded, otherwise manually free
the page.

Please correct me if any error path is incorrect.

This is similar to the commit cc3628d
("xen-netfront: handle NULL returned by xdp_convert_buff_to_frame()").

Signed-off-by: Chenyuan Yang <[email protected]>
Fixes: 94c80f7 ("octeontx2-pf: use xdp_return_frame() to free xdp buffers")
Signed-off-by: NipaLocal <nipa@local>
Move multiple copies of same code snippet doing `gro_flush` and
`gro_normal_list` into separate helper function.

Signed-off-by: Samiullah Khawaja <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Prepare for adding an enum type for NAPI threaded states by adding
netif_threaded_enable API. De-export the existing netif_set_threaded API
and only use it internally. Update existing drivers to use
netif_threaded_enable instead of the de-exported netif_set_threaded.

Note that dev_set_threaded used by mt76 debugfs file is unchanged.

Signed-off-by: Samiullah Khawaja <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Instead of using '0' and '1' for napi threaded state use an enum with
'disabled' and 'enabled' states.

Tested:
 ./tools/testing/selftests/net/nl_netdev.py
 TAP version 13
 1..7
 ok 1 nl_netdev.empty_check
 ok 2 nl_netdev.lo_check
 ok 3 nl_netdev.page_pool_check
 ok 4 nl_netdev.napi_list_check
 ok 5 nl_netdev.dev_set_threaded
 ok 6 nl_netdev.napi_set_threaded
 ok 7 nl_netdev.nsim_rxq_reset_down
 # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Samiullah Khawaja <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
tc_actions.sh keeps hanging the forwarding tests.

sdf@: tdc & tdc-dbg started intermittenly failing around Sep 25th

Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.