Enabling NIC RSS causes localhost services to timeout. #195

Open
2 tasks done
Sv0i opened this issue Jan 3, 2024 · 0 comments
Labels
support Community support

Sv0i commented Jan 3, 2024

Important notices

Before you add a new report, we ask you kindly to acknowledge the following:

Describe the bug

When RSS is enabled, services running on the host itself (localhost) behave as if there were no network connectivity at all.
RSS was configured as described in the manual: https://docs.opnsense.org/manual/vpnet.html#tuning-considerations

To Reproduce

Steps to reproduce the behavior:

  1. Use a host with at least 2 CPU cores and a network adapter that supports RSS (we tested with Intel X710 and Intel I354), with all hardware acceleration features of the network adapters enabled.
  2. Set up the environment with a basic configuration.
  3. Check for firmware updates - should be successful.
  4. Check DNS resolution - should be successful.
  5. Follow the "Tuning considerations" section of the OPNsense manual and configure the settings described there for your host setup: https://docs.opnsense.org/manual/vpnet.html#tuning-considerations
  6. Restart the host as required by the manual.
  7. Check for firmware updates - times out.
  8. Check DNS resolution - times out.

Expected behavior

After step 6, steps 7 and 8 should behave like steps 3 and 4.
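
The before/after comparison in steps 3, 4, 7 and 8 can be scripted so that a timeout is clearly distinguished from an outright error. The following is a minimal sketch, not part of the original report: the helper names (`timed_check`, `dns_check`, `tcp_check`, `run_checks`) are hypothetical, the target host is a caller-supplied placeholder, and a plain TCP connect on port 443 is only an assumed stand-in for the firmware update check.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def timed_check(name, func, timeout_s=5.0):
    """Run func() in a worker thread; report 'ok', 'timeout' or 'error: ...'."""
    start = time.monotonic()
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func)
    try:
        future.result(timeout=timeout_s)
        status = "ok"
    except (FutureTimeout, socket.timeout):
        status = "timeout"
    except OSError as exc:
        status = f"error: {exc}"
    finally:
        # Do not block on a hung worker; it ends when func() finally returns.
        pool.shutdown(wait=False)
    return name, status, round(time.monotonic() - start, 2)


def dns_check(hostname):
    """Build a check that resolves hostname through the system resolver."""
    return lambda: socket.getaddrinfo(hostname, 443)


def tcp_check(host, port, timeout_s=5.0):
    """Build a check that opens and closes a TCP connection."""
    def run():
        with socket.create_connection((host, port), timeout=timeout_s):
            pass
    return run


def run_checks(host, timeout_s=5.0):
    """Run both checks against a caller-chosen host (placeholder, not from the report)."""
    return [
        timed_check("dns", dns_check(host), timeout_s),
        timed_check("tcp/443", tcp_check(host, 443, timeout_s), timeout_s),
    ]
```

Calling `run_checks("<some reachable host>")` once before step 5 and once after step 6 gives two result lists to compare. If the behavior matches the report, the second run would be expected to show `timeout` where the first showed `ok`.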

Describe alternatives you considered

We installed DNSMasq. It seems to operate on both CPU cores, and DNS resolution succeeds even when the DNS reply packet is received on a core other than the sending one. However, this only works around the DNS resolution problem. The firmware update still times out.

We also found that if, in addition to the setup above, IPsec tunnels are configured in route-based (VTI) mode as described here: https://docs.opnsense.org/manual/vpnet.html#route-based-vti
and the default route goes via a VTI interface rather than a physical interface, then the firmware update and other localhost operations work.

Screenshots

N/A

Relevant log files

N/A

Additional context

The RSS setup does not break any traffic passing through the host, or any locally originating firewall or routing communications.
The issue occurs when net.isr.bindthreads is enabled.
In that case, locally running applications, services, and scripts transmit from the core their own thread runs on and expect the reply packet on the same core. However, RSS will usually deliver the packet to another core, where nothing is expecting it, and it is not forwarded to the correct core.
DNSMasq appears to work around this issue, perhaps by operating in multithreaded mode, and is able to process a reply received on another core. This is not the case for Perl scripts, Unbound, and many others.
tcpdump showed that packets are being sent and replies are being received by the host, yet the logs, errors, and warning messages always read as if there were no connectivity at all.
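
To confirm that a host is actually in the state described above, the relevant tunables can be read back after the reboot. The sketch below is an illustration, not something taken from the report: only `net.isr.bindthreads` is named here, while the other sysctl names are assumed to be the ones set by the linked manual section, and the function names are hypothetical. It parses the `name: value` lines that `sysctl` prints on FreeBSD.

```python
import subprocess

# Only net.isr.bindthreads is named in the report; the other names are
# assumptions about what the linked "Tuning considerations" section sets.
WATCHED = (
    "net.inet.rss.enabled",
    "net.inet.rss.bits",
    "net.isr.bindthreads",
    "net.isr.maxthreads",
    "net.isr.dispatch",
)


def parse_sysctl(text):
    """Turn 'name: value' lines into a dict; lines without a colon are skipped."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return values


def matches_report(values):
    """True when RSS and netisr thread binding are both enabled."""
    return (
        values.get("net.inet.rss.enabled") == "1"
        and values.get("net.isr.bindthreads") == "1"
    )


def read_sysctl(names=WATCHED):
    """Query the running system; unknown names simply produce no entry."""
    result = subprocess.run(
        ["sysctl", *names], capture_output=True, text=True, check=False
    )
    return parse_sysctl(result.stdout)
```

On the affected host, `matches_report(read_sysctl())` returning `True` would indicate that both settings are active, which is the combination this report associates with the timeouts.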

Environment

OPNsense from 23.1-amd64 to 23.7.10_1

Multiple platform types were tested and are affected.
CPU core counts >= 2 (to use NIC RSS)
Network adapters tested: Intel I354, Intel X710

@fichtner fichtner transferred this issue from opnsense/core Jan 3, 2024
@fichtner fichtner added the support Community support label Jan 3, 2024