Wireless issues - Atheros AR9280 HostAP mode on 5GHz - OPNsense crashes #190

Open
2 tasks done
adamast0r opened this issue Nov 26, 2023 · 5 comments
Labels
upstream Third party issue

Comments

@adamast0r

adamast0r commented Nov 26, 2023

Important notices

Before you add a new report, we ask you kindly to acknowledge the following:

Describe the bug

When a well-supported wireless card (WLE200NX, Atheros AR9280) is used in AP mode on 5GHz, OPNsense crashes. The wireless card is not defective: I recently got a new Atheros AR9280 that is not even from the same vendor, and both cards crash with the same frequency. This issue has been discussed in another ticket about web interface problems when configuring 5GHz: opnsense/core#5765

To Reproduce

Steps to reproduce the behavior:

  1. Go to Interfaces: Wireless: Devices -> add new device
  2. Select 5GHz
  3. Configure as HostAP mode

Alternatively, configure it from the command line:

ifconfig ath0_wlan1 down
ifconfig ath0_wlan1 mode 11na
ifconfig ath0_wlan1 channel 36:ht
ifconfig ath0_wlan1 up

With either method OPNsense is likely to crash. Occasionally it does not crash, and in that case the AP keeps working fine.
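
For repeated attempts, the four commands above can be wrapped in a small script. This is only a sketch: the function names and the `DRY_RUN` switch are made up for illustration, and by default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: re-apply the 5GHz HostAP settings from this report in one step.
# Assumptions (not part of the report): the names run/reproduce_5ghz and
# the DRY_RUN switch, which defaults to printing the commands only.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = really run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce_5ghz() {
    # Interface name defaults to the one used in this report.
    ifname="${1:-ath0_wlan1}"
    run ifconfig "$ifname" down &&
    run ifconfig "$ifname" mode 11na &&
    run ifconfig "$ifname" channel 36:ht &&
    run ifconfig "$ifname" up
}
```

Calling `reproduce_5ghz ath0_wlan1` prints the four commands; setting `DRY_RUN=0` before the call actually runs them, which on an affected system may trigger the panic.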

Expected behavior

OPNsense should not crash when a wireless AP is configured for 5GHz.

Describe alternatives you considered

No alternative; just try multiple times: crash, reboot, and try again.

Relevant log files

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address	= 0x0
fault code		= supervisor write data, page not present
instruction pointer	= 0x20:0xffffffff80e1f4b0
stack pointer	        = 0x28:0xfffffe0062c95d40
frame pointer	        = 0x28:0xfffffe0062c95d90
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 12 (irq40: ath0)
trap number		= 12
panic: page fault
cpuid = 2
time = 1698872847
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0062c95b00
vpanic() at vpanic+0x151/frame 0xfffffe0062c95b50
panic() at panic+0x43/frame 0xfffffe0062c95bb0
trap_fatal() at trap_fatal+0x387/frame 0xfffffe0062c95c10
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0062c95c70
calltrap() at calltrap+0x8/frame 0xfffffe0062c95c70
--- trap 0xc, rip = 0xffffffff80e1f4b0, rsp = 0xfffffe0062c95d40, rbp = 0xfffffe0062c95d90 ---
ieee80211_beacon_update() at ieee80211_beacon_update+0x7f0/frame 0xfffffe0062c95d90
ath_beacon_generate() at ath_beacon_generate+0x46/frame 0xfffffe0062c95de0
ath_beacon_proc() at ath_beacon_proc+0x241/frame 0xfffffe0062c95e30
ath_intr() at ath_intr+0x4b3/frame 0xfffffe0062c95e60
ithread_loop() at ithread_loop+0x25a/frame 0xfffffe0062c95ef0
fork_exit() at fork_exit+0x7e/frame 0xfffffe0062c95f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0062c95f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic

Additional context

There has been a discussion in a previous ticket, opnsense/core#5765. I am creating this one so the problem is better tracked. @fichtner already has a patch that can be tested.

Environment

OPNsense 23.7.8_1-amd64 (amd64, OpenSSL).
PCengines APU2 with WLE200NX a/b/g/n wireless card - https://www.pcengines.ch/wle200nx.htm
Other Generic AR9280 card not from Compex

@fichtner fichtner transferred this issue from opnsense/core Nov 27, 2023
@fichtner fichtner self-assigned this Nov 27, 2023
@fichtner fichtner added the upstream Third party issue label Nov 27, 2023
@fichtner
Member

@adamast0r the patch is ad8f010 and the kernel can be installed with this command:

# opnsense-update -zkr 23.7.8-ath

(requires a reboot)

Cheers,
Franco

@adamast0r
Author

Amazing news, @fichtner. I will try this later today. Some questions about the kernel change:

  • I assume there are no requirements on the previous kernel/version I need to have? I am currently on 23.7.8_1
  • Do you have any OPNsense documentation on how to recover if the kernel does not boot?
  • Would the next update from 23.7.8_1 to any newer images (23.7.9 is the next) work with this kernel?
  • How do I revert back to the default kernel, if needed?

Thanks!

@fichtner
Member

fichtner commented Nov 27, 2023

These targeted kernel changes are pretty safe, and this works on any 23.7.x. If the kernel does not boot, you can select "kernel.old" from the boot menu using a console, but that is very unlikely. Disk corruption preventing boot is much more likely. That is also the standard risk for any kernel update, so it is not that high, as you can probably already attest.

If you don't think the kernel is working for you, reverting to the current kernel is easy:

# opnsense-update -k

(it will pick up the latest kernel automatically).

The next kernel update will strip this custom kernel install. Whether the patch is included in 23.7.10 I am not sure yet; it depends heavily on your testing. If the kernel works and you want to prevent it from being stripped during the 23.7.10 upgrade, you can select "lock" for the kernel package under System: Firmware: Packages, and the kernel will be left as is (until it is unlocked again or you move to the next major version).

Cheers,
Franco

@adamast0r
Author

adamast0r commented Nov 27, 2023

I have updated the kernel:

# uname -a
FreeBSD OPNsense.localdomain 13.2-RELEASE-p5 FreeBSD 13.2-RELEASE-p5 ath_beacon-n254863-ad8f0103906 SMP amd64
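
A quick way to confirm that the running kernel is the patched build is to look for the abbreviated commit hash in the version string. This is a minimal sketch; the function name `kernel_has_commit` is made up for illustration.

```shell
#!/bin/sh
# Sketch: check whether a kernel version string mentions a given commit.
# kernel_has_commit is a hypothetical helper, not an OPNsense tool.

kernel_has_commit() {
    # $1 = version string (e.g. the output of `uname -v`)
    # $2 = abbreviated commit hash to look for
    case "$1" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

For example, `kernel_has_commit "$(uname -v)" ad8f010 && echo "patched kernel is running"`.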

Unfortunately, I am getting a kernel page fault very similar to the previous one:

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address	= 0x0
fault code		= supervisor write data, page not present
instruction pointer	= 0x20:0xffffffff80e1f5e0
stack pointer	        = 0x28:0xfffffe0062cc2d40
frame pointer	        = 0x28:0xfffffe0062cc2d90
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 12 (irq40: ath1)
trap number		= 12
panic: page fault
cpuid = 3
time = 1701108473
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0062cc2b00
vpanic() at vpanic+0x151/frame 0xfffffe0062cc2b50
panic() at panic+0x43/frame 0xfffffe0062cc2bb0
trap_fatal() at trap_fatal+0x387/frame 0xfffffe0062cc2c10
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0062cc2c70
calltrap() at calltrap+0x8/frame 0xfffffe0062cc2c70
--- trap 0xc, rip = 0xffffffff80e1f5e0, rsp = 0xfffffe0062cc2d40, rbp = 0xfffffe0062cc2d90 ---
ieee80211_beacon_update() at ieee80211_beacon_update+0x7f0/frame 0xfffffe0062cc2d90
ath_beacon_generate() at ath_beacon_generate+0x60/frame 0xfffffe0062cc2de0
ath_beacon_proc() at ath_beacon_proc+0x241/frame 0xfffffe0062cc2e30
ath_intr() at ath_intr+0x4b3/frame 0xfffffe0062cc2e60
ithread_loop() at ithread_loop+0x25a/frame 0xfffffe0062cc2ef0
fork_exit() at fork_exit+0x7e/frame 0xfffffe0062cc2f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0062cc2f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
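
To compare two panics frame by frame, each backtrace can be reduced to its `function+offset` entries, dropping the frame addresses that change between boots. This is a minimal sketch; the function name `bt_frames` and the log file names in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch: reduce a saved panic log to one "function+offset" per stack frame
# so two backtraces can be diffed. bt_frames is a hypothetical helper name.

bt_frames() {
    # Keep only lines shaped like "name() at name+0xNN/frame 0x..." and
    # print the "name+0xNN" part. Reads the named files, or stdin if none.
    sed -n 's|^[A-Za-z_][A-Za-z0-9_]*() at \([A-Za-z_][A-Za-z0-9_]*+0x[0-9a-f]*\)/frame .*$|\1|p' "$@"
}
```

With the two logs saved as, say, `panic-before.txt` and `panic-after.txt`, run `bt_frames panic-before.txt > before.frames`, `bt_frames panic-after.txt > after.frames`, then `diff before.frames after.frames`. For the two traces in this thread, the only frame that differs is `ath_beacon_generate` (`+0x46` before the patch, `+0x60` after).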

@adamast0r
Author

@fichtner I can trigger the bug very frequently when I run my script to start wireless on 5GHz. Is there any way to gather more debug information to understand the problem?


2 participants