Hanging with “rcu_preempt self-detected stall on CPU” when SCCD sector size is misconfigured #1473

Open
delan opened this issue Jul 19, 2024 · 6 comments

@delan
Contributor

delan commented Jul 19, 2024

Info

  • Which version of Pi are you using: Raspberry Pi 4 Model B (1GB)
  • Which github revision of software: v24.04.01
  • Which board version: akuker 2.6d
  • Which computer is the PiSCSI connected to: Sun SPARCstation 5
  • Which OS you are using (output of 'lsb_release -a'): official image 2024-04-30-PiSCSI-v24.04.01-arm64-lite.zip

Describe the issue

Whenever I try booting from an emulated CDROM on my SPARCstation 5, the Pi becomes almost entirely unresponsive, including web requests and ssh sessions. Full logs here, but the kernel throws this error, repeating every 63 seconds…

Jun 30 19:53:30 piscsi kernel: rcu: INFO: rcu_preempt self-detected stall on CPU
Jun 30 19:53:30 piscsi kernel: rcu:         2-....: (5249 ticks this GP) idle=72bc/1/0x4000000000000000 softirq=4459/4463 fqs=2625
Jun 30 19:53:30 piscsi kernel:         (t=5250 jiffies g=10725 q=200 ncpus=4)
Jun 30 19:53:30 piscsi kernel: CPU: 2 PID: 811 Comm: (d-logind) Tainted: G         C         6.1.21-v8+ #1642
Jun 30 19:53:30 piscsi kernel: Hardware name: Raspberry Pi 4 Model B Rev 1.5 (DT)
Jun 30 19:53:30 piscsi kernel: pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Jun 30 19:53:30 piscsi kernel: pc : smp_call_function_many_cond+0x1ac/0x3d8
Jun 30 19:53:30 piscsi kernel: lr : smp_call_function_many_cond+0x168/0x3d8
Jun 30 19:53:30 piscsi kernel: sp : ffffffc00973bb00
Jun 30 19:53:30 piscsi kernel: x29: ffffffc00973bb00 x28: 0000000000000003 x27: 0000000000000004
Jun 30 19:53:30 piscsi kernel: x26: ffffff803b19fec8 x25: ffffffe5aa358ae0 x24: 0000000000000002
Jun 30 19:53:30 piscsi kernel: x23: ffffffe5aa3596e8 x22: 0000000000000000 x21: ffffff803b19fec8
Jun 30 19:53:30 piscsi kernel: x20: ffffff803b19fec0 x19: ffffffe5aa3596e8 x18: 0000000000000014
Jun 30 19:53:30 piscsi kernel: x17: 00000000206d61fa x16: 00000000c9a2f191 x15: 000000004fff85b3
Jun 30 19:53:30 piscsi kernel: x14: 0000000000000091 x13: 0000000000000007 x12: 000000000000000b
Jun 30 19:53:30 piscsi kernel: x11: 0000000000000007 x10: 0000000000000000 x9 : ffffffe5a9104960
Jun 30 19:53:30 piscsi kernel: x8 : ffffff803b19fef0 x7 : 0000000000000000 x6 : ffffff803b1c1320
Jun 30 19:53:30 piscsi kernel: x5 : ffffff803b1c1320 x4 : 0000000000000000 x3 : 0000000000000000
Jun 30 19:53:30 piscsi kernel: x2 : ffffff803b1c1328 x1 : 0000000000000011 x0 : 0000000000000003
Jun 30 19:53:30 piscsi kernel: Call trace:
Jun 30 19:53:30 piscsi kernel:  smp_call_function_many_cond+0x1ac/0x3d8
Jun 30 19:53:30 piscsi kernel:  smp_call_function+0x50/0x80
Jun 30 19:53:30 piscsi kernel:  kick_all_cpus_sync+0x2c/0x38
Jun 30 19:53:30 piscsi kernel:  bpf_int_jit_compile+0x198/0x608
Jun 30 19:53:30 piscsi kernel:  bpf_prog_select_runtime+0x124/0x198
Jun 30 19:53:30 piscsi kernel:  bpf_prepare_filter+0x4b0/0x528
Jun 30 19:53:30 piscsi kernel:  bpf_prog_create_from_user+0x13c/0x1d0
Jun 30 19:53:30 piscsi kernel:  do_seccomp+0x28c/0xa18
Jun 30 19:53:30 piscsi kernel:  __arm64_sys_seccomp+0x28/0x38
Jun 30 19:53:30 piscsi kernel:  invoke_syscall+0x4c/0x110
Jun 30 19:53:30 piscsi kernel:  el0_svc_common.constprop.3+0xfc/0x120
Jun 30 19:53:30 piscsi kernel:  do_el0_svc+0x34/0xd0
Jun 30 19:53:30 piscsi kernel:  el0_svc+0x30/0x88
Jun 30 19:53:30 piscsi kernel:  el0t_64_sync_handler+0x98/0xc0
Jun 30 19:53:30 piscsi kernel:  el0t_64_sync+0x18c/0x190

…and the reads start timing out on the SPARCstation:

[Image: PXL_20240630_114705209 RAW-01 MP COVER]

I can test my PiSCSI with another SPARCstation 5 or with a PC HBA if needed.
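
For reference, the timing in the stall report above can be sanity-checked with a little arithmetic. The sketch below pulls the `t=… jiffies` value out of the log line and converts it to seconds. The tick rate of 250 is an assumption (the real value is the kernel's `CONFIG_HZ`, which this thread does not state), and `stall_seconds` is a hypothetical helper name.

```python
import re

# Assumed tick rate; the real value is the kernel's CONFIG_HZ, not stated in this issue.
ASSUMED_HZ = 250


def stall_seconds(log_line: str, hz: int = ASSUMED_HZ) -> float:
    """Return the stall duration in seconds from a '(t=N jiffies ...)' log line."""
    match = re.search(r"\(t=(\d+) jiffies", log_line)
    if match is None:
        raise ValueError("no 't=N jiffies' field found")
    return int(match.group(1)) / hz


line = "Jun 30 19:53:30 piscsi kernel:         (t=5250 jiffies g=10725 q=200 ncpus=4)"
print(stall_seconds(line))  # 21.0 under the HZ=250 assumption
```

If that assumption holds, the stall had lasted about 21 seconds when it was first reported, and the 63-second repeat interval is exactly three times that figure.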

@delan
Contributor Author

delan commented Jul 19, 2024

The errors I’m seeing on the SPARCstation smell like I have the wrong sector size (compare ZuluSCSI/ZuluSCSI-firmware#438). I could have sworn I selected Toshiba XM-3401TA, but I guess not:

Jun 30 12:43:56 piscsi PISCSI[501]: [2024-06-30 12:43:56.108] [info] Validating: operation=ATTACH, command params='locale=en', 'token=???', device=6:0, type=SCCD, device params='file=solaris_2.5.1_1197.iso', vendor='', product='', revision='', block size=0
Jun 30 12:43:56 piscsi PISCSI[501]: [2024-06-30 12:43:56.111] [info] Executing: operation=ATTACH, command params='locale=en', 'token=???', device=6:0, type=SCCD, device params='file=solaris_2.5.1_1197.iso', vendor='', product='', revision='', block size=0
Jun 30 12:43:56 piscsi PISCSI[501]: [2024-06-30 12:43:56.112] [info] Attached read-only SCCD 6:0
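
The `block size=0` field in the lines above is easy to miss. As a rough sketch, a log line like these could be checked for an SCCD device that was attached without an explicit block size. The regex is written against the log lines quoted in this issue only, reading `block size=0` as "no explicit size requested" is an assumption, and `parse_attach` is a hypothetical helper, not part of PiSCSI.

```python
import re

# Pattern based only on the log lines quoted in this issue; other versions may log differently.
ATTACH_RE = re.compile(
    r"operation=ATTACH.*?device=(?P<device>\d+:\d+), type=(?P<type>\w+)"
    r".*?block size=(?P<block_size>\d+)"
)


def parse_attach(log_line: str):
    """Extract device ID, device type and block size from an ATTACH log line, or None."""
    m = ATTACH_RE.search(log_line)
    if m is None:
        return None
    return {
        "device": m["device"],
        "type": m["type"],
        "block_size": int(m["block_size"]),
    }


line = (
    "[info] Executing: operation=ATTACH, command params='locale=en', 'token=???', "
    "device=6:0, type=SCCD, device params='file=solaris_2.5.1_1197.iso', "
    "vendor='', product='', revision='', block size=0"
)

info = parse_attach(line)
# Assumption: block size 0 means no explicit block size was requested.
if info and info["type"] == "SCCD" and info["block_size"] == 0:
    print(f"SCCD {info['device']} attached with no explicit block size")
```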

I’ll update the title and create a new issue if I run into anything else.

@delan delan changed the title from 'Hanging with “rcu_preempt self-detected stall on CPU”' to 'Hanging with “rcu_preempt self-detected stall on CPU” when SCCD sector size is misconfigured' Jul 19, 2024
@rdmark
Member

rdmark commented Jul 27, 2024

@delan So if I understand this correctly, if you configure the SCCD with the correct sector size that your SPARCstation expects, you get past this issue?

@delan
Contributor Author

delan commented Jul 28, 2024

> @delan So if I understand this correctly, if you configure the SCCD with the correct sector size that your SPARCstation expects, you get past this issue?

Yeah, this seems to only happen when the sector size is misconfigured.

@rdmark
Member

rdmark commented Jul 28, 2024

Got it! So in this case, piscsi works as intended, and what you observed is akin to hooking up an incompatible consumer-grade CD-ROM drive to your UNIX workstation.

As I mentioned in the other ticket, we have defined "device properties" for a handful of known good Sun-compatible CD-ROM drives that will give you both the correct sector size and the right INQUIRY vendor strings.

@rdmark rdmark closed this as not planned Jul 28, 2024
@delan
Contributor Author

delan commented Jul 30, 2024

Hmm, ok. I think it’s not ideal for piscsi to crash the whole pi with no real diagnostics or recourse if configured to emulate the wrong device. Hopefully others that run into this have an easier time figuring this out than I did.

@rdmark
Member

rdmark commented Aug 10, 2024

Fair point. I read the report too quickly and thought the panic was in the host system.

@rdmark rdmark reopened this Aug 10, 2024