kernel: arch: move arch_swap() declaration #82454
Conversation
Ok by me. Just a minor thing and a nit.
Seems very reasonable.
Fixed for ARM builds.
Hmmm ... something weird is going on with the arch.arm.swap.common.fpu_sharing test on the mps2/an521/cpu1 board. It fails with this patch, and passes without it. However, the non-optimized version of the test (arch.arm.swap.common.fpu_sharing.no_optimizations) passes. Both scenarios are reproducible on my dev box. I am hoping that there is something off with the test, but I am digging into this to find out.
I am leaning towards thinking that this failure is a test failure. In the optimized version of this test (built with -Os), we are getting garbage values for the v1..v8 registers, and some of the routines for initializing data are simply absent. It looks like the compiler is optimizing them away. I think we need some way to keep them in the code. Continuing to dig and experiment.
I think I see what is going on in the failing test now. The test makes a call to arch_swap(). Previously this was a true function call, and the test was written with that in mind. With arch_swap() now being inlined, the compiler's output changes just enough that when alt_thread checks the registers, r5/v2 is 0 instead of the value it had previously saved and was expecting. I have an idea about how to work around this ...
Moves the arch_swap() declaration out of kernel_arch_interface.h and into the various architectures' kernel_arch_func.h. This permits arch_swap() to be inlined on ARM, while remaining extern'd on the other architectures that still implement arch_swap(). Inlining this function on ARM has shown at least a +5% performance boost according to the thread_metric benchmark on the disco_l475_iot1 board. Signed-off-by: Peter Mitsis <[email protected]>
FYI, "v1-v8" (which are IMHO needlessly confusing aliases for r4-r11) are the caller-save registers in the ARM ABI. It's likely that the earlier swap routine was written to assume that the caller had already spilled them and that they don't have to be saved. In which case you might as well give up for ARM; the routine wasn't written to be legally inlined. I mean, one could "fix" it, but only at the cost of adding back all the spills the compiler was generating before. And that's worse, not better, as the compiler is usually really good about spill/fill logic (e.g. finding registers that don't actually need to be saved), whereas a context switch is forced to be conservative/pessimal and save everything.
Also, as far as rewriting ARM Cortex M swap: I already claim that spot as soon as I can get MTK work submitted SOF-side. Hold my beer, as it were. |
I am looking forward to your rewrite. I doubt that my proposed commit will have a long life, as I expect your work to supersede it, but should that take longer than anticipated, we at least have an interim boost.
Moves the arch_swap() declaration out of kernel_arch_interface.h and into the various architectures' kernel_arch_func.h. This permits arch_swap() to be inlined on ARM, while remaining extern'd on the other architectures that still implement arch_swap().
Inlining this function on ARM has shown at least a +5% performance boost according to the thread_metric benchmark on the disco_l475_iot1 board.
At the time of creating this PR, mainline results for thread_metric with multiq on the disco_l475_iot1 were:
Preemptive: 7051317, Cooperative: 12436712
With this PR:
Preemptive: 7417390, Cooperative: 13188390
The new preemptive numbers should put us a little ahead of ThreadX on the same hardware.