This repository has been archived by the owner on Sep 2, 2023. It is now read-only.
In register-based IBF/EBF, the <offset:width> arguments are packed in a register as follows:

| Bits  | Meaning |
|-------|---------|
| 0..4  | offset  |
| 8..12 | width   |

Use an IBF (or PACK.H?) instruction to pack the arguments into a single register before issuing the intended instruction (i.e. only two instructions in total).
See define_insn "*ibfsi3" and friends in mrisc32.md for the current immediate-based versions.
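
As a rough illustration of the packing layout above, the following C sketch builds the register value from an offset and a width. The helper names `pack_bf_ctrl`, `bf_ctrl_offset` and `bf_ctrl_width` are hypothetical and not part of MRISC32 or GCC; the sketch only assumes the bit positions given in the table (offset in bits 0..4, width in bits 8..12).

```c
#include <stdint.h>

/* Hypothetical helper (not an MRISC32 or GCC API): pack <offset:width>
 * into one register value, offset in bits 0..4 and width in bits 8..12. */
static uint32_t pack_bf_ctrl(uint32_t offset, uint32_t width)
{
    return ((width & 0x1Fu) << 8) | (offset & 0x1Fu);
}

/* Hypothetical inverse helpers: recover the two 5-bit fields. */
static uint32_t bf_ctrl_offset(uint32_t ctrl)
{
    return ctrl & 0x1Fu;
}

static uint32_t bf_ctrl_width(uint32_t ctrl)
{
    return (ctrl >> 8) & 0x1Fu;
}
```

For instance, `pack_bf_ctrl(29u, 1u)` evaluates to `0x011D`: the width 1 lands in bit 8 and the offset 29 occupies the low five bits.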
Hi all,
We noticed that calls to the vadcq and vsbcq intrinsics, both of
which use __builtin_arm_set_fpscr_nzcvqc to set the Carry flag in
the FPSCR, would produce the following code:
```
< r2 is the *carry input >
vmrs r3, FPSCR_nzcvqc
bic r3, r3, #536870912
orr r3, r3, r2, lsl #29
vmsr FPSCR_nzcvqc, r3
```
when the MVE ACLE instead gives a different instruction sequence of:
```
< Rt is the *carry input >
VMRS Rs,FPSCR_nzcvqc
BFI Rs,Rt,#29,#1
VMSR FPSCR_nzcvqc,Rs
```
The bic + orr pair is slower and it's also wrong, because, if the
*carry input is greater than 1, then we risk overwriting the top two
bits of the FPSCR register (the N and Z flags).
This turned out to be a problem in the header file and the solution was
to simply add a `& 0x1u` to the `*carry` input: then the compiler knows
that we only care about the lowest bit and can optimise to a BFI.
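
To make the overwrite concrete, here is a small plain-C model of the two variants. It is only a sketch: the function names are hypothetical, the FPSCR is modelled as an ordinary `uint32_t`, and this is not the actual code from `arm_mve.h`. It assumes the Carry flag sits at bit 29, as in the sequences above (536870912 is `1u << 29`).

```c
#include <stdint.h>

#define FPSCR_C_BIT 29u  /* Carry flag position used by the bic/orr and BFI sequences */

/* Models the bic + orr pair: clear bit 29, then OR in the shifted carry.
 * Any carry bits above bit 0 spill into bits 30 and 31 (Z and N). */
static uint32_t set_carry_unmasked(uint32_t fpscr, uint32_t carry)
{
    return (fpscr & ~(1u << FPSCR_C_BIT)) | (carry << FPSCR_C_BIT);
}

/* Models the fixed form: the carry is masked with 0x1u first, so only
 * bit 29 can change. This matches a one-bit insert at bit 29. */
static uint32_t set_carry_masked(uint32_t fpscr, uint32_t carry)
{
    return (fpscr & ~(1u << FPSCR_C_BIT)) | ((carry & 0x1u) << FPSCR_C_BIT);
}
```

With a carry input of 6 and a zeroed register, the unmasked variant yields `0xC0000000` (N and Z set, C clear), while the masked variant leaves the register at 0. For carry inputs of 0 or 1 the two variants agree.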
Ok for trunk?
Thanks,
Stam Markianos-Wright
gcc/ChangeLog:
* config/arm/arm_mve.h (__arm_vadcq_s32): Fix arithmetic.
(__arm_vadcq_u32): Likewise.
(__arm_vadcq_m_s32): Likewise.
(__arm_vadcq_m_u32): Likewise.
(__arm_vsbcq_s32): Likewise.
(__arm_vsbcq_u32): Likewise.
(__arm_vsbcq_m_s32): Likewise.
(__arm_vsbcq_m_u32): Likewise.
* config/arm/mve.md (get_fpscr_nzcvqc): Make unspec_volatile.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/mve_vadcq_vsbcq_fpscr_overwrite.c: New.
Example C++ code: https://godbolt.org/z/6GWzq8hxa