You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The intrinsics for scalar converts between fp16 and integer/fixed-point values are currently specified to always use 16bit -> 16bit converts, regardless of the size of the integer/fixed-point value. For example:
float16_t vcvth_f16_s32(int32_t a) a -> Hn SCVTF Hd,Hn Hd -> result A32/A64
Using the instruction listed causes certain input values to be treated incorrectly. For the above example, an input int32 65504 produces an fp16 value of -32.0, instead of the expected 65504.0.
Instead, the above intrinsic should use the SCVTF Hd,Wn instruction, which better matches the input type.
This applies to all scalar converts between fp16 and 32-bit/64-bit integer/fixed-point converts:
Testing with two mainstream compilers (gcc and clang/llvm) shows that these intrinsics are often already generating the proposed instructions, rather than the instructions listed in the ACLE. In particular:
GCC (tested with 9.2.0) generates the proposed instructions for all of the intrinsics.
clang/llvm (tested with 14.0.0) generates the proposed instructions for the integer converts, but generates the ACLE instructions for the fixed-point converts.
The text was updated successfully, but these errors were encountered:
The intrinsics for scalar converts between fp16 and integer/fixed-point values are currently specified to always use 16bit -> 16bit converts, regardless of the size of the integer/fixed-point value. For example:
acle/tools/intrinsic_db/advsimd.csv
Line 3795 in 6eb8516
Using the instruction listed causes certain input values to be treated incorrectly. For the above example, an input int32 65504 produces an fp16 value of -32.0, instead of the expected 65504.0.
Instead, the above intrinsic should use the
SCVTF Hd,Wn
instruction, which better matches the input type.This applies to all scalar converts between fp16 and 32-bit/64-bit integer/fixed-point converts:
Testing with two mainstream compilers (gcc and clang/llvm) shows that these intrinsics are often already generating the proposed instructions, rather than the instructions listed in the ACLE. In particular:
GCC (tested with 9.2.0) generates the proposed instructions for all of the intrinsics.
clang/llvm (tested with 14.0.0) generates the proposed instructions for the integer converts, but generates the ACLE instructions for the fixed-point converts.
The text was updated successfully, but these errors were encountered: