-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scalar bitmanip and crypto intrinsics proposal #44
Conversation
here is a related issue: rvkrypto/rvkrypto-fips#11 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support those intrinsic functions explicitly use the fixed length integer instead of xlen, that could prevent user to use wrong length to implement stuffs :)
LGTM, thanks. |
…tins. Part of an effort to remove uses of 'long' to mean XLen from the builtin interfaces. Also makes the builtin names match riscv-non-isa/riscv-c-api-doc#44. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D154681
Allow _32 builtin on RV64 since it only brev8+sext.w. Part of an effort to remove 'long' to mean XLen from the builtin interface. Matches the proposal here riscv-non-isa/riscv-c-api-doc#44 Reviewed By: asb Differential Revision: https://reviews.llvm.org/D154683
We can use an i64 clmul to emulate i32 clmul. For clmulh and clmulr we need to zero extend the 32 bit input to 64 bits then extract either bits [63:32] or [62:31]. Unfortunately, without Zba we need to use 2 shifts for the zero extends. These can be optimized out later if the producing instruction already zeroed the upper bits or if we can use lwu. There are alternative sequences we can use for clmulh/clmulr when the zero extend isn't free, but those are best handled by a DAG combine to give the best opportunity for removing the extend. This allows us to implement i32 clmul C intrinsics proposed in riscv-non-isa/riscv-c-api-doc#44. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D154729
I removed |
Unsigned is a better representation for bitmanipulation and cryptography.w The only exception being the return values for clz and ctz intrinsics is a signed int. That matches the target independent clz and ctz builtins. This is consistent with the current scalar crypto proposal riscv-non-isa/riscv-c-api-doc#44 Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D154616
Previously we returned i32 on RV32 and i64 on RV64. The instructions only consume 32 bits and only produce 32 bits. For RV64, the result is sign extended to 64 bits like *W instructions. This patch removes this detail from the interface to improve portability and consistency. This matches the proposal for scalar intrinsics here riscv-non-isa/riscv-c-api-doc#44 I've included IR autoupgrade support as well. I'll be doing this for other builtins/intrinsics that currently use 'long' in other patches. Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D154647
6ea26a1
to
8ffd6f6
Compare
I fixed the number of arguments to the xperm intrinsics. And I reduce the rotate amount for |
riscv-c-api.md
Outdated
|
||
Due to the data-independent latency ("constant time") assertions of | ||
the `Zkt` extension, the header file or the compiler can't use table | ||
lookups, conditional branching, etc., when implementing crypto intrinsics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually a useful guarantee? The compiler can do whatever it likes around the intrinsic. If you need data-independent timing then you need to write the entire thing in assembly...
What about adding an intrinsic that reads |
It's not guaranteed to be accessible to user level software. I don't think it belongs in this proposal. |
What do we need to do to move this proposal forward? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I think we have some bitmanip intrinsic work to do in GCC part, thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thank you very much work working this out!
We have discussed this in today's SIG toolchain call. The planned timeline is as follows:
|
Clang/llvm patch https://reviews.llvm.org/D155647 |
Are we adding new macros like |
We have a list in this repo, but it does not make sense to add each and every extension manually (which is the reason that nobody does it). So instead of a PR that adds |
@pz9115 is working on the GCC patches. |
This is a proposal for scalar crypto intrinsics for the Zk* extensions. I've removed the recommendation to use 'long' for XLEN. There's nothing guaranteeing that 'long' is XLEN. For example, if we were to support ILP32 on RV64, 'long' would become 32 bits while XLEN would still be 64 bits. Other design decisions are spelled out in the document. Some text has been copied from riscv-non-isa#23. One open question is whether we should emulate 64-bit intrinsics on RV32 where it is feasible. This could be useful for the sha512 intrinsics and some others.
I think the public review period for this is over now. How do we proceed? |
@pz9115 is there an ETA for the GCC patch? |
Still WIP, plan to send patches within this month. |
@cmuellner I would suggest moving forward this PR, I mean merge this first, I believe the proposal isn't come with complicated technical issue on the GCC side, just need few more time to implement. Also I am happy to accept that on GCC 14 even it's not stage 1. |
I kept it open, because I was not sure if this is still being work on. Thanks @topperc for the PR and the patience! |
This adds riscv_bitmanip and riscv_crypto.h This is based on the proposed spec here riscv-non-isa/riscv-c-api-doc#44 Tests that previously used builtins directly now use the intrinsics. Reviewed By: wangpc Differential Revision: https://reviews.llvm.org/D155647
…tins. Part of an effort to remove uses of 'long' to mean XLen from the builtin interfaces. Also makes the builtin names match riscv-non-isa/riscv-c-api-doc#44. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D154681
Allow _32 builtin on RV64 since it only brev8+sext.w. Part of an effort to remove 'long' to mean XLen from the builtin interface. Matches the proposal here riscv-non-isa/riscv-c-api-doc#44 Reviewed By: asb Differential Revision: https://reviews.llvm.org/D154683
We can use an i64 clmul to emulate i32 clmul. For clmulh and clmulr we need to zero extend the 32 bit input to 64 bits then extract either bits [63:32] or [62:31]. Unfortunately, without Zba we need to use 2 shifts for the zero extends. These can be optimized out later if the producing instruction already zeroed the upper bits or if we can use lwu. There are alternative sequences we can use for clmulh/clmulr when the zero extend isn't free, but those are best handled by a DAG combine to give the best opportunity for removing the extend. This allows us to implement i32 clmul C intrinsics proposed in riscv-non-isa/riscv-c-api-doc#44. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D154729
Unsigned is a better representation for bitmanipulation and cryptography.w The only exception being the return values for clz and ctz intrinsics is a signed int. That matches the target independent clz and ctz builtins. This is consistent with the current scalar crypto proposal riscv-non-isa/riscv-c-api-doc#44 Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D154616
Previously we returned i32 on RV32 and i64 on RV64. The instructions only consume 32 bits and only produce 32 bits. For RV64, the result is sign extended to 64 bits like *W instructions. This patch removes this detail from the interface to improve portability and consistency. This matches the proposal for scalar intrinsics here riscv-non-isa/riscv-c-api-doc#44 I've included IR autoupgrade support as well. I'll be doing this for other builtins/intrinsics that currently use 'long' in other patches. Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D154647
This is a proposal for scalar crypto intrinsics for the Zk* extensions.
I've removed the recommendation to use 'long' for XLEN. There's nothing guaranteeing that 'long' is XLEN. For example, if we were to support ILP32 on RV64, 'long' would become 32 bits while XLEN would still be 64 bits.
Other design decisions are spelled out in the document.
Some text has been copied from #23.
One open question is whether we should emulate 64-bit intrinsics on RV32 where it is feasible. This could be useful for the sha512 intrinsics and some others.
@cmuellner @kito-cheng @mjosaarinen