Change codegen of LLVM intrinsics to be name-based, and add llvm linkage support for `x86amx`, `bf16` and `i1` #140763
Some changes occurred in compiler/rustc_codegen_ssa |
Some changes occurred in compiler/rustc_codegen_gcc |
I think you can use |
That could improve performance, but I am not really focusing on performance in this PR; for now I want to emphasize the correctness of the codegen. |
Oh wait, I probably misunderstood your comment; you meant using the LLVM declaration by itself. Yeah, that would be better, thanks for the info. I will update the impl when I get the chance. |
I think you can just focus on non-overloaded functions for this PR. Overloaded functions, and checking Rust function signatures against the LLVM-defined ones, can be left to subsequent PRs. @rustbot author |
Reminder, once the PR becomes ready for a review, use |
@sayantn Taking the address of an intrinsic is invalid LLVM IR. |
@nikic nice, one less thing to worry about ❤️ |
Currently, LLVM intrinsics are codegen'd as simple function declarations, by mapping the Rust-side types to an LLVM signature in `rustc_codegen_llvm::abi`. But that is actually not needed, because the signature of an LLVM intrinsic can be derived from just its name! (Well, there are complexities regarding overloaded intrinsics with named structs and target types, but for most intrinsics it is entirely possible.)

Getting the name of the function is easy when it is being declared, but at the call site the function might be behind a function pointer, which would change the output of `LLVMGetValueName2` and give us false negatives. Luckily, we are working with LLVM intrinsics, and LLVM explicitly says that creating a function pointer to an intrinsic is invalid LLVM IR (thanks @nikic for the info). There is one way we could get false positives: using `#[export_name]` with the name of an LLVM intrinsic. But (again, thanks to @nikic for the info) redefining LLVM intrinsics is also an error, so we are again safe.

This PR adds support for parsing the name of the LLVM intrinsic, generating its signature from that name, and shallowly verifying the Rust-side signature against it. This can help us give more descriptive error messages than just `rustc-LLVM error`, and will help in the next part of this PR.

The implementation of Rust intrinsics has also been updated to take advantage of this.
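To make the idea concrete, here is a minimal, standalone sketch of name-based signature lookup followed by a shallow check of the Rust-side declaration. It is not the code in this PR: `Ty`, `Sig`, `signature_from_name` and `check_signature` are hypothetical names, and the small hard-coded table stands in for LLVM's own intrinsic tables, which a real implementation would query instead.

```rust
/// A tiny hand-written model of LLVM types (illustrative only).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Void,
    Int(u32), // iN
    Ptr,
    X86Amx,
}

/// Return type plus parameter types of a declaration.
#[derive(Debug, Clone, PartialEq)]
struct Sig {
    ret: Ty,
    params: Vec<Ty>,
}

/// Hypothetical stand-in for LLVM's intrinsic table: maps the name of a
/// non-overloaded intrinsic to its signature.
fn signature_from_name(name: &str) -> Option<Sig> {
    use Ty::*;
    let (ret, params) = match name {
        "llvm.x86.rdtsc" => (Int(64), vec![]),
        "llvm.x86.sse2.pause" => (Void, vec![]),
        "llvm.x86.tilezero.internal" => (X86Amx, vec![Int(16), Int(16)]),
        "llvm.x86.tileloadd64.internal" => (X86Amx, vec![Int(16), Int(16), Ptr, Int(64)]),
        _ => return None,
    };
    Some(Sig { ret, params })
}

/// Shallow check of a Rust-side declaration against the name-derived signature,
/// producing a readable message instead of a generic backend error.
fn check_signature(name: &str, rust_side: &Sig) -> Result<(), String> {
    let expected = signature_from_name(name)
        .ok_or_else(|| format!("unknown LLVM intrinsic `{name}`"))?;
    if expected.params.len() != rust_side.params.len() {
        return Err(format!(
            "`{name}` takes {} argument(s), but {} were declared",
            expected.params.len(),
            rust_side.params.len()
        ));
    }
    for (i, (want, got)) in expected.params.iter().zip(&rust_side.params).enumerate() {
        if want != got {
            return Err(format!("`{name}`: argument {i} should be {want:?}, found {got:?}"));
        }
    }
    if expected.ret != rust_side.ret {
        return Err(format!(
            "`{name}`: return type should be {:?}, found {:?}",
            expected.ret, rust_side.ret
        ));
    }
    Ok(())
}
```

Calling `check_signature("llvm.x86.rdtsc", &decl)` with a declaration that returns `Int(32)` yields an `Err` naming the intrinsic and the mismatched type, which is the kind of diagnostic the paragraph above has in mind.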
Next, we use this flexibility to detect `x86amx` types in an intrinsic's signature exactly, and inject casts for them at the call site. (My previous attempts at this were based on heuristics over the names of the intrinsics; this approach uses no such heuristic.)

Using the same method, I will soon add support for `bf16` and `i1`, but their implementation will be slightly different. (There should be no future incompatibility due to this addition of `bf16` to Rust, since `link_llvm_intrinsics` is perma-unstable.) This can also be used to link against `x86_f80` and `ppcf128`, if it is ever required!

Reviews are welcome, as this is my first time actually contributing to `rustc`.
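As an illustration of the cast-injection step, the sketch below decides per argument whether a value can be passed through or needs a vector-to-tile cast, purely from the expected LLVM type. All names here (`LlvmTy`, `RustTy`, `Action`, `plan_argument`, `plan_call`) are hypothetical and not taken from the PR. The sketch also assumes that only vectors of exactly 1024 bytes are accepted where `x86amx` is expected, which is just one possible answer to the open question about vector sizes.

```rust
/// Expected type, as derived from the intrinsic's name (illustrative model).
#[derive(Debug, Clone, Copy, PartialEq)]
enum LlvmTy {
    Int(u32),
    X86Amx,
}

/// Type written on the Rust side of the `extern` declaration (illustrative model).
#[derive(Debug, Clone, Copy, PartialEq)]
enum RustTy {
    Int(u32),
    Vector { elem_bits: u32, lanes: u32 },
}

/// What codegen would have to do with one argument at the call site.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Action {
    PassThrough,
    /// Wrap the value in a call to `llvm.x86.cast.vector.to.tile`.
    CastVectorToTile,
}

/// Assumption of this sketch: a tile is backed by a 1024-byte (8192-bit) vector.
const TILE_BITS: u32 = 1024 * 8;

fn plan_argument(expected: LlvmTy, actual: RustTy) -> Result<Action, String> {
    match (expected, actual) {
        (LlvmTy::Int(want), RustTy::Int(got)) if want == got => Ok(Action::PassThrough),
        (LlvmTy::X86Amx, RustTy::Vector { elem_bits, lanes })
            if elem_bits * lanes == TILE_BITS =>
        {
            Ok(Action::CastVectorToTile)
        }
        _ => Err(format!("cannot pass {actual:?} where {expected:?} is expected")),
    }
}

/// Plan every argument of a call; fails on the first mismatch.
fn plan_call(expected: &[LlvmTy], actual: &[RustTy]) -> Result<Vec<Action>, String> {
    if expected.len() != actual.len() {
        return Err(format!(
            "expected {} argument(s), found {}",
            expected.len(),
            actual.len()
        ));
    }
    expected
        .iter()
        .zip(actual)
        .map(|(&e, &a)| plan_argument(e, a))
        .collect()
}
```

For an intrinsic shaped like `llvm.x86.tdpbssd.internal` (three `i16` values followed by three tiles), passing three 16-bit integers and three `i32x256` vectors produces three `PassThrough` actions followed by three `CastVectorToTile` actions. A return value of type `x86amx` would be handled symmetrically with `llvm.x86.cast.tile.to.vector`.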
Unresolved Questions

- `llvm.x86.cast.tile.to.vector` and `llvm.x86.cast.vector.to.tile` work fine even with vectors smaller than 1024 bytes, but do we want that?
- Which representation of `i1xN` will be better? I currently have two in mind:
  - bitmask representation: represent it as just an integer with `ceil(N)` bits. Cons: LSB vs MSB ordering, and what to do if `N > 128`?
  - int-vector representation: accept any vector of integers of length `N`, as used by portable-simd, and represent `1` as `!0`. Cons: has "invalid" values.

@rustbot label O-x86_64 T-compiler
r? codegen
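
For reference, the two candidate `i1xN` representations can be compared with a small standalone sketch. The function names are hypothetical, the bitmask variant assumes LSB-first lane order (one of the choices the question leaves open), and the int-vector variant uses `i32` lanes purely for illustration.

```rust
/// Bitmask representation: lane `i` is stored in bit `i` (LSB-first is an
/// assumption of this sketch). Returns `None` when the mask would not fit in
/// the widest primitive integer, i.e. when there are more than 128 lanes.
fn to_bitmask(lanes: &[bool]) -> Option<u128> {
    if lanes.len() > 128 {
        return None;
    }
    let mut mask = 0u128;
    for (i, &lane) in lanes.iter().enumerate() {
        if lane {
            mask |= 1u128 << i;
        }
    }
    Some(mask)
}

/// Int-vector representation: `true` becomes `!0` (all bits set), `false` becomes `0`.
fn to_int_vector(lanes: &[bool]) -> Vec<i32> {
    lanes.iter().map(|&lane| if lane { !0 } else { 0 }).collect()
}

/// Converting back exposes the downside of the int-vector form: any lane that
/// is neither `0` nor `!0` is an "invalid" value and has to be rejected
/// (or given some agreed meaning).
fn from_int_vector(values: &[i32]) -> Option<Vec<bool>> {
    values
        .iter()
        .map(|&v| match v {
            0 => Some(false),
            -1 => Some(true), // `!0` as an `i32`
            _ => None,
        })
        .collect()
}
```

The bitmask form has no invalid states but needs a decision on bit order and on masks wider than 128 lanes; the int-vector form has no width limit but must deal with lanes such as `5` that encode neither `true` nor `false`.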