Change codegen of LLVM intrinsics to be name-based, and add llvm linkage support for x86amx, bf16 and i1 #140763

Draft
wants to merge 5 commits into
base: master

sayantn
Contributor

@sayantn sayantn commented May 7, 2025

Currently, LLVM intrinsics are codegenned as simple function declarations, by mapping the Rust-side types to an LLVM signature in rustc_codegen_llvm::abi. But that is not actually needed, because the signature of an LLVM intrinsic can be extracted from just its name! (Well, there are complexities regarding overloaded intrinsics with named structs and target types, but for most intrinsics it is very possible to do so.)

Getting the name of the function is easy when it is being declared, but at the call site the function might be behind a function pointer, which would change the output of LLVMGetValueName2 and give us false negatives. Luckily, we are working with LLVM intrinsics, and LLVM explicitly says that creating a function pointer to an intrinsic is invalid LLVM IR (thanks @nikic for the info). There is one way we could get false positives: using #[export_name] with the name of an LLVM intrinsic. But (again, thanks to @nikic for the info) redefining LLVM intrinsics is also an error, so we are again safe.

This PR adds support for parsing the name of the LLVM intrinsic, generating its signature from that name, and performing a basic check of the Rust-side signature against it. This lets us give more descriptive error messages than just a rustc-LLVM error, and will help in the next part of this PR.
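As a rough illustration of what "a signature from a name" means for overloaded intrinsics, here is a minimal sketch that reads the trailing type suffix of an intrinsic name (as in `llvm.sqrt.f32` or `llvm.ctpop.v4i32`). This is not the code in this PR: the `Elem`, `Ty`, `parse_elem`, `parse_mangled_type` and `overload_suffix` names are hypothetical, and it assumes only a single trailing suffix made of `iN`, `f16`/`f32`/`f64`, `bf16`, or a `v<lanes><elem>` vector. Real LLVM mangling also covers pointers, multiple suffixes, named structs and target types.

```rust
/// Element type of a mangled overload suffix (toy model, not rustc's types).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Elem {
    Int(u32),   // iN
    Float(u32), // f16 / f32 / f64
    BFloat16,   // bf16
}

/// A scalar or fixed-length vector type recovered from a suffix.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Ty {
    Scalar(Elem),
    Vector { lanes: u32, elem: Elem },
}

/// Parses a scalar suffix such as "i32", "f64" or "bf16".
fn parse_elem(s: &str) -> Option<Elem> {
    if s == "bf16" {
        return Some(Elem::BFloat16);
    }
    if let Some(bits) = s.strip_prefix('i') {
        return bits.parse::<u32>().ok().map(Elem::Int);
    }
    if let Some(bits) = s.strip_prefix('f') {
        return match bits.parse::<u32>().ok()? {
            b @ (16 | 32 | 64) => Some(Elem::Float(b)),
            _ => None,
        };
    }
    None
}

/// Parses either a scalar suffix or a vector suffix such as "v4f32".
fn parse_mangled_type(s: &str) -> Option<Ty> {
    if let Some(rest) = s.strip_prefix('v') {
        // Split "<lanes><elem>" at the first non-digit character.
        let split = rest.find(|c: char| !c.is_ascii_digit())?;
        let (lanes, elem) = rest.split_at(split);
        return Some(Ty::Vector {
            lanes: lanes.parse().ok()?,
            elem: parse_elem(elem)?,
        });
    }
    parse_elem(s).map(Ty::Scalar)
}

/// Returns the type named by the last dot-separated segment of an
/// `llvm.*` name, or `None` if the name has no recognizable suffix.
fn overload_suffix(name: &str) -> Option<Ty> {
    let rest = name.strip_prefix("llvm.")?;
    parse_mangled_type(rest.rsplit('.').next()?)
}
```

For example, `overload_suffix("llvm.ctpop.v4i32")` yields a four-lane vector of 32-bit integers, while a name whose last segment is not a type (such as one ending in `.internal`) yields `None`.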

The implementation of Rust intrinsics has also been updated to take advantage of this.

Next, we use this flexibility to detect x86amx types in the intrinsic signature exactly, and inject casts for them at the call site. (My previous attempts at this were based on heuristics on the names of the intrinsics; this approach uses no such heuristic.)
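To make the idea concrete, here is a minimal sketch of how a call could be planned once the LLVM-side parameter types are known. It is a toy model, not the code in this PR: `LlvmTy`, `RustTy`, `ArgAction`, `plan_arg` and `plan_call` are hypothetical names, and only two types per side are modeled. The assumption is that a Rust-side `i32x256` argument meeting an LLVM-side x86amx parameter gets a vector-to-tile cast, and anything else must match directly or be reported as an error.

```rust
/// LLVM-side parameter types (toy subset).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum LlvmTy {
    I16,
    X86Amx,
}

/// Rust-side argument types (toy subset).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum RustTy {
    U16,
    I32x256,
}

/// What the call site has to do with one argument.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum ArgAction {
    PassThrough,
    CastVectorToTile,
}

/// Decides the action for a single argument, or reports a mismatch.
fn plan_arg(rust: RustTy, llvm: LlvmTy) -> Result<ArgAction, String> {
    match (rust, llvm) {
        (RustTy::U16, LlvmTy::I16) => Ok(ArgAction::PassThrough),
        (RustTy::I32x256, LlvmTy::X86Amx) => Ok(ArgAction::CastVectorToTile),
        (r, l) => Err(format!("type mismatch: Rust {r:?} vs LLVM {l:?}")),
    }
}

/// Plans every argument of a call; fails on arity or type mismatch.
fn plan_call(rust: &[RustTy], llvm: &[LlvmTy]) -> Result<Vec<ArgAction>, String> {
    if rust.len() != llvm.len() {
        return Err(format!(
            "expected {} arguments, found {}",
            llvm.len(),
            rust.len()
        ));
    }
    rust.iter().zip(llvm).map(|(&r, &l)| plan_arg(r, l)).collect()
}
```

Because the decision is driven by the LLVM-side type rather than by the intrinsic's name, the same table would apply to any intrinsic that takes or returns a tile.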

Using the same method, I will soon add support for bf16 and i1, but their implementation will be slightly different (adding bf16 this way should cause no future incompatibility in Rust, since link_llvm_intrinsics is perma-unstable). This can also be used to link against x86_fp80 and ppc_fp128, if that is ever required!

Reviews are welcome, as this is my first time actually contributing to rustc.

Unresolved Questions

  • LLVM docs say that llvm.x86.cast.tile.to.vector and llvm.x86.cast.vector.to.tile work fine even with vectors smaller than 1024 bytes, but do we want that?
  • Which representation of i1xN would be better? I currently have two in mind:
    • bitmask representation: represent it as just an integer with ceil(N) bits. Cons: LSB vs MSB, what to do if N > 128?
    • int-vector representation: accept any vector of integers of length N, as used by portable-simd, represent 1 as !0. Cons: has "invalid" values
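
The two candidate representations can be compared with a small sketch. This is only an illustration under stated assumptions, not a proposal for the actual implementation: the function names `mask_to_lanes` and `lanes_to_mask` are hypothetical, lane `i` is assumed to live in bit `i` (LSB-first), the mask is limited to 64 lanes, and `i32` is picked arbitrarily as the lane type.

```rust
/// Expands a bitmask into an int-vector where a set bit becomes `!0`.
/// Assumption: lane `i` is stored in bit `i` (LSB-first).
fn mask_to_lanes<const N: usize>(mask: u64) -> [i32; N] {
    assert!(N <= 64, "this sketch only models up to 64 lanes");
    let mut lanes = [0i32; N];
    for (i, lane) in lanes.iter_mut().enumerate() {
        if (mask >> i) & 1 == 1 {
            *lane = !0;
        }
    }
    lanes
}

/// Packs an int-vector back into a bitmask.
/// Returns `None` for the "invalid" lane values, i.e. anything
/// other than `0` (false) or `!0` (true).
fn lanes_to_mask<const N: usize>(lanes: [i32; N]) -> Option<u64> {
    assert!(N <= 64, "this sketch only models up to 64 lanes");
    let mut mask = 0u64;
    for (i, lane) in lanes.iter().enumerate() {
        match *lane {
            0 => {}
            -1 => mask |= 1u64 << i,
            _ => return None,
        }
    }
    Some(mask)
}
```

The `Option` return of `lanes_to_mask` makes the downside of the int-vector form visible: some inputs have no meaning as a vector of i1. The bitmask form has no invalid values, but its bit order has to be fixed by convention.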

@rustbot label O-x86_64 T-compiler
r? codegen

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. O-x86_64 Target: x86-64 processors (like x86_64-*) (also known as amd64 and x64) labels May 7, 2025
@rustbot
Collaborator

rustbot commented May 8, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

@rustbot
Collaborator

rustbot commented May 8, 2025

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@sayantn sayantn changed the title Add auto-bitcasts from/to x86amx and i32x256 for AMX intrinsics Add auto-bitcasts from/to x86amx for i32x256 for AMX intrinsics May 8, 2025
@sayantn sayantn changed the title Add auto-bitcasts from/to x86amx for i32x256 for AMX intrinsics Add auto-bitcasts between x86amx and i32x256 for AMX intrinsics May 8, 2025
@dianqk
Member

dianqk commented May 9, 2025

I think you can use LLVMGetIntrinsicDeclaration or some of the functions in Intrinsic.h in declare_raw_fn. For reference: https://github.com/llvm/llvm-project/blob/d35ad58859c97521edab7b2eddfa9fe6838b9a5e/llvm/lib/AsmParser/LLParser.cpp#L330-L335.

@sayantn
Contributor Author

sayantn commented May 9, 2025

That can be used to improve performance, but I am not really focusing on performance in this PR. For now I want to emphasize the correctness of the codegen.

@sayantn
Contributor Author

sayantn commented May 9, 2025

Oh wait, I probably misunderstood your comment, you meant using the llvm declaration by itself. Yeah, that would be better, thanks for the info. I will update the impl when I get the chance

@dianqk
Member

dianqk commented May 15, 2025

> Oh wait, I probably misunderstood your comment, you meant using the llvm declaration by itself. Yeah, that would be better, thanks for the info. I will update the impl when I get the chance

I think you can just focus on non-overloaded functions for this PR. Overloaded functions, and type checking of Rust function signatures against the LLVM-defined ones, can be left to subsequent PRs.

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 15, 2025
@rustbot
Collaborator

rustbot commented May 15, 2025

Reminder, once the PR becomes ready for a review, use @rustbot ready.

@sayantn sayantn marked this pull request as draft May 19, 2025 07:23
@nikic
Contributor

nikic commented May 19, 2025

@sayantn Taking the address of an intrinsic is invalid LLVM IR.

@sayantn
Contributor Author

sayantn commented May 19, 2025

@nikic nice, one less thing to worry about ❤️

@sayantn sayantn changed the title Add auto-bitcasts between x86amx and i32x256 for AMX intrinsics Change codegen of LLVM intrinsics to be name-based, and add llvm linkage support for x86amx, bf16 and i1 May 20, 2025
@sayantn
Contributor Author

sayantn commented May 20, 2025

> The job mingw-check-tidy failed! Check out the build log: (web) (plain)

How is this even possible? I have the push hook installed! 😆

Labels
O-x86_64 Target: x86-64 processors (like x86_64-*) (also known as amd64 and x64) S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

5 participants