Improving linking support for ROCM and ukernels. #19211

Merged
merged 3 commits into from
Nov 19, 2024


benvanik
Collaborator

To support externally defined ukernels on ROCM, the ROCMTarget has been brought in line with LLVMCPU/CUDA by calling linkBitcodeObjects. To make it easier to author passes that include object references, #hal.executable.object now allows any data type to be associated with an object so long as it is serializable; this permits external resource attrs and other custom attributes that may serialize based on other information. To allow patterns to attach object references, all ops within an executable variant can now declare a hal.executable.objects array that will be hoisted and merged into the top-level variant objects after our executable linking pass (and before serialization, where they are used).

This is already present in CUDATarget/LLVMCPUTarget.
@benvanik added the labels compiler/dialects (relating to the IREE compiler dialects: flow, hal, vm), codegen (shared code generation infrastructure and dialects), and codegen/rocm (ROCm code generation compiler backend: HIP/HSA) on Nov 19, 2024
@benvanik benvanik marked this pull request as ready for review November 19, 2024 20:26
Plugins and other dialects can define serializable attributes. For
example, an `#iree_codegen.embedded_bitcode<"some_name.bc">` could
reference an embedded file via `c_embed_data`. The attributes could be
arbitrarily complex, e.g. a `#cutlass.kernel<{args}>` that generates
an external kernel function on demand to link in at compile time.
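
As a rough illustration of the idea above, the following Python sketch models an object reference whose data can be any value that knows how to serialize itself. It is not IREE code: `SerializableData`, `InlineBytes`, `GeneratedKernel`, and `ExecutableObject` are hypothetical names invented for this example, and the byte formats are made up.

```python
from dataclasses import dataclass
from typing import Protocol


class SerializableData(Protocol):
    """Hypothetical stand-in for an attribute that can serialize itself."""

    def serialize(self) -> bytes: ...


@dataclass(frozen=True)
class InlineBytes:
    """Data that is already available as raw bytes."""
    payload: bytes

    def serialize(self) -> bytes:
        return self.payload


@dataclass(frozen=True)
class GeneratedKernel:
    """Data produced on demand from parameters (assumed, made-up format)."""
    name: str
    tile: int

    def serialize(self) -> bytes:
        return f"kernel {self.name} tile={self.tile}\n".encode()


@dataclass(frozen=True)
class ExecutableObject:
    """Toy model of an object reference: a path plus serializable data."""
    path: str
    data: SerializableData

    def contents(self) -> bytes:
        # The consumer relies only on serialize(), not on the concrete type.
        return self.data.serialize()


objects = [
    ExecutableObject("some_name.bc", InlineBytes(b"BC\xc0\xde")),
    ExecutableObject("gemm.bc", GeneratedKernel("gemm", tile=64)),
]
blobs = {obj.path: obj.contents() for obj in objects}
```

The point of the sketch is only that the code consuming an object does not need to know how its bytes are produced, which is what lets plugins supply their own attribute types.
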
This pass runs nested on variants to find all `hal.executable.objects`
attrs in the inner module and move them to the parent
`hal.executable.variant`. This allows codegen/plugin/etc. passes running
on executable contents to declare an object they want to include by
making only local changes (such as in a pattern rewriter) and then
letting the pass move them to the variant where they belong.

This only handles arrays of objects, as expected after
`MaterializeInterfacesPass` runs; target object dictionaries are not
very easy to merge, and we generally want to run after executable
translation/linking anyway, where they have already been baked out.
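
The hoisting described above can be sketched in Python as a walk over a toy op tree. This is a simplified model, not the actual pass: ops are plain dicts with `attrs` and `ops` keys, objects are represented by their path strings, and dropping exact duplicates while merging is an assumption of this sketch rather than something the description states.

```python
OBJECTS_ATTR = "hal.executable.objects"


def hoist_objects(variant: dict) -> None:
    """Move every nested objects array onto the variant, in place."""
    # Start from whatever the variant already declares.
    hoisted = list(variant["attrs"].get(OBJECTS_ATTR, []))

    def visit(op: dict) -> None:
        # Remove the array from the nested op and merge it in order.
        for obj in op["attrs"].pop(OBJECTS_ATTR, []):
            if obj not in hoisted:  # assumed: skip exact duplicates
                hoisted.append(obj)
        for child in op.get("ops", []):
            visit(child)

    for child in variant.get("ops", []):
        visit(child)
    if hoisted:
        variant["attrs"][OBJECTS_ATTR] = hoisted


variant = {
    "name": "hal.executable.variant",
    "attrs": {OBJECTS_ATTR: ["a.bc"]},
    "ops": [
        {
            "name": "builtin.module",
            "attrs": {},
            "ops": [
                {"name": "func.func",
                 "attrs": {OBJECTS_ATTR: ["b.bc", "a.bc"]}, "ops": []},
                {"name": "func.func",
                 "attrs": {OBJECTS_ATTR: ["c.bc"]}, "ops": []},
            ],
        }
    ],
}
hoist_objects(variant)
```

After the call, the nested ops no longer carry the attribute and the variant holds the merged array, so a pattern only needs to annotate the op it is rewriting.
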
@benvanik benvanik merged commit 82a89e3 into main Nov 19, 2024
39 checks passed
@benvanik benvanik deleted the users/benvanik/rocm-linking branch November 19, 2024 21:09