
[SpecializeDMACode] Properly lower compute_core_index #109

Merged
merged 2 commits from dma-compute-core-index-lowering into main on Aug 15, 2024

Conversation

zero9178
Member

Barriers may be inserted in a loop resulting from the lowering of a `scf.forall`. The lowering of `compute_core_index` unfortunately returns `num_compute_cores + 1`, which is outside the specified range of the operation and leads to such loops and their barriers being skipped by the DMA core. As the pass already assumes non-divergent control flow, we can fix this by specializing `compute_core_index` to any integer in the defined range of the op.

The op as it is currently used and specified returns the index of the compute core (in our current simulations, 0 up to but excluding 8) and has unspecified behavior on the DMA core. A later PR will use this to properly lower the op when doing DMA code specialization.
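For context, a minimal MLIR-style sketch of the failure mode; the dialect prefix (`snitch.`), the op spellings, and the exact way `scf.forall` is distributed across cores are illustrative assumptions, not the project's actual IR:

```mlir
// Hypothetical lowering of a scf.forall over the 8 compute cores: each core
// uses its own index as the loop's lower bound and the core count as the step.
%core = snitch.compute_core_index : index   // today returns 9 on the DMA core
%c8   = arith.constant 8 : index
scf.for %i = %core to %c8 step %c8 {
  // ... per-core work ...
  snitch.barrier                            // must be reached by every core
}
// With %core == 9 on the DMA core the loop runs zero iterations and the
// barrier is skipped. Because control flow is assumed non-divergent, the pass
// can instead specialize the op to any constant in [0, 8), e.g.
//   %core = arith.constant 0 : index
// so the DMA core executes the same loop structure and hits the barrier.
```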
Base automatically changed from compute-core-rename to main on August 15, 2024 at 10:46
zero9178 merged commit 03aaad4 into main on Aug 15, 2024
1 check passed
zero9178 deleted the dma-compute-core-index-lowering branch on August 15, 2024 at 10:51