Commit
[DmaLoopSubsumption] Fix for strided op in loop without induction var dependency (#570)

Fixes an issue exposed by a 128x32x64 matmul: #556.

This issue would come up in partial loop dependencies in case of `scf.forall`:

```
scf.forall (%arg2, %arg3) in (2, 6) {
  %1 = amdaie.npu.dma_cpy_nd %0([0, %arg3] [8, 16] [16, 1], [] [] [])
  amdaie.npu.dma_wait(%2, S2MM)
}
```

In this case, the outer iteration upon which the NPU DMA operation didn't depend (`%arg2`) would not be considered, resulting in incorrect output IR:

```
%1 = amdaie.npu.dma_cpy_nd %0([0, 0, 0] [6, 8, 16] [1, 16, 1], [] [] [])
amdaie.npu.dma_wait(%2, S2MM)
```

However, the outer iteration should be considered as well, making sure the above DMA operation is executed twice, resulting in the below output IR:

```
%1 = amdaie.npu.dma_cpy_nd %0([0, 0, 0, 0] [2, 6, 8, 16] [0, 1, 16, 1], [] [] [])
amdaie.npu.dma_wait(%2, S2MM)
```

This PR fixes the issue by adding general support for subsuming loop iterations into strided operations without any loop induction variable dependency.
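
The subsumption described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the pass implementation: `subsume_loop` and `addresses` are hypothetical helper names, every loop is assumed to have lower bound 0 and step 1, and an access pattern is modeled as plain `(offsets, sizes, strides)` lists in which an induction variable appears as a string such as `"%arg3"`. A loop whose induction variable is not used by the pattern becomes a new outermost dimension with stride 0, which simply repeats the same accesses.

```python
from itertools import product


def subsume_loop(offsets, sizes, strides, trip_count, iv_name):
    """Fold one loop (assumed lower bound 0, step 1) into the access pattern
    as a new outermost dimension. Hypothetical helper for illustration."""
    new_offsets = list(offsets)
    new_stride = 0  # stays 0 when the pattern does not depend on iv_name
    for dim, off in enumerate(offsets):
        if off == iv_name:
            # The induction variable advances this dimension by its stride.
            new_stride += strides[dim]
            new_offsets[dim] = 0
    return [0] + new_offsets, [trip_count] + list(sizes), [new_stride] + list(strides)


def addresses(offsets, sizes, strides):
    """Enumerate the flat addresses touched by a fully constant pattern."""
    return [
        sum((o + i) * s for o, i, s in zip(offsets, idx, strides))
        for idx in product(*(range(n) for n in sizes))
    ]


# Inner loop %arg3 (6 iterations) is used in the offsets: stride 1 is reused.
inner = subsume_loop([0, "%arg3"], [8, 16], [16, 1], 6, "%arg3")
# Outer loop %arg2 (2 iterations) is not used: it gets stride 0.
outer = subsume_loop(*inner, 2, "%arg2")
```

In this model, `inner` corresponds to the incorrect 3-dimensional pattern from the message (the outer loop is missing), while `outer` corresponds to the corrected 4-dimensional pattern, whose address sequence is the 3-dimensional one executed twice.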
Showing 4 changed files with 268 additions and 135 deletions.